Compare commits

...

376 Commits

Author SHA1 Message Date
Louis Dureuil
2118dcac7a make CI newer 2023-04-18 19:42:52 +02:00
Louis Dureuil
6939d3d061 More error logs 2023-04-18 19:36:58 +02:00
Louis Dureuil
5d2ca496cb updated tempfile and put some hopeful log 2023-04-18 19:18:15 +02:00
bors[bot]
523fb5cd56 Merge #2084
2084: bump milli r=Kerollmops a=irevoire

- Fix https://github.com/meilisearch/MeiliSearch/issues/2082 by updating milli dependency
- Fix Clippy error
- Change the MeiliSearch version in the cargo.toml to anticipate the coming release (v0.25.2)

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-01-18 14:30:37 +00:00
Tamo
436f61a7f4 chore: bump meilisearch 2022-01-18 12:27:15 +01:00
Tamo
3fab5869fa chore: bump milli 2022-01-18 11:50:17 +01:00
mpostma
0515c6e844 bug(http): fix task duration 2022-01-13 16:41:07 +01:00
Irevoire
38176181ac fix(dump): Fix the import of dump from the v24 and before 2022-01-13 16:40:58 +01:00
bors[bot]
0ad7d38eec Merge #2061
2061: Update dashboard for v0.25.0 r=curquiza a=mdubus



Co-authored-by: Morgane Dubus <30866152+mdubus@users.noreply.github.com>
2022-01-10 16:29:31 +00:00
Morgane Dubus
b17ad5c2be Update with latest release of the dashboard 2022-01-10 17:10:09 +01:00
Morgane Dubus
030a90523d Update dashboard for v0.25.0 2022-01-10 10:50:57 +01:00
bors[bot]
56d223a51d Merge #2059
2059: change indexed doc count on error r=irevoire a=MarinPostma

change `indexed_documents` and `deleted_documents` to return 0 instead of null when they are empty, e.g. when the task has failed.

close #2053
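
A minimal serde sketch of that behaviour, using hypothetical names (`TaskDetails`, `zero_if_none`) rather than the actual MeiliSearch task view:

```rust
use serde::{Serialize, Serializer};

// Serialize the counters as 0 instead of null when they are absent,
// e.g. for a task that failed before indexing anything.
#[derive(Serialize)]
struct TaskDetails {
    #[serde(serialize_with = "zero_if_none")]
    indexed_documents: Option<u64>,
    #[serde(serialize_with = "zero_if_none")]
    deleted_documents: Option<u64>,
}

fn zero_if_none<S: Serializer>(v: &Option<u64>, s: S) -> Result<S::Ok, S::Error> {
    s.serialize_u64(v.unwrap_or(0))
}

fn main() {
    let t = TaskDetails { indexed_documents: None, deleted_documents: Some(3) };
    // Prints {"indexed_documents":0,"deleted_documents":3}
    println!("{}", serde_json::to_string(&t).unwrap());
}
```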


Co-authored-by: Marin Postma <postma.marin@protonmail.com>
2022-01-06 15:55:50 +00:00
Marin Postma
f558ff826a feat(http): task view indexed and deleted documents return 0 instead of null 2022-01-06 14:55:02 +01:00
bors[bot]
0d2a358cc2 Merge #2056
2056: Allow any header for CORS r=curquiza a=curquiza

Bug fix: a CORS error was triggered when trying to send the `User-Agent` header via the browser

`@bidoubiwa` thanks for the bug report!
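
A hedged sketch of that kind of change with actix-cors (not the exact MeiliSearch configuration): any header sent by the browser, such as `User-Agent`, is accepted during the CORS preflight.

```rust
use actix_cors::Cors;
use actix_web::{App, HttpServer};

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| {
        let cors = Cors::default()
            .allow_any_origin()
            .allow_any_method()
            .allow_any_header(); // previously an explicit allow-list of headers
        App::new().wrap(cors)
    })
    .bind(("127.0.0.1", 7700))?
    .run()
    .await
}
```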

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-01-05 16:45:51 +00:00
Clémentine Urquizar
595250c93e Allow any header for CORS 2022-01-05 15:38:47 +01:00
bors[bot]
c636988935 Merge #2055
2055: fix(dump): Fix the loading of dump with empty indexes r=irevoire a=irevoire



Co-authored-by: Tamo <tamo@meilisearch.com>
2022-01-05 14:31:53 +00:00
Tamo
eea483c470 fix(dump): Fix the loading of dump with empty indexes 2022-01-05 15:08:21 +01:00
bors[bot]
d53c61a6d0 Merge #2054
2054: Bug(auth): Wrap key list in results r=irevoire a=ManyTheFish

fix #2052

Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-01-04 15:44:55 +00:00
ManyTheFish
c0d4f71a34 Bug(auth): Wrap key list in results 2022-01-04 14:10:30 +01:00
bors[bot]
c0251eb680 Merge #2050
2050: Bug(CORS): Add missing allowed headers r=curquiza a=ManyTheFish

fix #2040

## test
html file to test:

```html
<!DOCTYPE html>
<html>
<meta content="text/html;charset=utf-8" http-equiv="Content-Type">
<meta content="utf-8" http-equiv="encoding">

<script>
var xmlHttp = new XMLHttpRequest();
    xmlHttp.open( "GET", "http://127.0.0.1:7700/indexes/toto", false ); // false for synchronous request
    xmlHttp.setRequestHeader("Authorization", "Bearer manythefish");
    xmlHttp.send( null );
    console.log(xmlHttp.responseText);
</script>
</html>
```


Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-01-03 14:23:34 +00:00
ManyTheFish
450b81ca13 Bug(CORS): Add missing allowed headers
fix #2040
2022-01-03 13:41:12 +01:00
bors[bot]
2f3faadcbf Merge #2034
2034: Fix typo r=curquiza a=curquiza

Fix the `Meilisearch` typo, replacing it with `MeiliSearch`

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-01-03 09:40:56 +00:00
bors[bot]
5986a2d126 Merge #2036
2036: chore(ci): Enable rust_backtrace in the ci r=curquiza a=irevoire

This should help us to understand unreproducible panics that happen in the CI all the time

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-12-22 19:00:08 +00:00
Tamo
d75e84f625 chore(ci): Enable rust_backtrace in the ci 2021-12-22 18:20:44 +01:00
bors[bot]
c221277fd2 Merge #2035
2035: Use self hosted GitHub runner r=curquiza a=curquiza

Checked with `@tpayet`: we have created a self-hosted GitHub runner to save time when pushing the Docker images.

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-12-22 15:28:33 +00:00
Clémentine Urquizar
fd854035c1 Use self hosted github runner 2021-12-21 18:32:29 +01:00
bors[bot]
4d1c138842 Merge #2032
2032: Revert docker as non root PR r=curquiza a=ManyTheFish

Revert #1759

hotfix for #1969

Co-authored-by: Maxime Legendre <maximelegendre@mbp-de-maxime.home>
2021-12-21 16:19:36 +00:00
Maxime Legendre
7649239b08 Revert docker as non root PR 2021-12-21 16:59:15 +01:00
bors[bot]
0e2f6ba1b6 Merge #2033
2033: Bug(FS): Consider empty pre-created directory as unexisting DB r=curquiza a=ManyTheFish

When the database directory was pre-created, we considered the DB invalid; we now accept creating a database in it.
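
An illustrative sketch of such a check (helper name is hypothetical, not the actual MeiliSearch code): an empty pre-created directory is treated the same as a missing one, so a new database can be initialized inside it.

```rust
use std::{fs, io, path::Path};

/// Returns true when the path is missing or is an empty directory,
/// i.e. when it is safe to create a fresh database there.
fn is_empty_db_dir(path: &Path) -> io::Result<bool> {
    if !path.exists() {
        return Ok(true);
    }
    Ok(fs::read_dir(path)?.next().is_none())
}
```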


Co-authored-by: Maxime Legendre <maximelegendre@mbp-de-maxime.home>
2021-12-21 15:12:07 +00:00
Clémentine Urquizar
f529c46598 Fix typo in error messages and comments 2021-12-21 16:01:38 +01:00
Maxime Legendre
1ba49d2ddb Bug(FS): Consider empty pre-created directory as unexisting DB 2021-12-21 15:30:11 +01:00
bors[bot]
1b5ca88231 Merge #2026
2026: Bug(auth): Parse YMD date r=curquiza a=ManyTheFish

Use NaiveDate to parse YMD date instead of NaiveDatetime

fix #2017
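
A hedged chrono sketch of the described fallback (function name and formats are illustrative, not the actual key-parsing code): try a full datetime first, then fall back to `NaiveDate` for a plain YMD value.

```rust
use chrono::{NaiveDate, NaiveDateTime};

/// Parse an expiration date, accepting either a full datetime or a bare
/// `YYYY-MM-DD` date (interpreted as midnight).
fn parse_expiry(s: &str) -> Option<NaiveDateTime> {
    NaiveDateTime::parse_from_str(s, "%Y-%m-%dT%H:%M:%S")
        .ok()
        .or_else(|| {
            NaiveDate::parse_from_str(s, "%Y-%m-%d")
                .ok()
                .map(|d| d.and_hms(0, 0, 0))
        })
}

fn main() {
    assert!(parse_expiry("2021-12-20").is_some());
    assert!(parse_expiry("2021-12-20T15:30:11").is_some());
}
```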


Co-authored-by: Maxime Legendre <maximelegendre@mbp-de-maxime.home>
2021-12-21 13:48:21 +00:00
Maxime Legendre
37329e0784 Bug(auth): Parse YMD date
Use NaiveDate to parse YMD date instead of NaiveDatetime

fix #2017
2021-12-20 15:30:11 +01:00
bors[bot]
eaff393c76 Merge #2025
2025: Fix security index creation r=ManyTheFish a=ManyTheFish

Forbid index creation on alternate routes when the action `index.create` is not given

fix #2024

Co-authored-by: Maxime Legendre <maximelegendre@MacBook-Pro-de-Maxime.local>
2021-12-20 14:04:28 +00:00
Maxime Legendre
a845cd8880 Fix(auth): Forbid index creation on alternates routes
Forbid index creation on alternate routes when the action `index.create` is not given

fix #2024
2021-12-20 14:48:18 +01:00
bors[bot]
845d3114ea Merge #2008
2008: bug(lib): fix get dumps bad error code r=curquiza a=MarinPostma

fix bad error code being returned when getting a dump status, and add a test
close #1994

Co-authored-by: Marin Postma <postma.marin@protonmail.com>
2021-12-15 18:58:17 +00:00
bors[bot]
287fa7ca74 Merge #2006 #2007
2006: chore(http): rename task types r=curquiza a=MarinPostma

Rename
- documentsAddition into documentAddition
- documentsPartial into documentPartial
- documentsDeletion into documentDeletion

close #1999


2007: bug(lib): ignore primary if already set on document addition r=curquiza a=MarinPostma

Ignore the primary key if it is already set on document updates. Add a test to verify the behaviour.

close #2002
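
An illustrative sketch of the described behaviour (types are hypothetical, not the MeiliSearch internals): a primary key sent with a document addition is only applied when the index does not have one yet.

```rust
struct Index {
    primary_key: Option<String>,
}

impl Index {
    /// Apply a requested primary key only if none is set; otherwise the
    /// existing key is kept and the requested one is ignored.
    fn apply_primary_key(&mut self, requested: Option<String>) {
        if self.primary_key.is_none() {
            self.primary_key = requested;
        }
    }
}

fn main() {
    let mut index = Index { primary_key: Some("id".into()) };
    index.apply_primary_key(Some("other".into()));
    assert_eq!(index.primary_key.as_deref(), Some("id"));
}
```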


Co-authored-by: Marin Postma <postma.marin@protonmail.com>
2021-12-15 16:55:40 +00:00
Marin Postma
80ed9654e1 chore(http): rename task types 2021-12-15 17:01:34 +01:00
Marin Postma
7ddab7ef31 bug(lib): fix get dumps bad error code 2021-12-15 16:58:05 +01:00
Marin Postma
d534a7f7c8 bug(lib): ignore primary if already set on document addition 2021-12-15 14:58:37 +01:00
bors[bot]
5af51c852c Merge #1989
1989: Extend API keys r=curquiza a=ManyTheFish

# Pull Request

## What does this PR do?

- Add API keys in snapshots
- Add API keys in dumps
- fix QA #1979

fix #1979
fix #1995
fix #2001
fix #2003

related to #1890

Co-authored-by: many <maxime@meilisearch.com>
2021-12-14 17:22:58 +00:00
many
ee7970f603 feat(auth): Extend API keys
- Add API keys in snapshots
- Add API keys in dumps
- Rename action indexes.add to indexes.create
- fix QA #1979

fix #1979
fix #1995
fix #2001
fix #2003
related to #1890
2021-12-14 17:33:39 +01:00
bors[bot]
5453877ca7 Merge #1982
1982: Set fail-fast to false in publish-binaries CI r=curquiza a=curquiza

This prevents the other jobs from failing if one of them fails.

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-12-14 15:37:26 +00:00
bors[bot]
879cc4ec26 Merge #1984
1984: Support boolean for the no-analytics flag r=Kerollmops a=Kerollmops

This PR fixes an issue with the `no-analytics` flag that was ignoring the value passed to it: a `no-analytics false` was understood as a plain `no-analytics` and effectively disabled the analytics instead of enabling them. I found [a closed issue about this exact behavior on the structopt repository](https://github.com/TeXitoi/structopt/issues/468) and applied it here.

I don't think we should update the documentation as it must have worked like this from the start of this project. I tested it on my machine and it is working great now. Thank you `@nicolasvienot` for this issue report.

Fixes #1983.
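
A small sketch of the intended semantics only (not the structopt plumbing, and the helper name is hypothetical): a bare `--no-analytics` disables analytics, while an explicit value such as `--no-analytics false` is parsed as a boolean and honoured.

```rust
/// Interpret the flag: absent means analytics stay on, a bare flag disables
/// them, and an explicit value is parsed as a boolean.
fn analytics_disabled(flag_present: bool, value: Option<&str>) -> bool {
    match (flag_present, value) {
        (false, _) => false,                                  // flag absent
        (true, None) => true,                                 // bare `--no-analytics`
        (true, Some(v)) => v.parse::<bool>().unwrap_or(true), // explicit value
    }
}

fn main() {
    assert!(analytics_disabled(true, None));
    assert!(!analytics_disabled(true, Some("false")));
}
```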

Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-12-08 16:41:46 +00:00
Clément Renault
6ac2475aba Fix the no-analytics flag in the tests 2021-12-08 12:02:18 +01:00
Clément Renault
47d5f659e0 Bump the structopt crate to 0.3.25 2021-12-08 11:24:40 +01:00
Clément Renault
8c9e51e94f Make sure that we can also specify the no-analytics flags with a boolean 2021-12-08 11:23:21 +01:00
Clémentine Urquizar
0da5aca9f6 Set fail-fast to false in publish-binaries CI 2021-12-08 09:54:13 +01:00
bors[bot]
9906db9e64 Merge #1978
1978: Fix of `release-v0.25.0` branch into `main` r=curquiza a=curquiza

The fixes in #1976 should be on main to be taken into account by 
```
curl -L https://install.meilisearch.com | sh
```

Co-authored-by: Yann Prono <yann.prono@nist.gov>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
2021-12-07 17:13:38 +00:00
bors[bot]
8096b568f0 Merge #1976 #1977
1976: Fix download-latest.sh r=curquiza a=Mcdostone

# Pull Request

## What does this PR do?
Fixes #1975

The script was broken because `grep` matches the word **draft** in the changelog of [v0.25.0rc0](https://github.com/meilisearch/MeiliSearch/releases/tag/v0.25.0rc0)

> Misc
>    Remove email address from the launch message (#1896) `@curquiza`
>    Remove release drafter workflow (#1882) `@curquiza`                           ⚠️👀


## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to MeiliSearch!
> Your product is awesome!


1977: Fix Dockerfile r=MarinPostma a=curquiza

Remove this error

<img width="830" alt="Capture d’écran 2021-12-07 à 17 00 46" src="https://user-images.githubusercontent.com/20380692/145063294-51ae2c50-2468-47e9-a891-542d824cad8e.png">


Co-authored-by: Yann Prono <yann.prono@nist.gov>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-12-07 16:27:51 +00:00
Clémentine Urquizar
2934a77832 Fix Dockerfile 2021-12-07 17:00:24 +01:00
Yann Prono
cf6cb938a6 update download-latest.sh 2021-12-07 16:37:22 +01:00
Yann Prono
8ff6b1b540 update download-latest.sh 2021-12-07 16:23:30 +01:00
bors[bot]
a938a9ab0f Merge #1974
1974: Update version for the next release (v0.25.0) r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-12-07 13:11:24 +00:00
Clémentine Urquizar
ae73386723 Update version for the next release (v0.25.0) 2021-12-07 14:00:43 +01:00
Clémentine Urquizar - curqui
34c8a859eb Merge pull request #1973 from meilisearch/drop-dump-v1
feat(dumps): drop dump V1 support
2021-12-07 14:00:06 +01:00
Marin Postma
23e35fa526 feat(dumps): drop dump V1 support 2021-12-07 10:36:27 +01:00
bors[bot]
82033f935e Merge #1970
1970: Use milli reexported tokenizer  r=curquiza a=ManyTheFish

Use milli reexported tokenizer instead of importing meilisearch-tokenizer dependency.

fix #1888

Co-authored-by: many <maxime@meilisearch.com>
2021-12-07 08:50:44 +00:00
many
ae2b0e7aa7 Use milli reexported tokenizer instead of importing meilisearch-tokenizer dependency 2021-12-06 17:18:28 +01:00
bors[bot]
948615537b Merge #1965
1965: Reintroduce engine version file r=MarinPostma a=irevoire

Right now, if you boot up MeiliSearch and point it to a DB directory created with a previous version of MeiliSearch, the existing indexes will be deleted. This [used to be](51d7c84e73) prevented by a startup check which would compare the current engine version against what was stored in the DB directory's version file, but this functionality seems to have been lost after a few refactorings of the code.

In order to go back to the old behavior we'll need to reintroduce the `VERSION` file that used to be present; I considered reusing the `metadata.json` file used in the dumps feature, but this seemed like the simpler approach. As the intent is just to restore functionality, the implementation is quite basic. I imagine that in the future we could build on this and do things like compatibility across major/minor versions and even migrating between formats.

This PR was made thanks to `@mbStavola` and is basically a port of his PR #1860 after a big refacto of the code #1796.

Closes #1840
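
A minimal sketch of such a startup check, assuming a `VERSION` file next to the database (error handling simplified; not the actual implementation):

```rust
use std::{fs, io, path::Path};

const VERSION_FILE: &str = "VERSION";

/// Refuse to open a DB directory written by a different engine version;
/// create the VERSION file when the directory is fresh.
fn check_version(db_path: &Path) -> io::Result<()> {
    let current = env!("CARGO_PKG_VERSION");
    let file = db_path.join(VERSION_FILE);
    match fs::read_to_string(&file) {
        Ok(stored) if stored.trim() != current => Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("DB was created with v{}, running v{}", stored.trim(), current),
        )),
        Ok(_) => Ok(()),
        // No version file yet: assume a fresh directory and write it.
        Err(e) if e.kind() == io::ErrorKind::NotFound => fs::write(&file, current),
        Err(e) => Err(e),
    }
}
```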

Co-authored-by: Matt Stavola <m.freitas@offensive-security.com>
2021-12-06 13:39:37 +00:00
Matt Stavola
a0e129304c feat(lib): Reintroduce engine version file
Right now, if you boot up MeiliSearch and point it to a DB directory created with a previous version of MeiliSearch, the existing indexes will be deleted. This used to be prevented by a startup check which would compare the current engine version against what was stored in the DB directory's version file, but this functionality seems to have been lost after a few refactorings of the code.
In order to go back to the old behavior we'll need to reintroduce the VERSION file that used to be present; I considered reusing the metadata.json file used in the dumps feature, but this seemed like the simpler approach. As the intent is just to restore functionality, the implementation is quite basic. I imagine that in the future we could build on this and do things like compatibility across major/minor versions and even migrating between formats.

This PR was made thanks to @mbStavola

Closes #1840
2021-12-06 14:30:56 +01:00
bors[bot]
8d72d538de Merge #1890
1890: Api keys r=MarinPostma a=ManyTheFish

# Pull Request

## API keys management and authorizations
Fixes #1867

## PR checklist

- [x] Test `/keys` routes
- [x] Implement `/keys` routes
- [x] Test API key authorization
- [x] Implement API key authorization
- [x] Rename Authentication header
- [x] default key creation

## Postponed in another PR
- dumps  (waiting for #1796)
- snapshot  (waiting for #1796)




Co-authored-by: many <maxime@meilisearch.com>
2021-12-06 13:23:23 +00:00
many
ffefd0caf2 feat(auth): API keys
implements:
https://github.com/meilisearch/specifications/blob/develop/text/0085-api-keys.md

- Add tests on API keys management route (meilisearch-http/tests/auth/api_keys.rs)
- Add tests checking authorizations on each meilisearch routes (meilisearch-http/tests/auth/authorization.rs)
- Implement API keys management routes (meilisearch-http/src/routes/api_key.rs)
- Create module to manage API keys and authorizations (meilisearch-auth)
- Reimplement GuardedData to extend authorizations (meilisearch-http/src/extractors/authentication/mod.rs)
- Change X-MEILI-API-KEY to Authorization Bearer (meilisearch-http/src/extractors/authentication/mod.rs); see the sketch after this list
- Change meilisearch routes to fit to the new authorization feature (meilisearch-http/src/routes/)

- close #1867
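
An illustrative sketch of the Authorization header change listed above (helper name is hypothetical, not the MeiliSearch extractor):

```rust
/// Read the API key from an `Authorization: Bearer <key>` header value
/// instead of the old `X-MEILI-API-KEY` header.
fn extract_api_key(authorization: Option<&str>) -> Option<&str> {
    authorization?.strip_prefix("Bearer ")
}

fn main() {
    assert_eq!(extract_api_key(Some("Bearer manythefish")), Some("manythefish"));
    assert_eq!(extract_api_key(Some("manythefish")), None);
    assert_eq!(extract_api_key(None), None);
}
```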
2021-12-06 09:52:41 +01:00
bors[bot]
fa196986c2 Merge #1796
1796: Feature branch: Task store r=irevoire a=MarinPostma

# Feature branch: Task Store

## Spec todo
https://github.com/meilisearch/specifications/blob/develop/text/0060-refashion-updates-apis.md

- [x] The update resource is renamed task. The names of existing API routes are also changed to reflect this change.
- [x]    Tasks are now also accessible as an independent resource of an index. GET - /tasks; GET - /tasks/:taskUid
- [x] The task uid is not incremented by index anymore. The sequence is generated globally.
- [x] A task_not_found error is introduced.
- [x] The format of the task object is updated.
- [x] updateId becomes uid.
- [x] Attributes of an error appearing in a failed task are now contained in a dedicated error object.
- [x] type is no longer an object. It now becomes a string containing the values of its name field previously defined in the type object.
- [x] The possible values for the type field are reworked to be more clear and consistent with our naming rules.
- [x] A details object is added to contain specific information related to a task payload that was previously displayed in the type nested object. Previous number key is renamed numberOfDocuments.
- [x] An indexUid field is added to give information about the related index on which the task is performed.
- [x] duration format has been updated to express an ISO 8601 duration.
- [x] processed status changes to succeeded.
- [x] startedProcessingAt is updated to startedAt.
- [x] processedAt is updated to finishedAt.
- [x] 202 Accepted requests previously returning an updateId are now returning a summarized task object.
- [x] MEILI_MAX_UDB_SIZE env var is updated to MEILI_MAX_TASK_DB_SIZE.
- [x] --max-udb-size cli option is updated to --max-task-db-size.
- [x] task object lists are now returned under a results array.
- [x] Each operation on an index (creation, update, deletion) is now asynchronous and represented by a task.

## Todo tech

- [x] Restore Snapshots
- [x] Restore dumps of documents
- [x] Implements the dump of updates
- [x] Error  handling
- [x] Fix stats
- [x] Restore the Analytics
- [x] [Add the new analytics](https://github.com/meilisearch/specifications/pull/92/files)
- [x] Fix tests
- [x] ~Deleting tasks when index is deleted (see below)~ see #1891 instead
- [x] Improve details for documents addition and deletion tasks
- [ ] Add integration test
- [ ] Test task store filtering
- [x] Rename `UuidStore` to `IndexMetaStore`, and simplify the trait.
- [x] Fix task store initialization: fill pending queue from hard state
- [x] Synchronously return error when creating an index with an invalid index_uid and add test
- [x] Task should be returned in decreasing uid + tests (on index task route)
- [x] Summarized task view
- [x] fix snapshot permissions


## Implementation

### Linked PRs
- #1889
- #1891
- #1892
- #1902
- #1906
- #1911
- #1914
- #1915
- #1916
- #1918
- #1924
- #1925
- #1926
- #1930
- #1936
- #1937
- #1942
- #1944
- #1945
- #1946
- #1947
- #1950
- #1951
- #1957
- #1959
- #1960
- #1961
- #1962
- #1964

### Linked PRs in milli:
- https://github.com/meilisearch/milli/pull/414
- https://github.com/meilisearch/milli/pull/409
- https://github.com/meilisearch/milli/pull/406
- https://github.com/meilisearch/milli/pull/418

### Issues
- close #1687
- close #1786
- close #1940
- close #1948
- close #1949
- close #1932
- close #1956

### Spec patches
- https://github.com/meilisearch/specifications/pull/90


Co-authored-by: Marin Postma <postma.marin@protonmail.com>
2021-12-03 11:36:53 +00:00
Marin Postma
a30e02c18c feat(all): Task store
implements:
https://github.com/meilisearch/specifications/blob/develop/text/0060-refashion-updates-apis.md

linked PR:

- #1889
- #1891
- #1892
- #1902
- #1906
- #1911
- #1914
- #1915
- #1916
- #1918
- #1924
- #1925
- #1926
- #1930
- #1936
- #1937
- #1942
- #1944
- #1945
- #1946
- #1947
- #1950
- #1951
- #1957
- #1959
- #1960
- #1961
- #1962
- #1964

- https://github.com/meilisearch/milli/pull/414
- https://github.com/meilisearch/milli/pull/409
- https://github.com/meilisearch/milli/pull/406
- https://github.com/meilisearch/milli/pull/418

- close #1687
- close #1786
- close #1940
- close #1948
- close #1949
- close #1932
- close #1956
2021-12-02 20:14:42 +01:00
bors[bot]
c9f3726447 Merge #1893
1893: Make matches work with numerical value r=MarinPostma a=Thearas

# Pull Request

## What does this PR do?

Implement #1883.

I have tested this PR with a unit test. It appears to be working properly:
![image](https://user-images.githubusercontent.com/44015907/141141082-dad8cd18-e803-408f-ad6a-c7a212b7ec88.png)

PTAL `@curquiza` 

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?


Co-authored-by: Thearas <thearas850@gmail.com>
2021-11-29 13:38:57 +00:00
bors[bot]
8363200fd7 Merge #1910
1910: After v0.24.0: import `stable` in `main` r=MarinPostma a=curquiza



Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
Co-authored-by: Guillaume Mourier <guillaume@meilisearch.com>
Co-authored-by: Irevoire <tamo@meilisearch.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-11-17 12:48:56 +00:00
bors[bot]
37548eb720 Merge #1896
1896: Remove email address from the message at the launch r=irevoire a=curquiza

I suggest removing this email address from the message at launch, since it can encourage people to think it is a support email address. Is this something we want, `@meilisearch/devrel-team`, since we mostly redirect people to the forum or Slack?

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-11-16 16:29:05 +00:00
bors[bot]
0a1d2ce231 Merge #1904
1904: Update mini-dashboard version to v0.1.5 r=curquiza a=curquiza

Update the mini-dashboard with its latest version (v0.1.5)

Checked with `@mdubus`, this replaces https://github.com/meilisearch/MeiliSearch/pull/1903

Fixes #1898 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-11-15 18:02:07 +00:00
Clémentine Urquizar
9d01c5d882 Update mini-dashboard version to v0.1.5 2021-11-15 18:54:55 +01:00
bors[bot]
53fc2edab3 Merge #1897
1897: Add ARM image for Docker to CI r=irevoire a=curquiza

Fixes #1315 

- [x] Publish MeiliSearch's docker image for `arm64`
- [x] Add `workflow_dispatch` event in case we need to re-trigger it after a failure without creating a new release
- [x] Use our own server to run the github runner since this CI is really slow (1h instead of 4h)
- [x] Open an issue for a refactor by merging both files in one file (https://github.com/meilisearch/MeiliSearch/issues/1901)

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-11-15 16:16:34 +00:00
Clémentine Urquizar
b7c5b78a61 Remove workflow_dispatch 2021-11-15 14:13:40 +01:00
Clémentine Urquizar
f081dc2001 Remove checkout 2021-11-12 21:31:18 +01:00
Clémentine Urquizar
2cf7daa227 Use self-hosted runner 2021-11-12 18:49:02 +01:00
Clémentine Urquizar
9d75fbc619 Fix docker meta job 2021-11-11 19:42:30 +01:00
Clémentine Urquizar
3b1b9a277b Remove context 2021-11-11 16:38:45 +01:00
Clémentine Urquizar
40e87b9544 Add checkout 2021-11-11 16:37:01 +01:00
Clémentine Urquizar
ded7922be5 Add context and platform 2021-11-11 16:30:16 +01:00
Clémentine Urquizar
11ef64ee43 Fix credentials 2021-11-11 16:02:32 +01:00
Clémentine Urquizar
5e6d7b7649 Add workflow dispatch event 2021-11-11 16:00:10 +01:00
Clémentine Urquizar
5fd9616b5f Add ARM image for Docker to CI 2021-11-11 15:58:44 +01:00
Clémentine Urquizar
a1227648ba Remove email address from the message at the launch 2021-11-11 14:36:45 +01:00
bors[bot]
919f4173cf Merge #1895
1895: Fix aggregated search events name r=irevoire a=gmourier



Co-authored-by: Guillaume Mourier <guillaume@meilisearch.com>
2021-11-11 09:35:34 +00:00
Guillaume Mourier
7c5aad4073 fix aggregated search event names 2021-11-11 01:38:10 +01:00
bors[bot]
d47ccd9199 Merge #1894
1894: Fix the 99th percentile in the analytics r=gmourier a=irevoire



Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-11-10 17:27:39 +00:00
Irevoire
cc5e884b34 fix the 99th percentile in the analytics 2021-11-10 18:26:38 +01:00
Thearas
ac5535055f Make matches work with numerical value 2021-11-10 23:10:30 +08:00
bors[bot]
15cb4dafa9 Merge #1882
1882: Remove release drafter r=curquiza a=curquiza

Remove release drafter since it's not used at the moment due to the specific release process of MeiliSearch.

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-11-09 14:36:32 +00:00
bors[bot]
8ca76d9fdf Merge #1884
1884: Remove Hacktoberfest section from CONTRIBUTING.md r=curquiza a=meili-bot

_This PR is auto-generated._

Remove Hacktoberfest section from CONTRIBUTING.md


Co-authored-by: meili-bot <74670311+meili-bot@users.noreply.github.com>
2021-11-09 13:19:30 +00:00
meili-bot
f62e52ec68 Update CONTRIBUTING.md 2021-11-09 14:15:50 +01:00
Clémentine Urquizar
bf01c674ea Remove release drafter 2021-11-08 14:49:50 +01:00
bors[bot]
e9b6a05b75 Merge #1878
1878: Add error object in task r=MarinPostma a=ManyTheFish

# Pull Request

## What does this PR do?
Fixes #1877

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Update error test
- [x] Remove flattening of errors during task serialization


Co-authored-by: many <maxime@meilisearch.com>
2021-11-04 17:16:30 +00:00
many
6bbc1b4316 Remove error flattening in task serialization 2021-11-04 17:40:28 +01:00
many
3c696da274 Update tests 2021-11-04 17:40:28 +01:00
bors[bot]
d9d6dee550 Merge #1873
1873: Change lacking errors r=ManyTheFish a=ManyTheFish



Co-authored-by: many <maxime@meilisearch.com>
2021-11-04 14:21:52 +00:00
many
cc6306c0e1 Update milli version 2021-11-04 14:57:45 +01:00
many
b59145385e Fix PR comments 2021-11-04 14:57:27 +01:00
bors[bot]
3f4e0ec971 Merge #1875 #1876
1875: Fix search post event and disk size analytics r=irevoire a=gmourier

- Branch POST search on the post_search aggregator
- Use largest disk `total_space` instead of `available_space` 

1876: Update SEGMENT_API_KEY r=irevoire a=gmourier

Branch it on our Segment production stack

Co-authored-by: Guillaume Mourier <guillaume@meilisearch.com>
2021-11-04 10:16:13 +00:00
bors[bot]
ec0716ddd1 Merge #1874
1874: Fix small typo in download-latest.sh r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2021-11-04 09:50:55 +00:00
Guillaume Mourier
6d6725b3b8 Update SEGMENT_API_KEY 2021-11-04 08:10:12 +01:00
Guillaume Mourier
6660be2cb7 Branch POST /search on the dedicated analytics aggregator 2021-11-04 08:03:48 +01:00
Guillaume Mourier
847fcb570b Use total_space of the largest disk instead of available_space 2021-11-04 08:03:11 +01:00
bors[bot]
4095ec462e Merge #1865
1865: Aggregate the search even when it fails r=MarinPostma a=irevoire



Co-authored-by: Tamo <tamo@meilisearch.com>
2021-11-03 16:49:05 +00:00
Clémentine Urquizar - curqui
f7f2421e71 Update download-latest.sh 2021-11-03 17:09:50 +01:00
many
b664a46e91 Update milli version 2021-11-03 16:11:20 +01:00
many
06e6eaa7b4 Remove useless Facet variant 2021-11-03 16:11:09 +01:00
many
30a094cbb2 Change lacking errors 2021-11-03 14:33:33 +01:00
Tamo
904bae98f8 send the analytics even when the search fails 2021-11-02 12:38:01 +01:00
bors[bot]
c32f13a909 Merge #1800
1800: Analytics r=irevoire a=irevoire

Closes #1784
Implements [this spec](https://github.com/meilisearch/specifications/blob/update-analytics-specs/text/0034-telemetry-policies.md) 

# Anonymous Analytics Policy

## 1. Functional Specification

### I. Summary

This specification describes an exhaustive list of anonymous metrics collected by the MeiliSearch binary. It also describes the tools we use for this collection and how we identify a Meilisearch instance.

### II. Motivation

At MeiliSearch, our vision is to provide an easy-to-use search solution that meets the essential needs of our users. At all times, we strive to understand our users better and meet their expectations in the best possible way.

Although we can gather needs and understand our users through several channels such as Github, Slack, surveys, interviews or roadmap votes, we realize that this is not enough to have a complete view of MeiliSearch usage and features adoption. By cross-referencing our product discovery phases with aggregated quantitative data, we want to make the product much better than what it is today. Our decision-making will be taken a step further to make a product that users love.

### III. Explanation

#### General Data Protection Regulation (GDPR)

The metrics collected are non-sensitive, non-personal and do not identify an individual or a group of individuals using MeiliSearch. The data collected is secured and anonymized. We do not collect any data from the values stored in the documents.

We, the MeiliSearch team, provide an email address so that users can request the removal of their data: privacy@meilisearch.com.<br>
Thanks to the unique identifier generated for their MeiliSearch installation (`Instance uuid` when launching MeiliSearch), we can remove the corresponding data from all the tools we describe below. Any questions regarding the management of the data collected can be sent to the email address as well.

#### Tools

##### Segment

The collected data is sent to [Segment](https://segment.com/). Segment is a platform for data collection and provides data management tools.

##### Amplitude

[Amplitude](https://amplitude.com/) is a tool for graphing and highlighting collected data. Segment feeds Amplitude so that we can build visualizations according to our needs.

-----------
# The `identify` call we send every hour:

## System Configuration `system`

This property allows us to gather essential information to better understand on which type of machine MeiliSearch is used. This allows us to better advise users on the machines to choose according to their data volume and their use-cases.

 - [x] `system` => Never changes but is still sent every hour
     - [x] distribution | On which distribution MeiliSearch is launched, eg: Arch Linux
     - [x] kernel_version | On which kernel version MeiliSearch is launched, eg: 5.14.10-arch1-1
     - [x] cores | How many cores does the machine have, eg: 24
     - [x] ram_size | Total capacity of the machine's ram. Expressed in `Kb`, eg: 33604210
     - [x] disk_size | Total capacity of the biggest disk. Expressed in `Kb`, eg: 336042103
     - [x] server_provider | Users can tell us on which provider MeiliSearch is hosted by filling the `MEILI_SERVER_PROVIDER` env var. This is also filled by our providers deploy scripts. e.g. GCP [cloud-config.yaml](56a7c2630c/scripts/providers/gcp/cloud-config.yaml (L33)), eg: gcp

## MeiliSearch Configuration

- [x] `context.app.version`: MeiliSearch version, eg: 0.23.0
- [x] `env`: `production` / `development`, eg: `production`
- [x] `has_snapshot`: Does the MeiliSearch instance have snapshots activated, eg: `true`

## MeiliSearch Statistics `stats`

 - [x] `stats`
     - [x] `database_size`: Size of indexed data. Expressed in `Kb`, eg: 180230
     - [x] `indexes_number`: Number of indexes, eg: 2
     - [x] `documents_number`: Number of indexed documents, eg: 165847
     - [x] `start_since_days`: How many days ago was the instance launched?, eg: 328

---------

- [x] Launched | This is the first event sent to mark that MeiliSearch is launched for the first time

---------

- [x] `Documents Searched POST`: The Documents Searched event is sent once an hour. The event's properties are averaged over all search operations during that time so as not to track everything and generate unnecessary noise.
  - [x] `user-agent`: Represents all the user-agents encountered on this endpoint during one hour, eg: `["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]`
  - [x] `requests`
      - [x] `99th_response_time`: The maximum latency, in ms, for the fastest 99% of requests, eg: `57ms`
      - [x] `total_suceeded`: The total number of succeeded search requests, eg: `3456`
      - [x] `total_failed`: The total number of failed search requests, eg: `24`
      - [x] `total_received`: The total number of received search requests, eg: `3480`
  - [x] `sort`
      - [x] `with_geoPoint`: Has the built-in sort rule `_geoPoint` been used?, eg: `true` / `false`
      - [x] `avg_criteria_number`: The average number of sort criteria among all the requests containing the sort parameter. `"sort": []` counts as 0, while not sending sort does not influence the average, eg: `2`
  - [x] `filter`
      - [x] `with_geoRadius`: Has the built-in filter rule `_geoRadius` been used?, eg: `true` / `false`
      - [x] `avg_criteria_number`: The average number of filter criteria among all the requests containing the filter parameter. `"filter": []` counts as 0, while not sending filter does not influence the average, eg: `4`
      - [x] `most_used_syntax`: The most used filter syntax among all the requests containing the filter parameter: `string` / `array` / `mixed`, eg: `mixed`
  - [x] `q`
      - [x] `avg_terms_number`: The average number of terms for the `q` parameter among all requests, eg: `5`
  - [x] `pagination`:
      - [x] `max_limit`: The maximum limit encountered among all requests, eg: `20`
      - [x] `max_offset`: The maximum offset encountered among all requests, eg: `1000`

---

- [x] `Documents Searched GET`: The Documents Searched event is sent once an hour. The event's properties are averaged over all search operations during that time so as not to track everything and generate unnecessary noise.
  - [x] `user-agent`: Represents all the user-agents encountered on this endpoint during one hour, eg: `["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]`
  - [x] `requests`
      - [x] `99th_response_time`: The maximum latency, in ms, for the fastest 99% of requests, eg: `57ms`
      - [x] `total_suceeded`: The total number of succeeded search requests, eg: `3456`
      - [x] `total_failed`: The total number of failed search requests, eg: `24`
      - [x] `total_received`: The total number of received search requests, eg: `3480`
  - [x] `sort`
      - [x] `with_geoPoint`: Has the built-in sort rule `_geoPoint` been used?, eg: `true` / `false`
      - [x] `avg_criteria_number`: The average number of sort criteria among all the requests containing the sort parameter. `"sort": []` counts as 0, while not sending sort does not influence the average, eg: `2`
  - [x] `filter`
      - [x] `with_geoRadius`: Has the built-in filter rule `_geoRadius` been used?, eg: `true` / `false`
      - [x] `avg_criteria_number`: The average number of filter criteria among all the requests containing the filter parameter. `"filter": []` counts as 0, while not sending filter does not influence the average, eg: `4`
      - [x] `most_used_syntax`: The most used filter syntax among all the requests containing the filter parameter: `string` / `array` / `mixed`, eg: `mixed`
  - [x] `q`
      - [x] `avg_terms_number`: The average number of terms for the `q` parameter among all requests, eg: `5`
  - [x] `pagination`:
      - [x] `max_limit`: The maximum limit encountered among all requests, eg: `20`
      - [x] `max_offset`: The maximum offset encountered among all requests, eg: `1000`

---

- [x] `Index Created`
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `primary_key`: The name of the field used as primary key if set, otherwise `null`, eg: `id`

---

- [x] `Index Updated`
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `primary_key`: The name of the field used as primary key if set, otherwise `null`, eg: `id`

---

- [x] `Documents Added`: The Documents Added event is sent once an hour. The event's properties are averaged over all POST /documents addition operations during that time so as not to track everything and generate unnecessary noise.
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `payload_type`: Represents all the `payload_type` encountered on this endpoint during one hour, eg: [`text/csv`]
  - [x] `primary_key`: The name of the field used as primary key if set, otherwise `null`, eg: `id`
  - [x] `index_creation`: Did an index creation happen, eg: `false`

---

- [x] `Documents Updated`: The Documents Updated event is sent once an hour. The event's properties are averaged over all PUT /documents addition operations during that time so as not to track everything and generate unnecessary noise.
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `payload_type`: Represents all the `payload_type` encountered on this endpoint during one hour, eg: [`application/json`]
  - [x] `primary_key`: The name of the field used as primary key if set, otherwise `null`, eg: `id`
  - [x] `index_creation`: Did an index creation happen, eg: `false`

---

- [x] Settings Updated
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `ranking_rules`
      - [x] `sort_position`: Position of the `sort` ranking rule if any, otherwise `null`, eg: `5`
  - [x] `sortable_attributes`
      - [x] `total`: Number of sortable attributes, eg: `3`
      - [x] `has_geo`: Indicate if `_geo` is set as a sortable attribute, eg: `false`
  - [x] `filterable_attributes`
      - [x] `total`: Number of filterable attributes, eg: `3`
      - [x] `has_geo`: Indicate if `_geo` is set as a filterable attribute, eg: `false`

---

- [x] `RankingRules Updated`
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `sort_position`: Position of the `sort` ranking rule if any, otherwise `null`, eg: `5`

---

- [x] `SortableAttributes Updated`
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `total`: Number of sortable attributes, eg: `3`
  - [x] `has_geo`: Indicate if `_geo` is set as a sortable attribute, eg: `false`

---

- [x] `FilterableAttributes Updated`
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `total`: Number of filterable attributes, eg: `3`
  - [x] `has_geo`: Indicate if `_geo` is set as a filterable attribute, eg: `false`

---

- [x] Dump Created
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]

---

Ensure the user-id file is well saved and loaded with:
- [x] the dumps
- [x]  the snapshots



- [x] Ensure the CLI uuid only shows if analytics are activated at launch **or already exist** (i.e. even if MeiliSearch was launched without analytics)

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-10-29 16:11:03 +00:00
marin postma
519093ea65 fix bad rebase 2021-10-29 17:32:49 +02:00
Tamo
bd49d1c4b5 fix one small bug 2021-10-29 17:25:56 +02:00
marin postma
2665c0099d clippy + fmt 2021-10-29 17:25:56 +02:00
marin postma
d65f055030 pass analytics into Arc instead of static ref 2021-10-29 17:25:55 +02:00
Tamo
66d87761b7 align the parameters in the launch resume 2021-10-29 17:25:55 +02:00
Tamo
ba69ad672a fix the timing issue 2021-10-29 17:25:55 +02:00
Tamo
7934e3956b replace all mutexes by channel 2021-10-29 17:25:55 +02:00
Guillaume Mourier
68fe93b7db add ranking_rules marker before sort_position 2021-10-29 17:25:55 +02:00
Tamo
efd0ea9e1e makes clippy happier 2021-10-29 17:25:55 +02:00
Tamo
6ef73eb226 fix all the single settings route and add the searchable attributes Updated event 2021-10-29 17:25:55 +02:00
Tamo
fc2f23d36c move the start_since_days to the root of the identify 2021-10-29 17:25:54 +02:00
Tamo
7c39fab453 move the user-agent out of the context in every request 2021-10-29 17:25:54 +02:00
Tamo
c5164c01c0 set the total of sortable attributes and filterable-attributes to 0 when not set 2021-10-29 17:25:54 +02:00
Tamo
351ad32d77 fix the index_creation boolean 2021-10-29 17:25:54 +02:00
Tamo
3ad8311bdd split the analytics in a module 2021-10-29 17:25:54 +02:00
Tamo
ea5ae2bae5 sort the imports 2021-10-29 17:25:54 +02:00
Tamo
72e3adc55e display an instance-id instead of a user-id 2021-10-29 17:25:54 +02:00
Tamo
b250392e8d remove the first - in the path to the db instance in the instance-id 2021-10-29 17:25:53 +02:00
Tamo
d8b0d68840 use a regex to count the number of filters instead of split + flatten 2021-10-29 17:25:53 +02:00
Tamo
c4737749ab bump segment to be able to display a user 2021-10-29 17:25:53 +02:00
Tamo
a1ab02f9fb remove some commented code 2021-10-29 17:25:53 +02:00
Tamo
bba64b32ca async_traits is not needed anymore 2021-10-29 17:25:53 +02:00
Tamo
9abd2aa9d7 make the analytics interval a const 2021-10-29 17:25:53 +02:00
Tamo
de35a9a605 use an official release of segment 2021-10-29 17:25:53 +02:00
Tamo
ed750e8792 fix start_since_day 2021-10-29 17:25:53 +02:00
Tamo
37ca50832c fix the sort position 2021-10-29 17:25:52 +02:00
Tamo
31c7a0105b fix a bug on the batch documents function 2021-10-29 17:25:52 +02:00
Tamo
ddab9eafa1 fix a typo 2021-10-29 17:25:52 +02:00
Tamo
76a4f86e0c rename user-id to instance-uid 2021-10-29 17:25:52 +02:00
Tamo
6b34318274 makes clippy happy 2021-10-29 17:25:52 +02:00
Tamo
5508c6c154 a bit of styling 2021-10-29 17:25:52 +02:00
Tamo
9a62ac0c94 send the analytics only once every hours 2021-10-29 17:25:52 +02:00
Tamo
01737ef847 remove all the debug prints 2021-10-29 17:25:51 +02:00
Tamo
3144b572c4 remove the debug mode in release 2021-10-29 17:25:51 +02:00
Tamo
10de92987a compile write_user_id only when the analytics are enabled 2021-10-29 17:25:51 +02:00
Tamo
c752c14c46 refactorize the dump and snapshot 2021-10-29 17:25:51 +02:00
Tamo
87a8bf5e96 write and load the user-id in the dumps 2021-10-29 17:25:51 +02:00
Tamo
ba14ea1243 plug the new batchers into the documents route 2021-10-29 17:25:51 +02:00
Tamo
9be90011c6 save the user-id in the config dir of the OS 2021-10-29 17:25:51 +02:00
Tamo
f9b14ca149 simplify the search batcher 2021-10-29 17:25:50 +02:00
Tamo
6591acfdfa rename the documents batchers 2021-10-29 17:25:50 +02:00
Tamo
e64ba122e1 factorize the code between the two documents batcher 2021-10-29 17:25:50 +02:00
Tamo
a9523146a3 simplify the into_events methods 2021-10-29 17:25:50 +02:00
Tamo
392ee86714 implement the documents batcher 2021-10-29 17:25:50 +02:00
Tamo
1d73f484f0 update the primary key when creating a new index 2021-10-29 17:25:50 +02:00
Tamo
cfcd3ae048 move the version to context.app 2021-10-29 17:25:50 +02:00
Tamo
5395041dcb fix the stats and stop sending events when no request happened 2021-10-29 17:25:49 +02:00
Tamo
40eabd50d1 integrate the search batcher in the search route 2021-10-29 17:25:49 +02:00
Tamo
35ffd0ec3a integrate the search batcher in the tick 2021-10-29 17:25:49 +02:00
Tamo
d3d76bf97a wip create a search batcher 2021-10-29 17:25:49 +02:00
Tamo
595ae42e94 update the name of the Launched event 2021-10-29 17:25:49 +02:00
Tamo
0667d940f9 update the name of nb_cores in the identify 2021-10-29 17:25:49 +02:00
Irevoire
75d1272325 log the dump creation 2021-10-29 17:25:49 +02:00
Irevoire
8e2d6cf87d add the content type to all the route 2021-10-29 17:25:48 +02:00
Irevoire
9e1bba40f7 do not print anything if no user id was found 2021-10-29 17:25:48 +02:00
Irevoire
f7bb499c28 send the first identify + launched for the first time events right away instead of batching them 2021-10-29 17:25:48 +02:00
Irevoire
b33b1ef3dd update the way of getting and saving the user-id to the file system 2021-10-29 17:25:48 +02:00
Irevoire
30aeda7a1a update the identify call to the latest spec version 2021-10-29 17:25:48 +02:00
Irevoire
22d9d660cc log all the required settings route 2021-10-29 17:25:48 +02:00
Irevoire
7524bfc07f log the all settings updated route 2021-10-29 17:25:48 +02:00
Tamo
bda7472880 log the documents updated route 2021-10-29 17:25:48 +02:00
Tamo
1ed05c6c07 log documents added 2021-10-29 17:25:47 +02:00
Tamo
0b3e0a59cb log index updated 2021-10-29 17:25:47 +02:00
Tamo
0616f68eb0 implements part of the search 2021-10-29 17:25:47 +02:00
Tamo
6b8e5a4c92 log the index created route 2021-10-29 17:25:47 +02:00
Tamo
d72c887422 makes the analytics available for all the routes 2021-10-29 17:25:47 +02:00
Tamo
664d09e86a makes the analytics works with the option and the feature 2021-10-29 17:25:47 +02:00
Tamo
e226b1a87f rewrite the main analytics module and the information sent in the tick 2021-10-29 17:25:42 +02:00
bors[bot]
b227666271 Merge #1855
1855: Change `authentication` error type to be  `auth` r=curquiza a=gmourier



Co-authored-by: many <maxime@meilisearch.com>
2021-10-28 15:29:21 +00:00
many
6fea050813 Change authentication error type to be auth 2021-10-28 16:57:48 +02:00
bors[bot]
cf67964133 Merge #1848
1848: Error format and Definition r=MarinPostma a=ManyTheFish



Co-authored-by: many <maxime@meilisearch.com>
2021-10-28 14:15:35 +00:00
bors[bot]
f8d04b11d5 Merge #1854
1854: Update version for the next release (v0.24.0) r=MarinPostma a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-28 14:03:55 +00:00
many
3a29cbf0ae Use milli v0.20.0 2021-10-28 15:59:06 +02:00
many
66f5de9703 Change missing authorization code 2021-10-28 15:56:57 +02:00
many
cbaca2b579 Fix PR comments 2021-10-28 15:42:42 +02:00
Clémentine Urquizar
a76d9b15c9 Update version for the next release (v0.24.0) 2021-10-28 12:24:49 +02:00
many
59636fa688 Pimp error where no document is provided 2021-10-28 12:13:51 +02:00
many
ff0908d3fa Ignore errors tests that show unrelated bugs 2021-10-28 11:41:59 +02:00
many
21f35762ca Fix content type test 2021-10-28 10:57:11 +02:00
many
7464720426 Fix some errors 2021-10-28 10:47:59 +02:00
bors[bot]
6e57c40c37 Merge #1853
1853: Create SECURITY.md r=curquiza a=CaroFG



Co-authored-by: CaroFG <48251481+CaroFG@users.noreply.github.com>
2021-10-27 15:54:48 +00:00
CaroFG
c8518f4ab2 Create SECURITY.md 2021-10-27 17:49:00 +02:00
bors[bot]
b9c061ab3d Merge #1852
1852: Add tests for mini-dashboard status and assets r=curquiza a=CuriousCorrelation

## Summary

Added tests for `mini-dashboard` status including assets.

## Ticket link

PR closes  #1767

Co-authored-by: CuriousCorrelation <CuriousCorrelation@protonmail.com>
2021-10-27 13:26:42 +00:00
bors[bot]
d905bbf961 Merge #1787
1787: Handle empty dump r=MarinPostma a=irevoire

Fixes #1701

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-10-27 12:47:45 +00:00
CuriousCorrelation
6641e7aa50 Add tests for mini-dashboard status and assets 2021-10-27 17:57:25 +05:30
many
61c15b69fb Change malformed_payload error 2021-10-27 11:13:12 +02:00
many
8ec0c4c913 Add bad_request error tests 2021-10-27 11:13:12 +02:00
bors[bot]
0a9d6e8210 Merge #1847
1847: Optimize document transform r=MarinPostma a=MarinPostma

integrate the optimization from https://github.com/meilisearch/milli/pull/402.

Optimize payload read by reading it into RAM first instead of streaming it. This means that the payload must fit into RAM, which should not be a problem.

Add BufWriter to the obkv writer to improve write speed.

I have measured a gain of 40-45% in speed after these optimizations.
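
A hedged sketch of the two optimizations described above (illustrative, not the milli code): buffer the whole payload in RAM instead of streaming it, and wrap the output writer in a `BufWriter`.

```rust
use std::io::{self, BufWriter, Read, Write};

/// Read the entire payload into memory first (it must fit in RAM), then
/// write through a buffer to avoid one syscall per small record.
fn transform_documents<R: Read, W: Write>(payload: &mut R, output: W) -> io::Result<()> {
    let mut buffer = Vec::new();
    payload.read_to_end(&mut buffer)?;

    let mut writer = BufWriter::new(output);
    writer.write_all(&buffer)?;
    writer.flush()
}
```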


Co-authored-by: marin postma <postma.marin@protonmail.com>
2021-10-26 15:28:58 +00:00
bors[bot]
0ed800b612 Merge #1830
1830: Add MEILI_SERVER_PROVIDER to Dockerfile r=irevoire a=curquiza

Add docker information in `MEILI_SERVER_PROVIDER` env variable

It does not impact the telemetry spec since it's an already existing variable used on our side.

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-26 14:01:06 +00:00
marin postma
4ac005b094 optimize document transform
fix error types

bump milli
2021-10-26 13:51:15 +02:00
Tamo
5e3a53b576 fix a bug in the generation of empty dumps 2021-10-25 14:17:57 +02:00
bors[bot]
e87146b0d9 Merge #1811
1811: Reducing ArmV8 binary build time with action-rs (cross build with Rust) r=curquiza a=patrickdung

This pull request is based on [discussion #1790](https://github.com/meilisearch/MeiliSearch/discussions/1790)

Note:
1) The binaries of this PR are additional to the existing binaries built.
The existing binaries would still be produced (by the existing GitHub workflow/action):

meilisearch-linux-amd64
meilisearch-linux-armv8
meilisearch-macos-amd64
meilisearch-windows-amd64.exe
meilisearch.deb

2) This PR produces these binaries. The name 'meilisearch-linux-aarch64' is used to avoid a naming conflict with 'meilisearch-linux-armv8'.

meilisearch-linux-aarch64
meilisearch-linux-aarch64-musl
meilisearch-linux-aarch64-stripped
meilisearch-linux-amd64-musl

3) If it's fine (in the next release), we should submit another PR to stop generating meilisearch-linux-armv8 (which can take two to three hours to build)

Co-authored-by: Patrick Dung <38665827+patrickdung@users.noreply.github.com>
2021-10-24 12:24:39 +00:00
Patrick Dung
5caa79df67 Update .github/workflows/publish-crossbuild.yml
Update to use the correct syntax

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-24 11:31:04 +00:00
Patrick Dung
d519e1036f Update .github/workflows/publish-crossbuild.yml
better naming

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-21 17:02:28 +00:00
Patrick Dung
19eebc0b0a Update .github/workflows/publish-crossbuild.yml
better naming

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-21 17:02:16 +00:00
Patrick Dung
5585020753 Update .github/workflows/publish-crossbuild.yml
better spacing

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-21 17:02:00 +00:00
Patrick Dung
ef7e7a8f11 Only generate aarch64 binary with action-rs 2021-10-21 00:40:46 +08:00
Clémentine Urquizar
eb91f27b65 Add MEILI_SERVER_PROVIDER to dockerfile 2021-10-20 17:53:43 +02:00
bors[bot]
24eef577c5 Merge #1822
1822: Tiny improvements in download-latest.sh r=irevoire a=curquiza

- Add a check on `$latest` to verify it's not empty. We currently have an issue on the Swift SDK where the version number seems not to be retrieved, but we don't know why https://github.com/meilisearch/meilisearch-swift/pull/216
- Replace some `"` by `'`
- Rename `$BINARY_NAME` to `$binary_name` to make it consistent with the other variables that are filled throughout the script

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-19 09:32:27 +00:00
Clémentine Urquizar
e7e4ccf74f Merge pull request #1817 from nfsec/patch-1
Improve RUNs in Dockerfile
2021-10-19 00:00:22 +02:00
Clémentine Urquizar
017ecf76e3 Replace double quotes by single ones 2021-10-18 23:39:07 +02:00
bors[bot]
1c9ceadd8d Merge #1824
1824: Fix indexation performance on mounted disk r=ManyTheFish a=ManyTheFish

We were creating all of our tempfiles in the data.ms directory, but when the database directory is stored on a mounted disk, tempfile I/O throughput decreases, impacting indexation time.

Now, only the persisting tempfiles will be created in the database directory. Other tempfiles will stay in the default tmpdir.
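
A sketch of that split using the tempfile crate (function names are illustrative): files that must later be persisted are created next to the database, scratch files go to the OS default temp directory.

```rust
use std::{fs::File, io, path::Path};

/// Created in the database directory so the final persist/rename stays on
/// the same filesystem as data.ms.
fn persisting_tempfile(db_path: &Path) -> io::Result<tempfile::NamedTempFile> {
    tempfile::NamedTempFile::new_in(db_path)
}

/// Created in the default tmpdir, which is usually a faster local disk
/// than a mounted database volume.
fn scratch_tempfile() -> io::Result<File> {
    tempfile::tempfile()
}
```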

Co-authored-by: many <maxime@meilisearch.com>
2021-10-18 12:42:47 +00:00
many
36ab7b3ebd Fix small typo 2021-10-18 14:17:32 +02:00
many
b4038597ba Keep persisting tmp files in database directory and put non-persisting tmp files in default tmp dir 2021-10-18 14:16:35 +02:00
bors[bot]
79817bd465 Merge #1813
1813: Apply highlight tags on numbers in the formatted search result output r=irevoire a=Jhnbrn90

This is my first ever Rust related PR. 

As described in #1758, I've attempted to highlight numbers correctly under the `_formatted` key.

Additionally, I added a test which should assert appropriate highlighting. 

I'm open to suggestions and improvements. 
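
A minimal sketch of the idea (not the actual milli formatter, and the function name is hypothetical): a matched number is converted to a string so highlight tags can wrap it in the `_formatted` output.

```rust
use serde_json::Value;

/// Wrap a matched value in highlight tags, turning numbers into strings
/// so they can carry the tags in `_formatted`.
fn highlight_value(value: &Value, pre_tag: &str, post_tag: &str) -> Value {
    match value {
        Value::Number(n) => Value::String(format!("{pre_tag}{n}{post_tag}")),
        Value::String(s) => Value::String(format!("{pre_tag}{s}{post_tag}")),
        other => other.clone(),
    }
}
```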


Co-authored-by: John Braun <john@brn.email>
2021-10-18 09:05:01 +00:00
Clémentine Urquizar
93ad8f04b5 Add check if $latest is empty 2021-10-16 17:36:36 +02:00
Clémentine Urquizar
e4cb7ed30f Tiny improvement in download-latest.sh 2021-10-16 17:23:50 +02:00
bors[bot]
b9e060423f Merge #1760
1760: Add option to use environment variable to increase rate limit r=curquiza a=nav1s

This closes #1655.

Added GITHUB_PAT environment variable and a comment to explain how to create it (I found the ```public_repo``` scope to be the best fit out of the available [scopes](https://docs.github.com/en/developers/apps/building-oauth-apps/scopes-for-oauth-apps#available-scopes)).

Co-authored-by: Aviv <avivnt@gmail.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-16 12:56:02 +00:00
Clémentine Urquizar
ead1ec3396 Update download-latest.sh 2021-10-16 14:55:10 +02:00
Clémentine Urquizar
306a8cd059 Update download-latest.sh 2021-10-16 14:55:06 +02:00
Patryk Krawaczyński
4c50deb4b7 2 RUNs less. 2021-10-16 11:37:01 +02:00
John Braun
be75426e64 Apply formatting according code style guidelines 2021-10-15 21:32:29 +02:00
Patryk Krawaczyński
23458de588 One RUN less
Align apk add commands between images.
2021-10-15 21:18:31 +02:00
bors[bot]
9fd849d48b Merge #1808
1808: Add Milestone Check status to bors.toml r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-14 13:09:32 +00:00
bors[bot]
2b28bc9510 Merge #1693
1693: Remove dataset r=Kerollmops a=curquiza

Fixes https://github.com/meilisearch/MeiliSearch/issues/1230

⚠️ Should be merged once https://github.com/meilisearch/documentation/pull/1109 is merged! ⚠️

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-14 12:56:23 +00:00
bors[bot]
d107b3f46c Merge #1759
1759: Feature docker as non root r=curquiza a=igaul

This closes #1757.
Adds a non-root user with the default name meiliuser.

Co-authored-by: gaul@pdx.edu <gaul@pdx.edu>
Co-authored-by: igaul <40813772+igaul@users.noreply.github.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-14 12:45:49 +00:00
Clémentine Urquizar
44149bec60 Merge branch 'main' into feature-docker-as-non-root 2021-10-14 14:45:28 +02:00
Clémentine Urquizar
f80b4fdedd Use pr_status instead of status 2021-10-14 14:21:42 +02:00
bors[bot]
fd4a90549b Merge #1803
1803: Import hotfix from `stable` into `main` (v0.23.1) r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
2021-10-14 11:43:45 +00:00
Clémentine Urquizar
b602a0836a Merge branch 'main' into stable 2021-10-14 13:43:21 +02:00
Clémentine Urquizar
7349fca607 Add Milestone Check status to bors.toml 2021-10-13 19:10:20 +02:00
bors[bot]
4bacc8e47d Merge #1806
1806: Fix csv content-type error message r=curquiza a=sanders41

Fixes #1805

I was not sure if the `application/csv` [here](23f11e355d/meilisearch-http/tests/content_type.rs (L29)) should also be changed? I'm thinking yes, but `application/csv` is a bad type.

Co-authored-by: Paul Sanders <psanders1@gmail.com>
2021-10-13 10:11:47 +00:00
igaul
7141f89c5f Split entrypoint and cmd 2021-10-12 11:44:59 -07:00
igaul
893654fb15 change default user name
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-12 11:42:08 -07:00
Paul Sanders
c9e1d054c7 Fix csv content-type error 2021-10-12 13:38:48 -04:00
bors[bot]
2e2eeb0a42 Merge #1801
1801: Update milli version to v0.17.3 to fix inference issue r=curquiza a=curquiza

Fixes #1798

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-12 14:47:06 +00:00
Clémentine Urquizar
0f342ac46e Update MeiliSearch version 2021-10-12 16:43:31 +02:00
Clémentine Urquizar
29ac324e90 Update milli version to v0.17.3 2021-10-12 16:12:16 +02:00
bors[bot]
23f11e355d Merge #1799
1799: Update README.md with Telemetry page r=curquiza a=maryamsulemani97

Updated readme to link the Telemetry page in the documentation 

Co-authored-by: maryamsulemani97 <90181761+maryamsulemani97@users.noreply.github.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-12 11:50:14 +00:00
Clémentine Urquizar
f09016b2bc Update README.md 2021-10-12 13:49:31 +02:00
Clémentine Urquizar
1fa3aeceeb Update README.md 2021-10-12 13:47:38 +02:00
maryamsulemani97
443afdc412 Update README.md 2021-10-12 14:37:19 +04:00
bors[bot]
776befc1f0 Merge #1797
1797: Import stable into main (v0.22.0) r=MarinPostma a=curquiza



Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
Co-authored-by: mpostma <postma.marin@protonmail.com>
Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-11 16:43:01 +00:00
Clémentine Urquizar
3edbc74430 Merge branch 'main' into stable 2021-10-11 18:30:10 +02:00
bors[bot]
3172c96042 Merge #1795
1795: Update milli version r=irevoire a=curquiza

Closes #1788 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-11 14:27:46 +00:00
Clémentine Urquizar
60473637fe Update milli version 2021-10-11 16:21:19 +02:00
bors[bot]
b969f34317 Merge #1793
1793: Remove memmap dependency r=curquiza a=palfrey

Fixes #1792. I was going to replace it with [memmap2](https://github.com/RazrFalcon/memmap2-rs), which should be a drop-in replacement, but I couldn't find anything that actually used it directly. It ends up being a dependency in [milli](https://github.com/meilisearch/milli), so I'm going to go there next and fix that.

Co-authored-by: Tom Parker-Shemilt <palfrey@tevp.net>
2021-10-11 08:29:40 +00:00
Tom Parker-Shemilt
6c46fbbc57 Remove memmap dependency 2021-10-10 22:33:40 +01:00
Patrick Dung
87115b02d9 Fixing the passing of environment variables 2021-10-10 03:27:51 +08:00
Patrick Dung
c614520405 Cross build with action-rs 2021-10-10 02:21:30 +08:00
John Braun
3756f5a0ca Add test for highlighting numbers 2021-10-08 15:07:45 +02:00
John Braun
5b4e4bb858 Highlight numbers (int) as string in formatted JSON 2021-10-08 15:07:15 +02:00
bors[bot]
dffd90b966 Merge #1783
1783: Fix too many open file error r=ManyTheFish a=ManyTheFish

- The prepare_for_closing() function wasn't called when an index was deleted; we now call it
- The index wasn't deleted when we couldn't insert its `uid` in `index_uuid_store`; we now clean it up

Fix #1736

Co-authored-by: many <maxime@meilisearch.com>
2021-10-07 15:48:07 +00:00
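To make the first point concrete, here is a minimal, hypothetical sketch (not the actual meilisearch code) of closing an index's heed environment before removing its files; the `delete_index` helper and its parameters are invented for the example, and the closing-event `wait()` call assumes the heed API behaves as the PR describes:

```rust
use std::{fs, path::PathBuf};

// Hypothetical helper: close the heed environment of a deleted index before
// removing its files, so the process does not keep file descriptors open
// ("too many open files").
fn delete_index(env: heed::Env, path: PathBuf) -> std::io::Result<()> {
    // Consume the environment and wait until every clone of it is dropped,
    // so the underlying LMDB files can safely be removed afterwards.
    env.prepare_for_closing().wait();
    fs::remove_dir_all(&path)
}
```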
many
a92a0c3ed3 Log the error instead of returning it when deletion fails 2021-10-07 17:38:22 +02:00
many
0774b1efa5 Close index's heed environment when index is deleted 2021-10-07 17:09:41 +02:00
many
7fc7eb7457 Make sure to remove newly created index if uid is already taken 2021-10-07 16:49:21 +02:00
bors[bot]
602a327aa8 Merge #1781
1781: Optimize build size r=irevoire a=MarinPostma

Remove debug symbols from the release build, and strip the binaries.

We used to need debug symbols for Sentry, but since it was removed with #1616, we don't need them anymore.

Shrinks the binary size from ~300MB to ~50MB on linux.

Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-10-07 13:31:03 +00:00
mpostma
14c6ae4735 disable stripping 2021-10-07 12:10:36 +02:00
mpostma
493a0e377d optimize build size 2021-10-07 11:49:52 +02:00
bors[bot]
5e3e108143 Merge #1769
1769: Enforce `Content-Type` header for routes requiring a body payload r=MarinPostma a=irevoire

closes #1665 

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-10-06 15:43:50 +00:00
Tamo
66dbd3cd34 makes clippy happy 2021-10-06 17:39:04 +02:00
Tamo
9a1e44dc78 Apply suggestion
- remove the payload_error_handler in favor of a PayloadError::from

- merge the two match branches into one

- make the accepted content type a const instead of recalculating it for every error
2021-10-06 17:15:47 +02:00
Tamo
37b267ffb3 duplicate the post document tests with the put verb 2021-10-06 17:15:47 +02:00
Tamo
dfa199f98f add content-type tests
fix typo

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-06 17:15:47 +02:00
Tamo
c6d107a05f makes the content-type mandatory for every route 2021-10-06 17:15:47 +02:00
bors[bot]
ddbcf449da Merge #1763
1763: Index tests r=MarinPostma a=MarinPostma

This PR aims to test more thoroughly the usage of indexes in the meilisearch database by writing unit tests.

work included:
- [x] Create index mock and stub methods
- [x] Test snapshot creation
- [x] Test Dumps
- [x] Test search

Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-10-06 14:39:53 +00:00
mpostma
9fa61439b1 fix clippy warning & unsafety 2021-10-06 14:51:46 +02:00
Clémentine Urquizar
02dd1ea29d Merge pull request #1771 from ferdi05/ferdi05-patch-contributing
Update CONTRIBUTING.md
2021-10-06 14:40:38 +02:00
mpostma
a38215de98 edit documentation 2021-10-06 14:35:18 +02:00
mpostma
85b5260d9d simple search unit test 2021-10-06 14:20:05 +02:00
mpostma
4b4ebad9a9 test dumps 2021-10-06 14:10:26 +02:00
mpostma
ece4c739f4 update store tests 2021-10-06 14:10:26 +02:00
mpostma
85ae34cf9f test snapshots 2021-10-06 14:10:23 +02:00
mpostma
0448f0ce56 handle panic in stubs 2021-10-06 14:09:04 +02:00
mpostma
4835d82a0b implement index mock 2021-10-06 14:09:01 +02:00
Ferdinand Boas
eaddee9fe2 Update CONTRIBUTING.md
typo + text improvement
all credits go to @guimachiavelli
2021-10-05 18:07:59 +02:00
bors[bot]
2190764162 Merge #1768
1768: Fix auth error r=irevoire a=MarinPostma

fix a small auth error that set the invalid token error token to "hello". This was invisible to the user because the invalid token is not returned.

thank you hawk-eye `@irevoire` 

Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-10-05 15:16:14 +00:00
mpostma
3b91764587 fix auth error 2021-10-05 09:09:40 +02:00
bors[bot]
0c3ec549f8 Merge #1764
1764: Bump Milli to improve geosearch errors r=curquiza a=irevoire

closes #1734 

`@curquiza,` your two examples still don't work because a filter must be composed of multiple operations; look at my screenshot to see what works and what doesn't.
Is this ok? 🤔 

`@gmourier,`
What do you think?

![image](https://user-images.githubusercontent.com/7032172/135846911-588f652d-16db-4d88-89fd-148640bac0f7.png)


And here is a screenshot with all the new errors that have been implemented

![image](https://user-images.githubusercontent.com/7032172/135854851-da469fef-0dd0-4ff1-b15e-89934ed8fb6f.png)

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-10-04 17:30:22 +00:00
Tamo
fca686e7f8 bump meilisearch 2021-10-04 13:52:37 +02:00
bors[bot]
607e28749a Merge #1755
1755: Fix mini dashboard r=curquiza a=anirudhRowjee

This commit is a fix for issue #1750.
As part of the changes to solve this issue, the following changes have
been made:
1. Route registration for static assets has been modified
2. The `mut` keyword on the `scope` has been removed.

Co-authored-by: Anirudh Rowjee <ani.rowjee@gmail.com>
2021-10-04 09:56:21 +00:00
Anirudh Rowjee
bffab21b10 Changes
1. Removed redundant scope registration
2021-10-04 14:47:05 +05:30
Aviv
d9165c7f77 Add option to use environment variable to increase rate limit 2021-10-03 13:07:40 +03:00
gaul@pdx.edu
2ef58ccce9 Fix formatting 2021-10-02 10:59:01 -07:00
gaul@pdx.edu
4009804221 Creates non root user to run Meilisearch in Dockerfile 2021-10-02 10:43:13 -07:00
Anirudh Rowjee
151f691609 Fixes #1750
This commit is a fix to issue #1750.
As a part of the changes to solve this issue, the following changes have
been made -
1. Route registration for static assets has been modified
2. the `mut` keyword on the `scope` has been removed.
2021-10-02 15:24:04 +05:30
bors[bot]
81993b6a15 Merge #1747
1747: Add new error types for document additions r=curquiza a=MarinPostma

Adds the missing errors for the documents routes, as specified.

close #1691
close #1690


Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-09-30 15:16:44 +00:00
mpostma
4eb3817b03 missing payload error 2021-09-30 16:58:13 +02:00
mpostma
18cb514073 invalid content type error 2021-09-30 16:58:13 +02:00
mpostma
ddd40d87a7 malformed payload error 2021-09-30 16:58:13 +02:00
mpostma
137272b8de empty content type error 2021-09-30 16:58:13 +02:00
bors[bot]
e400ae900d Merge #1746
1746: Do not commit transaction on failed updates r=irevoire a=Kerollmops

This PR fixes MeiliSearch always committing transactions, even when an update was invalid and the whole transaction should have been discarded. This was the source of a bug where an invalid update (with an invalid primary key) created an index with the specified primary key, when it should instead have failed and done nothing on the server.

Fixes #1735.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2021-09-30 13:46:43 +00:00
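To illustrate the idea behind the fix (an illustrative sketch assuming heed, not the actual meilisearch code): a write transaction is only persisted when it is explicitly committed, so returning early on a failed update aborts it and leaves the database untouched.

```rust
use heed::{Env, RwTxn};

// Illustrative sketch: run an update inside a write transaction and commit it
// only when the update succeeded. On error, the early return drops the
// transaction, which aborts it, so a failed update changes nothing on disk.
fn apply_update<F>(env: &Env, update: F) -> heed::Result<()>
where
    F: FnOnce(&mut RwTxn) -> heed::Result<()>,
{
    let mut wtxn = env.write_txn()?;
    update(&mut wtxn)?; // early return here: no commit, changes are rolled back
    wtxn.commit()
}
```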
Kerollmops
c388dca5ec Check that invalid updates do not create an index with a primary key 2021-09-30 15:46:04 +02:00
Kerollmops
6a691db7f8 Do not commit transaction on failed updates 2021-09-30 15:46:03 +02:00
bors[bot]
ed783b67ca Merge #1742
1742: Create dumps v3 r=irevoire a=MarinPostma

The introduction of the obkv document format has changed the format of the updates by removing the need to keep the document format of the addition (it is no longer necessary since updates are stored in the obkv format). This broke the dumps, which this PR solves by introducing a 3rd version of the dumps.

A v2 compat layer has been created that supports the import of v2 dumps into meilisearch. This allowed the compat code that existed elsewhere in meilisearch to be moved into the v2 module. The asc/desc patching is now only done for forward compatibility when loading a v2 dump, and the v3 dump writes the asc/desc with the new syntax.




Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-09-30 13:17:30 +00:00
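As a rough, toy illustration of what "obkv" means here (this is not the real `obkv` crate API, just invented example code): a document is stored as an ordered list of (field id, raw value bytes) pairs instead of a re-serialized JSON map, so additions can be kept in one binary representation regardless of the original payload format.

```rust
// Toy illustration only: an ordered (field id -> raw value bytes) document.
#[derive(Debug, Default)]
struct ObkvDoc {
    fields: Vec<(u16, Vec<u8>)>, // kept sorted by field id
}

impl ObkvDoc {
    fn insert(&mut self, field_id: u16, value: &[u8]) {
        match self.fields.binary_search_by_key(&field_id, |(id, _)| *id) {
            Ok(i) => self.fields[i].1 = value.to_vec(),
            Err(i) => self.fields.insert(i, (field_id, value.to_vec())),
        }
    }

    fn get(&self, field_id: u16) -> Option<&[u8]> {
        self.fields
            .binary_search_by_key(&field_id, |(id, _)| *id)
            .ok()
            .map(|i| self.fields[i].1.as_slice())
    }
}

fn main() {
    let mut doc = ObkvDoc::default();
    doc.insert(0, br#""Carol""#); // field 0 could map to "name"
    doc.insert(1, b"42");         // field 1 could map to "age"
    assert_eq!(doc.get(1), Some(&b"42"[..]));
    println!("{:?}", doc);
}
```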
mpostma
ee372a7b30 implement new dump v2 2021-09-30 14:49:13 +02:00
mpostma
66f39aaa92 fix dump v3 2021-09-30 14:49:13 +02:00
mpostma
03af99650d fix dumpv1 2021-09-30 14:49:13 +02:00
bors[bot]
44a2ff07b1 Merge #1697
1697: Make exec binary for M1 mac available for download r=irevoire a=k-nasa

## Why

fix: https://github.com/meilisearch/MeiliSearch/issues/1661

Currently, `download-latest.sh` does not support downloading an executable binary for M1 Macs.


## What

Download the x86 binary when `download-latest.sh` is run on an M1 Mac, because it can execute binaries targeting x86.

## Proof

I verified it like this.
I got an executable binary on an M1 Mac 💡

```sh
:) % arch
arm64

:) % ./download-latest.sh
Downloading MeiliSearch binary v0.21.1 for macos, architecture amd64...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   631  100   631    0     0   2035      0 --:--:-- --:--:-- --:--:--  2035
100 43.5M  100 43.5M    0     0  7826k      0  0:00:05  0:00:05 --:--:-- 9123k
MeiliSearch binary successfully downloaded as 'meilisearch' file.

Run it:
    $ ./meilisearch
Usage:
    $ ./meilisearch --help

:) % file ./meilisearch
./meilisearch: Mach-O 64-bit executable x86_64

:) % ./meilisearch --help # this is executable
meilisearch-http 0.21.1
...
...
```

Co-authored-by: k-nasa <htilcs1115@gmail.com>
Co-authored-by: nasa <htilcs1115@gmail.com>
2021-09-30 11:33:48 +00:00
bors[bot]
fb95540394 Merge #1748
1748: Add a link to join the cloud-hosted beta r=MarinPostma a=gmourier

The product team would like to add a link to invite users to fill out the form to test the closed beta of our cloud solution.

We have done the same thing on the documentation side https://github.com/meilisearch/documentation/pull/1148. 😇

Co-authored-by: Guillaume Mourier <guillaume@meilisearch.com>
2021-09-30 09:42:22 +00:00
Guillaume Mourier
be00fafb29 Add a link to join the cloud-hosted beta 2021-09-30 11:28:51 +02:00
bors[bot]
0bc376a17b Merge #1738
1738: Update Dockerfile following the refactor r=MarinPostma a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-30 08:08:19 +00:00
bors[bot]
05d5de47cb Merge #1737
1737: Update version for the next release (v0.23.0) r=irevoire a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-29 22:06:46 +00:00
bors[bot]
d3eb604d66 Merge #1740
1740: Add Hacktoberfest section to CONTRIBUTING.md r=curquiza a=meili-bot

_This PR is auto-generated._

Add Hacktoberfest section to CONTRIBUTING.md


Co-authored-by: meili-bot <74670311+meili-bot@users.noreply.github.com>
2021-09-29 17:55:47 +00:00
meili-bot
d6db210ef3 Update CONTRIBUTING.md 2021-09-29 19:54:12 +02:00
bors[bot]
80ca42922f Merge #1739
1739: Fix add document Content-Type r=curquiza a=MarinPostma

change the `Content-Type` guards of the document addition routes to match the specification.


Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-09-29 17:13:45 +00:00
mpostma
fe5df6d06f fix payload content type guards 2021-09-29 19:04:47 +02:00
Clémentine Urquizar
535442179e Update Dockerfile following the refactor 2021-09-29 18:54:14 +02:00
Clémentine Urquizar
b17dae9ac0 Update version for the next release (v0.23.0) 2021-09-29 18:40:35 +02:00
bors[bot]
5fad37aebd Merge #1711
1711: MeiliSearch refactor introducing OBKV format r=MarinPostma a=MarinPostma

This PR refactors multiple components of meilisearch and introduces the obkv document format to meilisearch

- [x] Split meilisearch-http and meilisearch-lib
- [x] Replace `IndexActor` and `UuidResolver` with `IndexResolver`
- [x] Remove mentions to Actor
- [x] Remove Actor traits to simplify code
- [x] Integrate obkv document format
- [x] Remove `Data`
- [x] Restore all route
- [x] Replace `Box<dyn error>` with `anyhow::Error`
- [x] Introduce update file store
- [x] Update file store error handling
- [x] Fix dumps
- [x] Fix snapshots
- [x] Fix tests
- [x] Update module documentation
- [x] add csv support (feat `@ManyTheFish` #1729)
- [x] add jsonl support
- [x] integrate geosearch (feat `@irevoire` #1725) 

Partially implements #1691 and #1690. The error handling is very basic for now; I will finish it in the next PR.

Some unit tests have been disabled; I will re-enable them ASAP, but they need a bit more work.

close #1531 


P.S: sorry for this monstrous PR :'(

Co-authored-by: mpostma <postma.marin@protonmail.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: many <maxime@meilisearch.com>
2021-09-29 14:38:55 +00:00
mpostma
311933614e bump milli to v0.17.0 2021-09-29 15:44:54 +02:00
mpostma
8fa6502b16 review changes 2021-09-29 14:17:41 +02:00
mpostma
1f537e1b60 jsonl support 2021-09-29 11:28:02 +02:00
mpostma
5bac65f8b8 add missing content type errors 2021-09-29 09:55:35 +02:00
mpostma
911630000f split csv and json document routes 2021-09-29 00:12:25 +02:00
mpostma
6e8a3fe8de move csv parsing to document_formats 2021-09-28 22:58:48 +02:00
many
2a14948123 Use an existing revision of milli 2021-09-28 22:30:34 +02:00
many
61e5eed493 Call csv specialized function 2021-09-28 22:29:26 +02:00
many
d30830a55c Add csv deserializer for documents 2021-09-28 22:28:13 +02:00
mpostma
102c46f88b clippy + fmt 2021-09-28 22:22:59 +02:00
mpostma
5fa9bc67d7 remove unused dependencies 2021-09-28 22:16:18 +02:00
mpostma
3503fbf7fe re-export milli from meilisearch_lib 2021-09-28 22:08:03 +02:00
mpostma
1cc733f801 fix get_info 2021-09-28 22:02:04 +02:00
mpostma
7a27cbcc78 rename RegisterUpdate to store::Update 2021-09-28 20:20:13 +02:00
mpostma
6f8e670dee move json reader to document_formats module 2021-09-28 20:13:26 +02:00
mpostma
df4e9f4e1e restore dump v1 2021-09-28 19:49:25 +02:00
mpostma
3747f5bdd8 replace unwraps with correct error 2021-09-28 19:29:14 +02:00
mpostma
56766cffc3 remove module level doc 2021-09-28 18:58:56 +02:00
mpostma
692c676625 fix tests 2021-09-28 18:57:36 +02:00
Tamo
ddfd7def35 add a TODO while waiting for the tests to be fixed 2021-09-28 18:17:56 +02:00
mpostma
bcaee4d179 fix uuid store size 2021-09-28 18:17:56 +02:00
Tamo
539a57026d fix the sort error messages 2021-09-28 14:50:26 +02:00
Tamo
654f49ccec [WIP] put milli on branch main 2021-09-28 14:50:26 +02:00
Tamo
c1376a9f2a add the geosearch to Meilisearch 2021-09-28 14:50:26 +02:00
mpostma
9ac999ca59 remove uuid resolver and index actor 2021-09-28 12:00:35 +02:00
mpostma
6a1964f146 restore dumps 2021-09-28 11:59:55 +02:00
mpostma
90018755c5 restore snapshots 2021-09-27 16:48:03 +02:00
bors[bot]
95211e2665 Merge #1703
1703: Trigger CodeCoverage manually instead of on each PR r=irevoire a=curquiza

Since no one is using it on PRs now, we would rather get a snapshot of the code coverage once (triggered manually) than on each PR.

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-27 13:55:05 +00:00
bors[bot]
7cd94e5486 Merge #1724
1724: Redo CONTRIBUTING.md r=curquiza a=curquiza

- Update `Development` section
- Update the `Git Guidelines` section
- Remove `Benchmarking & Profiling` -> done on the milli side at the moment
- Remove `Humans` -> synchronization job done by the manager of the core team at the moment
- Remove `Changelog` section -> done by the manager and the docs team 
- Remove `Documentation` section -> job done by the manager to synchronize both teams.

Fixes #1723 at the same time

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-27 13:20:29 +00:00
Clémentine Urquizar
35ef6a9204 Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:42:56 +02:00
Clémentine Urquizar
41272e7148 Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:42:38 +02:00
Clémentine Urquizar
8ff39d8432 Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:42:28 +02:00
Clémentine Urquizar
e22f57cae5 Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:32:47 +02:00
Clémentine Urquizar
67afb0e3fe Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:30:53 +02:00
Clémentine Urquizar
f56f31c277 Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:30:42 +02:00
Clémentine Urquizar
b7c4754be2 Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:30:37 +02:00
Clémentine Urquizar
3118f32221 Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:30:30 +02:00
Clémentine Urquizar
c9cc504e7b Update CONTRIBUTING.md 2021-09-27 14:18:15 +02:00
mpostma
b9d189bf12 restore document deletion routes 2021-09-24 15:21:07 +02:00
mpostma
c32012c44a restore settings updates 2021-09-24 14:55:57 +02:00
mpostma
dfce44fa3b rename data to meilisearch 2021-09-24 12:03:16 +02:00
mpostma
42a6260b65 introduce index resolver 2021-09-24 11:53:11 +02:00
mpostma
5353be74c3 refactor index actor 2021-09-22 15:07:04 +02:00
mpostma
12542bf922 refactor update actor 2021-09-22 11:52:50 +02:00
mpostma
def737edee refactor uuid resolver 2021-09-22 10:49:59 +02:00
mpostma
60518449fc split meilisearch-http and meilisearch-lib 2021-09-21 13:23:22 +02:00
mpostma
09d4e37044 split data and api keys 2021-09-20 15:31:03 +02:00
mpostma
e14640e530 refactor meilisearch 2021-09-20 14:54:20 +02:00
k-nasa
7f734f0a18 get x84 binary 2021-09-20 20:57:47 +09:00
nasa
cd2f886234 Update download-latest.sh
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-19 23:46:36 +09:00
nasa
dda4fe10b3 Update download-latest.sh
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-19 23:46:31 +09:00
bors[bot]
148a896cca Merge #1666
1666: Proposed fix for ARM binary on RHEL r=irevoire a=kappa-wingman

https://github.com/meilisearch/MeiliSearch/issues/1410

Co-authored-by: Kappa Wingman <64772920+kappa-wingman@users.noreply.github.com>
2021-09-15 14:16:25 +00:00
bors[bot]
e8eb589d6e Merge #1659
1659: deps: unify pest dependency r=MarinPostma a=happysalada

meilisearch depends on two different versions of pest.
This can be problematic for some build systems (e.g. NixOS).
Since the repo hasn't received an update in a while, use the later of the two pest versions in the meantime.

Context: this has been discussed previously https://github.com/meilisearch/MeiliSearch/issues/1273
meilisearch has been selected by ngi to be packaged for nixos. A patch can be applied to make the changes proposed in this PR. This PR intends to see how the maintainers of meilisearch would feel about the patch.

What was done.
- Add an override for the pest dependency in Cargo.toml.
- recreate the Cargo.lock with `cargo update`. This has had the side effect of updating some dependencies.

I ran the tests on darwin. My machine is quite old so I had 8 failures due to a timeout. None of the failures look like they are due to the new dependencies.

Checking the pest repo, it seems there are some recent commits; however, there is no clear date for when a new release might happen.

If this gets accepted, there is no need to do a new release, nixos can just target the new commit.

If you feel it's too much pain for not enough gain, no worries at all!

Co-authored-by: happysalada <raphael@megzari.com>
2021-09-15 07:56:03 +00:00
happysalada
770b6d25ae deps: unify pest dependency 2021-09-15 12:15:44 +09:00
Clémentine Urquizar
be10d90f07 Trigger CodeCoverage manually instead of on each PR 2021-09-14 18:59:20 +02:00
bors[bot]
fd3fa1ef45 Merge #1692
1692: Use tikv-jemallocator instead of jemallocator r=curquiza a=felixonmars

`jemallocator` has been abandoned for nearly two years, and `rustc`
itself moved to use `tikv-jemallocator` instead:
3965773ae7

Let's switch to a better maintained version.

Co-authored-by: Felix Yan <felixonmars@archlinux.org>
2021-09-14 16:26:27 +00:00
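For context, the switch does not change how the allocator is wired in. A minimal sketch (assuming `tikv-jemallocator` is declared as a dependency in Cargo.toml):

```rust
// Minimal sketch: opting into jemalloc via the maintained crate. The only
// difference from the abandoned `jemallocator` crate is the crate/type path.
use tikv_jemallocator::Jemalloc;

#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;

fn main() {
    // Every heap allocation below now goes through jemalloc.
    let v: Vec<u64> = (0..1_000).collect();
    println!("allocated {} integers", v.len());
}
```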
Felix Yan
a57943b77e Use tikv-jemallocator instead of jemallocator
`jemallocator` has been abandoned for nearly two years, and `rustc`
itself moved to use `tikv-jemallocator` instead:
3965773ae7

Let's switch to a better maintained version.
2021-09-14 18:30:24 +03:00
bors[bot]
6fafdb7711 Merge #1651 #1676 #1684
1651: Use reset_sortable_fields r=Kerollmops a=shekhirin

Resolves https://github.com/meilisearch/MeiliSearch/issues/1635

1676: Add curl binary to final stage image r=curquiza a=ook

Reference: #1673 

Changes: * add `curl` binary to final docker Meilisearch image.

For metrics, Docker's layer management quirks mean this addition actually shrinks the image from 319MB to 315MB:

```
☁  MeiliSearch [feature/1673-add-curl-to-docker-image]  docker image ls
REPOSITORY                                                         TAG                  IMAGE ID       CREATED         SIZE
getmeili/meilisearch                                               0.22.0_ook_1673      938e239ad989   2 hours ago     315MB
getmeili/meilisearch                                               latest               258fa3aa1230   6 days ago      319MB
```

1684: bump dependencies r=MarinPostma a=MarinPostma

Bump meilisearch dependencies.

We still depend on custom patches that have been upgraded along the way.

Co-authored-by: Alexey Shekhirin <a.shekhirin@gmail.com>
Co-authored-by: Thomas Lecavelier <thomas@followanalytics.com>
Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-09-13 13:20:29 +00:00
mpostma
0f7625e29a bump dependencies 2021-09-13 15:17:08 +02:00
bors[bot]
7afccfc92d Merge #1683
1683: Better dependencies cache for CI r=curquiza a=shekhirin



Co-authored-by: Alexey Shekhirin <a.shekhirin@gmail.com>
2021-09-13 12:48:35 +00:00
bors[bot]
6e59da26d4 Merge #1700
1700: `stable` into `main` after v0.22.0 r=MarinPostma a=curquiza



Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-13 12:32:23 +00:00
k-nasa
5b995ba080 feat: Make exec binary for M1 mac available for download 2021-09-11 23:11:33 +09:00
Clémentine Urquizar
168a1315de Remove dataset 2021-09-10 10:35:10 +02:00
Alexey Shekhirin
bbb012ad0f chore(ci): use smarter dependencies cache 2021-09-07 19:56:38 +03:00
Alexey Shekhirin
efa69875d9 refactor(http): use reset_sortable_fields 2021-09-07 15:04:44 +03:00
Thomas Lecavelier
59797bfc7b meilisearch/MeiliSearch#1673 Add curl binary to final stage image 2021-09-06 19:35:23 +02:00
Kappa Wingman
b444e2e074 Proposed fix for ARM binary on RHEL
https://github.com/meilisearch/MeiliSearch/issues/1410
2021-09-02 01:01:46 +00:00
160 changed files with 15527 additions and 27233 deletions


@@ -1,13 +0,0 @@
name-template: 'v$RESOLVED_VERSION'
tag-template: 'v$RESOLVED_VERSION'
version-template: '0.21.0-alpha.$PATCH'
exclude-labels:
- 'skip-changelog'
template: |
## Changes
$CHANGES
no-changes-template: 'Changes are coming soon 😎'
sort-direction: 'ascending'
version-resolver:
default: patch


@@ -1,7 +1,6 @@
---
on:
pull_request:
types: [review_requested, ready_for_review]
workflow_dispatch:
name: Execute code coverage


@@ -9,6 +9,7 @@ jobs:
name: Publish for ${{ matrix.os }}
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [ubuntu-18.04, macos-latest, windows-latest]
include:
@@ -42,11 +43,13 @@ jobs:
runs-on: ubuntu-18.04
steps:
- uses: actions/checkout@v2
- uses: uraimo/run-on-arch-action@v1.0.7
- uses: uraimo/run-on-arch-action@v2.1.1
id: runcmd
with:
architecture: aarch64 # aka ARMv8
distribution: ubuntu18.04
arch: aarch64 # aka ARMv8
distro: ubuntu18.04
env: |
JEMALLOC_SYS_WITH_LG_PAGE: 16
run: |
apt update
apt install -y curl gcc make


@@ -0,0 +1,76 @@
name: Publish aarch64 binary
on:
release:
types: [published]
env:
CARGO_TERM_COLOR: always
jobs:
publish-aarch64:
name: Publish to Github
runs-on: ${{ matrix.os }}
continue-on-error: false
strategy:
fail-fast: false
matrix:
include:
- build: aarch64
os: ubuntu-18.04
target: aarch64-unknown-linux-gnu
linker: gcc-aarch64-linux-gnu
use-cross: true
asset_name: meilisearch-linux-aarch64
steps:
- name: Checkout repository
uses: actions/checkout@v2
- name: Installing Rust toolchain
uses: actions-rs/toolchain@v1
with:
toolchain: stable
profile: minimal
target: ${{ matrix.target }}
override: true
- name: APT update
run: |
sudo apt update
- name: Install target specific tools
if: matrix.use-cross
run: |
sudo apt-get install -y ${{ matrix.linker }}
- name: Configure target aarch64 GNU
if: matrix.target == 'aarch64-unknown-linux-gnu'
## Environment variable is not passed using env:
## LD gold won't work with MUSL
# env:
# JEMALLOC_SYS_WITH_LG_PAGE: 16
# RUSTFLAGS: '-Clink-arg=-fuse-ld=gold'
run: |
echo '[target.aarch64-unknown-linux-gnu]' >> ~/.cargo/config
echo 'linker = "aarch64-linux-gnu-gcc"' >> ~/.cargo/config
echo 'JEMALLOC_SYS_WITH_LG_PAGE=16' >> $GITHUB_ENV
echo RUSTFLAGS="-Clink-arg=-fuse-ld=gold" >> $GITHUB_ENV
- name: Cargo build
uses: actions-rs/cargo@v1
with:
command: build
use-cross: ${{ matrix.use-cross }}
args: --release --target ${{ matrix.target }}
- name: List target output files
run: ls -lR ./target
- name: Upload the binary to release
uses: svenstaro/upload-release-action@v1-release
with:
repo_token: ${{ secrets.PUBLISH_TOKEN }}
file: target/${{ matrix.target }}/release/meilisearch
asset_name: ${{ matrix.asset_name }}
tag: ${{ github.ref }}


@@ -0,0 +1,105 @@
name: Publish images to Docker Hub
on:
push:
# Will run for every tag pushed except `latest`
# When the `latest` git tag is created with this [CI](../latest-git-tag.yml)
# we don't need to create a Docker `latest` image again.
# The `latest` Docker image push is already done in this CI when releasing a stable version of Meilisearch.
tags-ignore:
- latest
# Both `schedule` and `workflow_dispatch` build the nightly tag
schedule:
- cron: '0 23 * * *' # Every day at 11:00pm
workflow_dispatch:
jobs:
docker:
runs-on: docker
steps:
- uses: actions/checkout@v3
# If we are running a cron or manual job ('schedule' or 'workflow_dispatch' event), it means we are publishing the `nightly` tag, so not considered stable.
# If we have pushed a tag, and the tag has the v<number>.<number>.<number> format, it means we are publishing an official release, so considered stable.
# In this situation, we need to set `output.stable` to create/update the following tags (additionally to the `vX.Y.Z` Docker tag):
# - a `vX.Y` (without patch version) Docker tag
# - a `latest` Docker tag
# For any other tag pushed, this is not considered stable.
- name: Define if stable and latest release
id: check-tag-format
env:
# To avoid request limit with the .github/scripts/is-latest-release.sh script
GITHUB_PATH: ${{ secrets.MEILI_BOT_GH_PAT }}
run: |
escaped_tag=$(printf "%q" ${{ github.ref_name }})
echo "latest=false" >> $GITHUB_OUTPUT
if [[ ${{ github.event_name }} != 'push' ]]; then
echo "stable=false" >> $GITHUB_OUTPUT
elif [[ $escaped_tag =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
echo "stable=true" >> $GITHUB_OUTPUT
echo "latest=$(sh .github/scripts/is-latest-release.sh)" >> $GITHUB_OUTPUT
else
echo "stable=false" >> $GITHUB_OUTPUT
fi
# Check only the validity of the tag for stable releases (not for pre-releases or other tags)
- name: Check release validity
if: steps.check-tag-format.outputs.stable == 'true'
run: bash .github/scripts/check-release.sh
- name: Set build-args for Docker buildx
id: build-metadata
run: |
# Extract commit date
commit_date=$(git show -s --format=%cd --date=iso-strict ${{ github.sha }})
echo "date=$commit_date" >> $GITHUB_OUTPUT
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Login to Docker Hub
uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Docker meta
id: meta
uses: docker/metadata-action@v4
with:
images: getmeili/meilisearch
# Prevent `latest` to be updated for each new tag pushed.
# We need latest and `vX.Y` tags to only be pushed for the stable Meilisearch releases.
flavor: latest=false
tags: |
type=ref,event=tag
type=raw,value=nightly,enable=${{ github.event_name != 'push' }}
type=semver,pattern=v{{major}}.{{minor}},enable=${{ steps.check-tag-format.outputs.stable == 'true' }}
type=raw,value=latest,enable=${{ steps.check-tag-format.outputs.stable == 'true' && steps.check-tag-format.outputs.latest == 'true' }}
- name: Build and push
uses: docker/build-push-action@v4
with:
push: true
platforms: linux/amd64,linux/arm64
tags: ${{ steps.meta.outputs.tags }}
build-args: |
COMMIT_SHA=${{ github.sha }}
COMMIT_DATE=${{ steps.build-metadata.outputs.date }}
GIT_TAG=${{ github.ref_name }}
# /!\ Don't touch this without checking with Cloud team
- name: Send CI information to Cloud team
# Do not send if nightly build (i.e. 'schedule' or 'workflow_dispatch' event)
if: github.event_name == 'push'
uses: peter-evans/repository-dispatch@v2
with:
token: ${{ secrets.MEILI_BOT_GH_PAT }}
repository: meilisearch/meilisearch-cloud
event-type: cloud-docker-build
client-payload: '{ "meilisearch_version": "${{ github.ref_name }}", "stable": "${{ steps.check-tag-format.outputs.stable }}" }'


@@ -1,22 +0,0 @@
---
on:
release:
types: [released]
name: Publish latest image to Docker Hub
jobs:
build:
runs-on: ubuntu-18.04
steps:
- uses: actions/checkout@v2
- name: Check if current release is latest
run: echo "##[set-output name=is_latest;]$(sh .github/is-latest-release.sh)"
id: release
- name: Publish to Registry
if: steps.release.outputs.is_latest == 'true'
uses: elgohr/Publish-Docker-Github-Action@master
with:
name: getmeili/meilisearch
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}


@@ -1,22 +0,0 @@
---
on:
push:
tags:
- '*'
name: Publish tagged image to Docker Hub
jobs:
build:
runs-on: ubuntu-18.04
steps:
- uses: actions/checkout@v2
- name: Publish to Registry
uses: elgohr/Publish-Docker-Github-Action@master
env:
COMMIT_SHA: ${{ github.sha }}
with:
name: getmeili/meilisearch
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
tag_names: true


@@ -1,16 +0,0 @@
name: Release Drafter
on:
push:
branches:
- main
jobs:
update_release_draft:
runs-on: ubuntu-latest
steps:
- uses: release-drafter/release-drafter@v5
with:
config-name: release-draft-template.yml
env:
GITHUB_TOKEN: ${{ secrets.RELEASE_DRAFTER_TOKEN }}


@@ -11,6 +11,7 @@ on:
env:
CARGO_TERM_COLOR: always
RUST_BACKTRACE: 1
jobs:
tests:
@@ -23,12 +24,7 @@ jobs:
steps:
- uses: actions/checkout@v2
- name: Cache dependencies
uses: actions/cache@v2
with:
path: |
~/.cargo
./target
key: ${{ matrix.os }}-${{ hashFiles('Cargo.lock') }}
uses: Swatinem/rust-cache@v1.3.0
- name: Run cargo check without any default features
uses: actions-rs/cargo@v1
with:
@@ -45,19 +41,14 @@ jobs:
runs-on: ubuntu-18.04
steps:
- uses: actions/checkout@v2
- name: Cache dependencies
uses: actions/cache@v2
with:
path: |
~/.cargo
./target
key: ${{ matrix.os }}-${{ hashFiles('Cargo.lock') }}
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: stable
override: true
components: clippy
- name: Cache dependencies
uses: Swatinem/rust-cache@v1.3.0
- name: Run cargo clippy
uses: actions-rs/cargo@v1
with:
@@ -69,18 +60,13 @@ jobs:
runs-on: ubuntu-18.04
steps:
- uses: actions/checkout@v2
- name: Cache dependencies
uses: actions/cache@v2
with:
path: |
~/.cargo
./target
key: ${{ matrix.os }}-${{ hashFiles('Cargo.lock') }}
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: nightly
override: true
components: rustfmt
- name: Cache dependencies
uses: Swatinem/rust-cache@v1.3.0
- name: Run cargo fmt
run: cargo fmt --all -- --check


@@ -1,112 +1,80 @@
# Contributing
First, thank you for contributing to MeiliSearch! The goal of this document is to
provide everything you need to start contributing to MeiliSearch. The
following TOC is sorted progressively, starting with the basics and
expanding into more specifics.
First, thank you for contributing to MeiliSearch! The goal of this document is to provide everything you need to start contributing to MeiliSearch.
<!-- MarkdownTOC autolink="true" style="ordered" indent=" " -->
1. [Assumptions](#assumptions)
1. [Your First Contribution](#your-first-contribution)
1. [Change Control](#change-control)
1. [Git Branches](#git-branches)
1. [Git Commits](#git-commits)
1. [Style](#style)
1. [Github Pull Requests](#github-pull-requests)
1. [Reviews & Approvals](#reviews--approvals)
1. [Merge Style](#merge-style)
1. [CI](#ci)
1. [Development](#development)
1. [Setup](#setup)
1. [Testing](#testing)
1. [Benchmarking](#benchmarking--profiling)
1. [Humans](#humans)
1. [Documentation](#documentation)
1. [Changelog](#changelog)
<!-- /MarkdownTOC -->
- [Assumptions](#assumptions)
- [How to Contribute](#how-to-contribute)
- [Development Workflow](#development-workflow)
- [Git Guidelines](#git-guidelines)
## Assumptions
1. **You're familiar with [Github](https://github.com) and the [pull request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)
workflow.**
2. **You've read the MeiliSearch [docs](https://docs.meilisearch.com).**
1. **You're familiar with [Github](https://github.com) and the [Pull Requests](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)(PR) workflow.**
2. **You've read the MeiliSearch [documentation](https://docs.meilisearch.com).**
3. **You know about the [MeiliSearch community](https://docs.meilisearch.com/learn/what_is_meilisearch/contact.html).
Please use this for help.**
## Your First Contribution
## How to Contribute
1. Ensure your change has an issue! Find an
[existing issue](https://github.com/meilisearch/meilisearch/issues/) or [open a new issue](https://github.com/meilisearch/meilisearch/issues/new).
* This is where you can get a feel if the change will be accepted or not.
2. Once approved, [fork the MeiliSearch repository](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) in your own
Github account.
2. Once approved, [fork the MeiliSearch repository](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) in your own Github account.
3. [Create a new Git branch](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-and-deleting-branches-within-your-repository)
4. Review the MeiliSearch [workflow](#workflow) and [development](#development).
5. Make your changes.
6. [Submit the branch as a pull request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request-from-a-fork) to the main MeiliSearch
repo. A MeiliSearch team member should comment and/or review your pull request
with a few days. Although, depending on the circumstances, it may take
longer.
4. Review the [Development Workflow](#development-workflow) section that describes the steps to maintain the repository.
5. Make your changes on your branch.
6. [Submit the branch as a Pull Request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request-from-a-fork) pointing to the `main` branch of the MeiliSearch repository. A maintainer should comment and/or review your Pull Request within a few days. Although depending on the circumstances, it may take longer.
## Change Control
## Development Workflow
### Setup and run MeiliSearch
```bash
cargo run --release
```
We recommend using the `--release` flag to test the full performance of MeiliSearch.
### Test
```bash
cargo test
```
If you get a "Too many open files" error you might want to increase the open file limit using this command:
```bash
ulimit -Sn 3000
```
## Git Guidelines
### Git Branches
_All_ changes must be made in a branch and submitted as [pull requests](#pull-requests).
MeiliSearch does not adopt any type of branch naming style, but please use something
descriptive of your changes.
All changes must be made in a branch and submitted as PR.
We do not enforce any branch naming style, but please use something descriptive of your changes.
### Git Commits
#### Style
As minimal requirements, your commit message should:
- be capitalized
- not finish by a dot or any other punctuation character (!,?)
- start with a verb so that we can read your commit message this way: "This commit will ...", where "..." is the commit message.
e.g.: "Fix the home page button" or "Add more tests for create_index method"
Please ensure your commits are small and focused; they should tell a story of
your change. This helps reviewers to follow your changes, especially for more
complex changes.
Familiarise yourself with [How to Write a Git Commit Message](https://chris.beams.io/posts/git-commit/).
We don't follow any other convention, but if you want to use one, we recommend [the Chris Beams one](https://chris.beams.io/posts/git-commit/).
### Github Pull Requests
Once your changes are ready you must submit your branch as a pull request.
Some notes on GitHub PRs:
#### Reviews & Approvals
- All PRs must be reviewed and approved by at least one maintainer.
- The PR title should be accurate and descriptive of the changes.
- [Convert your PR as a draft](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/changing-the-stage-of-a-pull-request) if your changes are a work in progress: no one will review it until you pass your PR as ready for review.<br>
The draft PRs are recommended when you want to show that you are working on something and make your work visible.
- The branch related to the PR must be **up-to-date with `main`** before merging. Fortunately, this project uses [Bors](https://github.com/bors-ng/bors-ng) to automatically enforce this requirement without the PR author having to rebase manually.
All pull requests must be reviewed and approved by at least one MeiliSearch team
member.
<hr>
#### Merge Style
All pull requests are squashed and merged. We generally discourage large pull
requests that are over 300-500 lines of diff. If you would like to propose
a change that is larger we suggest coming onto our chat channel and
discuss it with one of our engineers. This way we can talk through the
solution and discuss if a change that large is even needed! This overall
will produce a quicker response to the change and likely produce code that
aligns better with our process.
## Development
### Setup
See the [MeiliSearch Docs](https://docs.meilisearch.com/reference/features/installation.html) for how to set up a development environment.
### Benchmarking & Profiling
We do not yet do any benchmarking, nor have we formalised our profiling. If you'd like to work on this please get in touch!
## Humans
After making your change, you'll want to prepare it for MeiliSearch users (mostly humans). This usually entails updating documentation and announcing your feature.
### Documentation
Documentation is very important to MeiliSearch. All contributions that
alter user-facing behavior MUST include documentation changes. Please see
[GitHub.com/meilisearch/documentation](https://github.com/meilisearch/documentation) for more info.
### Changelog
Until we have guidelines in place, updating the [`Changelog`](/CHANGELOG.md) is solely the responsibility of MeiliSearch team members.
Thank you again for reading this through, we can not wait to begin to work with you if you made your way through this contributing guide ❤️

1818
Cargo.lock generated

File diff suppressed because it is too large


@@ -2,7 +2,8 @@
members = [
"meilisearch-http",
"meilisearch-error",
"meilisearch-lib",
"meilisearch-auth",
]
[profile.release]
debug = true
resolver = "2"

7
Cross.toml Normal file

@@ -0,0 +1,7 @@
[build.env]
passthrough = [
"RUST_BACKTRACE",
"CARGO_TERM_COLOR",
"RUSTFLAGS",
"JEMALLOC_SYS_WITH_LG_PAGE"
]


@@ -1,9 +1,8 @@
# Compile
FROM alpine:3.14 AS compiler
RUN apk update --quiet
RUN apk add curl
RUN apk add build-base
RUN apk update --quiet \
&& apk add -q --no-cache curl build-base
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
@@ -12,8 +11,10 @@ WORKDIR /meilisearch
COPY Cargo.lock .
COPY Cargo.toml .
COPY meilisearch-auth/Cargo.toml meilisearch-auth/
COPY meilisearch-error/Cargo.toml meilisearch-error/
COPY meilisearch-http/Cargo.toml meilisearch-http/
COPY meilisearch-lib/Cargo.toml meilisearch-lib/
ENV RUSTFLAGS="-C target-feature=-crt-static"
@@ -34,11 +35,14 @@ RUN $HOME/.cargo/bin/cargo build --release
# Run
FROM alpine:3.14
RUN apk add -q --no-cache libgcc tini
ENV MEILI_HTTP_ADDR 0.0.0.0:7700
ENV MEILI_SERVER_PROVIDER docker
RUN apk update --quiet \
&& apk add -q --no-cache libgcc tini curl
COPY --from=compiler /meilisearch/target/release/meilisearch .
ENV MEILI_HTTP_ADDR 0.0.0.0:7700
EXPOSE 7700/tcp
ENTRYPOINT ["tini", "--"]


@@ -61,6 +61,10 @@ meilisearch
docker run -p 7700:7700 -v "$(pwd)/data.ms:/data.ms" getmeili/meilisearch
```
#### Announcing a cloud-hosted MeiliSearch
Join the closed beta by filling out this [form](https://meilisearch.typeform.com/to/FtnzvZfh).
#### Try MeiliSearch in our Sandbox
Create a MeiliSearch instance in [MeiliSearch Sandbox](https://sandbox.meilisearch.com/). This instance is free, and will be active for 48 hours.
@@ -183,7 +187,7 @@ Search and indexation are the domain of our core engine, [`milli`](https://githu
MeiliSearch collects anonymous data regarding general usage.
This helps us better understand developers' usage of MeiliSearch features.
To see what information we're retrieving, please see the complete list [on the dedicated issue](https://github.com/meilisearch/MeiliSearch/issues/720).
To find out more on what information we're retrieving, please see our documentation on [Telemetry](https://docs.meilisearch.com/learn/what_is_meilisearch/telemetry.html).
This program is optional, you can disable these analytics by using the `MEILI_NO_ANALYTICS` env variable.

33
SECURITY.md Normal file

@@ -0,0 +1,33 @@
# Security
MeiliSearch takes the security of our software products and services seriously.
If you believe you have found a security vulnerability in any MeiliSearch-owned repository, please report it to us as described below.
## Supported versions
As long as we are pre-v1.0, only the latest version of MeiliSearch will be supported with security updates.
## Reporting security issues
⚠️ Please do not report security vulnerabilities through public GitHub issues. ⚠️
Instead, please kindly email us at security@meilisearch.com
Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:
- Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
- Full paths of source file(s) related to the manifestation of the issue
- The location of the affected source code (tag/branch/commit or direct URL)
- Any special configuration required to reproduce the issue
- Step-by-step instructions to reproduce the issue
- Proof-of-concept or exploit code (if possible)
- Impact of the issue, including how an attacker might exploit the issue
This information will help us triage your report more quickly.
You will receive a response from us within 72 hours. If the issue is confirmed, we will release a patch as soon as possible depending on complexity.
## Preferred languages
We prefer all communications to be in English.


@@ -5,5 +5,6 @@ status = [
'Run Clippy',
'Run Rustfmt'
]
pr_status = ['Milestone Check']
# 3 hours timeout
timeout-sec = 10800


@@ -1 +0,0 @@
_datas in movies.json are from https://www.themoviedb.org/_

File diff suppressed because it is too large


@@ -51,13 +51,13 @@ semverLT() {
if [ $MAJOR_A -le $MAJOR_B ] && [ $MINOR_A -le $MINOR_B ] && [ $PATCH_A -lt $PATCH_B ]; then
return 0
fi
if [ "_$SPECIAL_A" == "_" ] && [ "_$SPECIAL_B" == "_" ] ; then
if [ "_$SPECIAL_A" == '_' ] && [ "_$SPECIAL_B" == '_' ] ; then
return 1
fi
if [ "_$SPECIAL_A" == "_" ] && [ "_$SPECIAL_B" != "_" ] ; then
if [ "_$SPECIAL_A" == '_' ] && [ "_$SPECIAL_B" != '_' ] ; then
return 1
fi
if [ "_$SPECIAL_A" != "_" ] && [ "_$SPECIAL_B" == "_" ] ; then
if [ "_$SPECIAL_A" != '_' ] && [ "_$SPECIAL_B" == '_' ] ; then
return 0
fi
if [ "_$SPECIAL_A" < "_$SPECIAL_B" ]; then
@@ -67,39 +67,47 @@ semverLT() {
return 1
}
# Get a token from https://github.com/settings/tokens to increase rate limit (from 60 to 5000), make sure the token scope is set to 'public_repo'
# Create GITHUB_PAT environment variable once you acquired the token to start using it
# Returns the tag of the latest stable release (in terms of semver and not of release date)
get_latest() {
temp_file='temp_file' # temp_file needed because the grep would start before the download is over
curl -s 'https://api.github.com/repos/meilisearch/MeiliSearch/releases' > "$temp_file" || return 1
if [ -z "$GITHUB_PAT" ]; then
curl -s 'https://api.github.com/repos/meilisearch/MeiliSearch/releases' > "$temp_file" || return 1
else
curl -H "Authorization: token $GITHUB_PAT" -s 'https://api.github.com/repos/meilisearch/MeiliSearch/releases' > "$temp_file" || return 1
fi
releases=$(cat "$temp_file" | \
grep -E "tag_name|draft|prerelease" \
grep -E '"tag_name":|"draft":|"prerelease":' \
| tr -d ',"' | cut -d ':' -f2 | tr -d ' ')
# Returns a list of [tag_name draft_boolean prerelease_boolean ...]
# Ex: v0.10.1 false false v0.9.1-rc.1 false true v0.9.0 false false...
i=0
latest=""
current_tag=""
latest=''
current_tag=''
for release_info in $releases; do
if [ $i -eq 0 ]; then # Checking tag_name
if echo "$release_info" | grep -q "$GREP_SEMVER_REGEXP"; then # If it's not an alpha or beta release
current_tag=$release_info
else
current_tag=""
current_tag=''
fi
i=1
elif [ $i -eq 1 ]; then # Checking draft boolean
if [ "$release_info" = "true" ]; then
current_tag=""
if [ "$release_info" = 'true' ]; then
current_tag=''
fi
i=2
elif [ $i -eq 2 ]; then # Checking prerelease boolean
if [ "$release_info" = "true" ]; then
current_tag=""
if [ "$release_info" = 'true' ]; then
current_tag=''
fi
i=0
if [ "$current_tag" != "" ]; then # If the current_tag is valid
if [ "$latest" = "" ]; then # If there is no latest yet
if [ "$current_tag" != '' ]; then # If the current_tag is valid
if [ "$latest" = '' ]; then # If there is no latest yet
latest="$current_tag"
else
semverLT $current_tag $latest # Comparing latest and the current tag
@@ -140,7 +148,7 @@ get_os() {
get_archi() {
architecture=$(uname -m)
case "$architecture" in
'x86_64' | 'amd64')
'x86_64' | 'amd64' | 'arm64')
archi='amd64'
;;
'aarch64')
@@ -153,7 +161,7 @@ get_archi() {
}
success_usage() {
printf "$GREEN%s\n$DEFAULT" "MeiliSearch binary successfully downloaded as '$BINARY_NAME' file."
printf "$GREEN%s\n$DEFAULT" "MeiliSearch $latest binary successfully downloaded as '$binary_name' file."
echo ''
echo 'Run it:'
echo ' $ ./meilisearch'
@@ -171,6 +179,13 @@ failure_usage() {
# MAIN
latest="$(get_latest)"
if [ "$latest" = '' ]; then
echo ''
echo 'Impossible to get the latest stable version of MeiliSearch.'
echo 'Please let us know about this issue: https://github.com/meilisearch/MeiliSearch/issues/new/choose'
exit 1
fi
if ! get_os; then
failure_usage
exit 1
@@ -185,16 +200,16 @@ echo "Downloading MeiliSearch binary $latest for $os, architecture $archi..."
case "$os" in
'windows')
release_file="meilisearch-$os-$archi.exe"
BINARY_NAME='meilisearch.exe'
binary_name='meilisearch.exe'
;;
*)
*)
release_file="meilisearch-$os-$archi"
BINARY_NAME='meilisearch'
binary_name='meilisearch'
esac
link="https://github.com/meilisearch/MeiliSearch/releases/download/$latest/$release_file"
curl -OL "$link"
mv "$release_file" "$BINARY_NAME"
chmod 744 "$BINARY_NAME"
mv "$release_file" "$binary_name"
chmod 744 "$binary_name"
success_usage


@@ -0,0 +1,15 @@
[package]
name = "meilisearch-auth"
version = "0.25.0"
edition = "2018"
[dependencies]
enum-iterator = "0.7.0"
heed = { git = "https://github.com/Kerollmops/heed", tag = "v0.12.1" }
sha2 = "0.9.6"
chrono = { version = "0.4.19", features = ["serde"] }
meilisearch-error = { path = "../meilisearch-error" }
serde_json = { version = "1.0.67", features = ["preserve_order"] }
rand = "0.8.4"
serde = { version = "1.0.130", features = ["derive"] }
thiserror = "1.0.28"


@@ -0,0 +1,104 @@
use enum_iterator::IntoEnumIterator;
use serde::{Deserialize, Serialize};
#[derive(IntoEnumIterator, Copy, Clone, Serialize, Deserialize, Debug, Eq, PartialEq)]
#[repr(u8)]
pub enum Action {
#[serde(rename = "*")]
All = 0,
#[serde(rename = "search")]
Search = actions::SEARCH,
#[serde(rename = "documents.add")]
DocumentsAdd = actions::DOCUMENTS_ADD,
#[serde(rename = "documents.get")]
DocumentsGet = actions::DOCUMENTS_GET,
#[serde(rename = "documents.delete")]
DocumentsDelete = actions::DOCUMENTS_DELETE,
#[serde(rename = "indexes.create")]
IndexesAdd = actions::INDEXES_CREATE,
#[serde(rename = "indexes.get")]
IndexesGet = actions::INDEXES_GET,
#[serde(rename = "indexes.update")]
IndexesUpdate = actions::INDEXES_UPDATE,
#[serde(rename = "indexes.delete")]
IndexesDelete = actions::INDEXES_DELETE,
#[serde(rename = "tasks.get")]
TasksGet = actions::TASKS_GET,
#[serde(rename = "settings.get")]
SettingsGet = actions::SETTINGS_GET,
#[serde(rename = "settings.update")]
SettingsUpdate = actions::SETTINGS_UPDATE,
#[serde(rename = "stats.get")]
StatsGet = actions::STATS_GET,
#[serde(rename = "dumps.create")]
DumpsCreate = actions::DUMPS_CREATE,
#[serde(rename = "dumps.get")]
DumpsGet = actions::DUMPS_GET,
#[serde(rename = "version")]
Version = actions::VERSION,
}
impl Action {
pub fn from_repr(repr: u8) -> Option<Self> {
use actions::*;
match repr {
0 => Some(Self::All),
SEARCH => Some(Self::Search),
DOCUMENTS_ADD => Some(Self::DocumentsAdd),
DOCUMENTS_GET => Some(Self::DocumentsGet),
DOCUMENTS_DELETE => Some(Self::DocumentsDelete),
INDEXES_CREATE => Some(Self::IndexesAdd),
INDEXES_GET => Some(Self::IndexesGet),
INDEXES_UPDATE => Some(Self::IndexesUpdate),
INDEXES_DELETE => Some(Self::IndexesDelete),
TASKS_GET => Some(Self::TasksGet),
SETTINGS_GET => Some(Self::SettingsGet),
SETTINGS_UPDATE => Some(Self::SettingsUpdate),
STATS_GET => Some(Self::StatsGet),
DUMPS_CREATE => Some(Self::DumpsCreate),
DUMPS_GET => Some(Self::DumpsGet),
VERSION => Some(Self::Version),
_otherwise => None,
}
}
pub fn repr(&self) -> u8 {
use actions::*;
match self {
Self::All => 0,
Self::Search => SEARCH,
Self::DocumentsAdd => DOCUMENTS_ADD,
Self::DocumentsGet => DOCUMENTS_GET,
Self::DocumentsDelete => DOCUMENTS_DELETE,
Self::IndexesAdd => INDEXES_CREATE,
Self::IndexesGet => INDEXES_GET,
Self::IndexesUpdate => INDEXES_UPDATE,
Self::IndexesDelete => INDEXES_DELETE,
Self::TasksGet => TASKS_GET,
Self::SettingsGet => SETTINGS_GET,
Self::SettingsUpdate => SETTINGS_UPDATE,
Self::StatsGet => STATS_GET,
Self::DumpsCreate => DUMPS_CREATE,
Self::DumpsGet => DUMPS_GET,
Self::Version => VERSION,
}
}
}
pub mod actions {
pub const SEARCH: u8 = 1;
pub const DOCUMENTS_ADD: u8 = 2;
pub const DOCUMENTS_GET: u8 = 3;
pub const DOCUMENTS_DELETE: u8 = 4;
pub const INDEXES_CREATE: u8 = 5;
pub const INDEXES_GET: u8 = 6;
pub const INDEXES_UPDATE: u8 = 7;
pub const INDEXES_DELETE: u8 = 8;
pub const TASKS_GET: u8 = 9;
pub const SETTINGS_GET: u8 = 10;
pub const SETTINGS_UPDATE: u8 = 11;
pub const STATS_GET: u8 = 12;
pub const DUMPS_CREATE: u8 = 13;
pub const DUMPS_GET: u8 = 14;
pub const VERSION: u8 = 15;
}


@@ -0,0 +1,44 @@
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::io::Write;
use std::path::Path;
use crate::{AuthController, HeedAuthStore, Result};
const KEYS_PATH: &str = "keys";
impl AuthController {
pub fn dump(src: impl AsRef<Path>, dst: impl AsRef<Path>) -> Result<()> {
let store = HeedAuthStore::new(&src)?;
let keys_file_path = dst.as_ref().join(KEYS_PATH);
let keys = store.list_api_keys()?;
let mut keys_file = File::create(&keys_file_path)?;
for key in keys {
serde_json::to_writer(&mut keys_file, &key)?;
keys_file.write_all(b"\n")?;
}
Ok(())
}
pub fn load_dump(src: impl AsRef<Path>, dst: impl AsRef<Path>) -> Result<()> {
let store = HeedAuthStore::new(&dst)?;
let keys_file_path = src.as_ref().join(KEYS_PATH);
if !keys_file_path.exists() {
return Ok(());
}
let mut reader = BufReader::new(File::open(&keys_file_path)?).lines();
while let Some(key) = reader.next().transpose()? {
let key = serde_json::from_str(&key)?;
store.put_api_key(key)?;
}
Ok(())
}
}


@@ -0,0 +1,46 @@
use std::error::Error;
use meilisearch_error::ErrorCode;
use meilisearch_error::{internal_error, Code};
use serde_json::Value;
pub type Result<T> = std::result::Result<T, AuthControllerError>;
#[derive(Debug, thiserror::Error)]
pub enum AuthControllerError {
#[error("`{0}` field is mandatory.")]
MissingParameter(&'static str),
#[error("actions field value `{0}` is invalid. It should be an array of string representing action names.")]
InvalidApiKeyActions(Value),
#[error("indexes field value `{0}` is invalid. It should be an array of string representing index names.")]
InvalidApiKeyIndexes(Value),
#[error("expiresAt field value `{0}` is invalid. It should be in ISO-8601 format to represents a date or datetime in the future or specified as a null value. e.g. 'YYYY-MM-DD' or 'YYYY-MM-DDTHH:MM:SS'.")]
InvalidApiKeyExpiresAt(Value),
#[error("description field value `{0}` is invalid. It should be a string or specified as a null value.")]
InvalidApiKeyDescription(Value),
#[error("API key `{0}` not found.")]
ApiKeyNotFound(String),
#[error("Internal error: {0}")]
Internal(Box<dyn Error + Send + Sync + 'static>),
}
internal_error!(
AuthControllerError: heed::Error,
std::io::Error,
serde_json::Error,
std::str::Utf8Error
);
impl ErrorCode for AuthControllerError {
fn error_code(&self) -> Code {
match self {
Self::MissingParameter(_) => Code::MissingParameter,
Self::InvalidApiKeyActions(_) => Code::InvalidApiKeyActions,
Self::InvalidApiKeyIndexes(_) => Code::InvalidApiKeyIndexes,
Self::InvalidApiKeyExpiresAt(_) => Code::InvalidApiKeyExpiresAt,
Self::InvalidApiKeyDescription(_) => Code::InvalidApiKeyDescription,
Self::ApiKeyNotFound(_) => Code::ApiKeyNotFound,
Self::Internal(_) => Code::Internal,
}
}
}

161
meilisearch-auth/src/key.rs Normal file

@@ -0,0 +1,161 @@
use crate::action::Action;
use crate::error::{AuthControllerError, Result};
use crate::store::{KeyId, KEY_ID_LENGTH};
use chrono::{DateTime, NaiveDate, NaiveDateTime, Utc};
use rand::Rng;
use serde::{Deserialize, Serialize};
use serde_json::{from_value, Value};
#[derive(Debug, Deserialize, Serialize)]
pub struct Key {
#[serde(skip_serializing_if = "Option::is_none")]
pub description: Option<String>,
pub id: KeyId,
pub actions: Vec<Action>,
pub indexes: Vec<String>,
pub expires_at: Option<DateTime<Utc>>,
pub created_at: DateTime<Utc>,
pub updated_at: DateTime<Utc>,
}
impl Key {
pub fn create_from_value(value: Value) -> Result<Self> {
let description = value
.get("description")
.map(|des| {
from_value(des.clone())
.map_err(|_| AuthControllerError::InvalidApiKeyDescription(des.clone()))
})
.transpose()?;
let id = generate_id();
let actions = value
.get("actions")
.map(|act| {
from_value(act.clone())
.map_err(|_| AuthControllerError::InvalidApiKeyActions(act.clone()))
})
.ok_or(AuthControllerError::MissingParameter("actions"))??;
let indexes = value
.get("indexes")
.map(|ind| {
from_value(ind.clone())
.map_err(|_| AuthControllerError::InvalidApiKeyIndexes(ind.clone()))
})
.ok_or(AuthControllerError::MissingParameter("indexes"))??;
let expires_at = value
.get("expiresAt")
.map(parse_expiration_date)
.ok_or(AuthControllerError::MissingParameter("expiresAt"))??;
let created_at = Utc::now();
let updated_at = Utc::now();
Ok(Self {
description,
id,
actions,
indexes,
expires_at,
created_at,
updated_at,
})
}
pub fn update_from_value(&mut self, value: Value) -> Result<()> {
if let Some(des) = value.get("description") {
let des = from_value(des.clone())
.map_err(|_| AuthControllerError::InvalidApiKeyDescription(des.clone()));
self.description = des?;
}
if let Some(act) = value.get("actions") {
let act = from_value(act.clone())
.map_err(|_| AuthControllerError::InvalidApiKeyActions(act.clone()));
self.actions = act?;
}
if let Some(ind) = value.get("indexes") {
let ind = from_value(ind.clone())
.map_err(|_| AuthControllerError::InvalidApiKeyIndexes(ind.clone()));
self.indexes = ind?;
}
if let Some(exp) = value.get("expiresAt") {
self.expires_at = parse_expiration_date(exp)?;
}
self.updated_at = Utc::now();
Ok(())
}
pub(crate) fn default_admin() -> Self {
Self {
description: Some("Default Admin API Key (Use it for all other operations. Caution! Do not use it on a public frontend)".to_string()),
id: generate_id(),
actions: vec![Action::All],
indexes: vec!["*".to_string()],
expires_at: None,
created_at: Utc::now(),
updated_at: Utc::now(),
}
}
pub(crate) fn default_search() -> Self {
Self {
description: Some(
"Default Search API Key (Use it to search from the frontend)".to_string(),
),
id: generate_id(),
actions: vec![Action::Search],
indexes: vec!["*".to_string()],
expires_at: None,
created_at: Utc::now(),
updated_at: Utc::now(),
}
}
}
/// Generate a printable key id of `KEY_ID_LENGTH` characters using thread_rng.
fn generate_id() -> [u8; KEY_ID_LENGTH] {
const CHARSET: &[u8] = b"abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
let mut rng = rand::thread_rng();
let mut bytes = [0; KEY_ID_LENGTH];
for byte in bytes.iter_mut() {
*byte = CHARSET[rng.gen_range(0..CHARSET.len())];
}
bytes
}
fn parse_expiration_date(value: &Value) -> Result<Option<DateTime<Utc>>> {
match value {
Value::String(string) => DateTime::parse_from_rfc3339(string)
.map(|d| d.into())
.or_else(|_| {
NaiveDateTime::parse_from_str(string, "%Y-%m-%dT%H:%M:%S")
.map(|naive| DateTime::from_utc(naive, Utc))
})
.or_else(|_| {
NaiveDate::parse_from_str(string, "%Y-%m-%d")
.map(|naive| DateTime::from_utc(naive.and_hms(0, 0, 0), Utc))
})
.map_err(|_| AuthControllerError::InvalidApiKeyExpiresAt(value.clone()))
// check if the key is already expired.
.and_then(|d| {
if d > Utc::now() {
Ok(d)
} else {
Err(AuthControllerError::InvalidApiKeyExpiresAt(value.clone()))
}
})
.map(Option::Some),
Value::Null => Ok(None),
_otherwise => Err(AuthControllerError::InvalidApiKeyExpiresAt(value.clone())),
}
}
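
For illustration, here is a hedged sketch of the JSON payloads `Key::create_from_value` receives and of the three `expiresAt` formats `parse_expiration_date` accepts (plus `null`). The action and index names and the dates are invented for the example, and the dates must lie in the future to pass the expiry check above:

```rust
// Invented key-creation payloads showing the accepted `expiresAt` formats.
use serde_json::json;

fn main() {
    let payloads = vec![
        json!({ "actions": ["search"], "indexes": ["movies"],
                "expiresAt": "2030-01-01T00:00:00Z" }), // RFC 3339
        json!({ "actions": ["search"], "indexes": ["*"],
                "expiresAt": "2030-01-01T00:00:00" }),  // naive datetime, treated as UTC
        json!({ "actions": ["search"], "indexes": ["*"],
                "expiresAt": "2030-01-01" }),           // plain date, midnight UTC
        json!({ "actions": ["search"], "indexes": ["*"],
                "expiresAt": null }),                   // never expires
    ];
    for payload in payloads {
        // In meilisearch-auth each of these would be passed to
        // `Key::create_from_value(payload)`; here we only print them.
        println!("{}", payload);
    }
}
```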

meilisearch-auth/src/lib.rs Normal file

@@ -0,0 +1,151 @@
mod action;
mod dump;
pub mod error;
mod key;
mod store;
use std::path::Path;
use std::str::from_utf8;
use std::sync::Arc;
use chrono::Utc;
use serde_json::Value;
use sha2::{Digest, Sha256};
pub use action::{actions, Action};
use error::{AuthControllerError, Result};
pub use key::Key;
use store::HeedAuthStore;
#[derive(Clone)]
pub struct AuthController {
store: Arc<HeedAuthStore>,
master_key: Option<String>,
}
impl AuthController {
pub fn new(db_path: impl AsRef<Path>, master_key: &Option<String>) -> Result<Self> {
let store = HeedAuthStore::new(db_path)?;
if store.is_empty()? {
generate_default_keys(&store)?;
}
Ok(Self {
store: Arc::new(store),
master_key: master_key.clone(),
})
}
pub async fn create_key(&self, value: Value) -> Result<Key> {
let key = Key::create_from_value(value)?;
self.store.put_api_key(key)
}
pub async fn update_key(&self, key: impl AsRef<str>, value: Value) -> Result<Key> {
let mut key = self.get_key(key).await?;
key.update_from_value(value)?;
self.store.put_api_key(key)
}
pub async fn get_key(&self, key: impl AsRef<str>) -> Result<Key> {
self.store
.get_api_key(&key)?
.ok_or_else(|| AuthControllerError::ApiKeyNotFound(key.as_ref().to_string()))
}
pub fn get_key_filters(&self, key: impl AsRef<str>) -> Result<AuthFilter> {
let mut filters = AuthFilter::default();
if self
.master_key
.as_ref()
.map_or(false, |master_key| master_key != key.as_ref())
{
let key = self
.store
.get_api_key(&key)?
.ok_or_else(|| AuthControllerError::ApiKeyNotFound(key.as_ref().to_string()))?;
if !key.indexes.iter().any(|i| i.as_str() == "*") {
filters.indexes = Some(key.indexes);
}
filters.allow_index_creation = key
.actions
.iter()
.any(|&action| action == Action::IndexesAdd || action == Action::All);
}
Ok(filters)
}
pub async fn list_keys(&self) -> Result<Vec<Key>> {
self.store.list_api_keys()
}
pub async fn delete_key(&self, key: impl AsRef<str>) -> Result<()> {
if self.store.delete_api_key(&key)? {
Ok(())
} else {
Err(AuthControllerError::ApiKeyNotFound(
key.as_ref().to_string(),
))
}
}
pub fn get_master_key(&self) -> Option<&String> {
self.master_key.as_ref()
}
pub fn authenticate(&self, token: &[u8], action: Action, index: Option<&[u8]>) -> Result<bool> {
if let Some(master_key) = &self.master_key {
if let Some((id, exp)) = self
.store
// check if the key has access to all indexes.
.get_expiration_date(token, action, None)?
.or(match index {
// else check if the key has access to the requested index.
Some(index) => self.store.get_expiration_date(token, action, Some(index))?,
// or to any index if no index has been requested.
None => self.store.prefix_first_expiration_date(token, action)?,
})
{
let id = from_utf8(&id)?;
if exp.map_or(true, |exp| Utc::now() < exp)
&& generate_key(master_key.as_bytes(), id).as_bytes() == token
{
return Ok(true);
}
}
}
Ok(false)
}
}
pub struct AuthFilter {
pub indexes: Option<Vec<String>>,
pub allow_index_creation: bool,
}
impl Default for AuthFilter {
fn default() -> Self {
Self {
indexes: None,
allow_index_creation: true,
}
}
}
pub fn generate_key(master_key: &[u8], uid: &str) -> String {
let key = [uid.as_bytes(), master_key].concat();
let sha = Sha256::digest(&key);
format!("{}{:x}", uid, sha)
}
fn generate_default_keys(store: &HeedAuthStore) -> Result<()> {
store.put_api_key(Key::default_admin())?;
store.put_api_key(Key::default_search())?;
Ok(())
}
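
To make the relationship between `generate_key` and `authenticate` concrete: the token handed to clients is the 8-character key id followed by the hex SHA-256 of `id ++ master_key`, so `authenticate` can re-derive it from the stored id. A minimal sketch with invented values, assuming the `sha2` crate:

```rust
use sha2::{Digest, Sha256};

// Same construction as `generate_key` above: key = uid ++ hex(sha256(uid ++ master_key)).
fn generate_key(master_key: &[u8], uid: &str) -> String {
    let key = [uid.as_bytes(), master_key].concat();
    let sha = Sha256::digest(&key);
    format!("{}{:x}", uid, sha)
}

fn main() {
    // Both values are invented for the example.
    let master_key = b"MASTER_KEY";
    let uid = "a1b2c3d4"; // 8 printable characters, like `generate_id` produces

    let api_key = generate_key(master_key, uid);
    println!("derived key: {}", api_key);

    // `authenticate` splits the first 8 bytes back off as the id, looks that id
    // up in the store, then checks that re-deriving the key yields the token.
    let (id, _rest) = api_key.split_at(8);
    assert_eq!(generate_key(master_key, id), api_key);
}
```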


@@ -0,0 +1,236 @@
use enum_iterator::IntoEnumIterator;
use std::borrow::Cow;
use std::cmp::Reverse;
use std::convert::TryFrom;
use std::convert::TryInto;
use std::fs::create_dir_all;
use std::path::Path;
use std::str;
use chrono::{DateTime, Utc};
use heed::types::{ByteSlice, DecodeIgnore, SerdeJson};
use heed::{Database, Env, EnvOpenOptions, RwTxn};
use super::error::Result;
use super::{Action, Key};
const AUTH_STORE_SIZE: usize = 1_073_741_824; //1GiB
pub const KEY_ID_LENGTH: usize = 8;
const AUTH_DB_PATH: &str = "auth";
const KEY_DB_NAME: &str = "api-keys";
const KEY_ID_ACTION_INDEX_EXPIRATION_DB_NAME: &str = "keyid-action-index-expiration";
pub type KeyId = [u8; KEY_ID_LENGTH];
#[derive(Clone)]
pub struct HeedAuthStore {
env: Env,
keys: Database<ByteSlice, SerdeJson<Key>>,
action_keyid_index_expiration: Database<KeyIdActionCodec, SerdeJson<Option<DateTime<Utc>>>>,
}
impl HeedAuthStore {
pub fn new(path: impl AsRef<Path>) -> Result<Self> {
let path = path.as_ref().join(AUTH_DB_PATH);
create_dir_all(&path)?;
let mut options = EnvOpenOptions::new();
options.map_size(AUTH_STORE_SIZE); // 1 GiB
options.max_dbs(2);
let env = options.open(path)?;
let keys = env.create_database(Some(KEY_DB_NAME))?;
let action_keyid_index_expiration =
env.create_database(Some(KEY_ID_ACTION_INDEX_EXPIRATION_DB_NAME))?;
Ok(Self {
env,
keys,
action_keyid_index_expiration,
})
}
pub fn is_empty(&self) -> Result<bool> {
let rtxn = self.env.read_txn()?;
Ok(self.keys.len(&rtxn)? == 0)
}
pub fn put_api_key(&self, key: Key) -> Result<Key> {
let mut wtxn = self.env.write_txn()?;
self.keys.put(&mut wtxn, &key.id, &key)?;
let id = key.id;
// delete key from inverted database before refilling it.
self.delete_key_from_inverted_db(&mut wtxn, &id)?;
// create inverted database.
let db = self.action_keyid_index_expiration;
let actions = if key.actions.contains(&Action::All) {
// if key.actions contains All, we iterate over all actions.
Action::into_enum_iter().collect()
} else {
key.actions.clone()
};
let no_index_restriction = key.indexes.contains(&"*".to_owned());
for action in actions {
if no_index_restriction {
// If there is no index restriction we put None.
db.put(&mut wtxn, &(&id, &action, None), &key.expires_at)?;
} else {
// else we create a key for each index.
for index in key.indexes.iter() {
db.put(
&mut wtxn,
&(&id, &action, Some(index.as_bytes())),
&key.expires_at,
)?;
}
}
}
wtxn.commit()?;
Ok(key)
}
pub fn get_api_key(&self, key: impl AsRef<str>) -> Result<Option<Key>> {
let rtxn = self.env.read_txn()?;
match try_split_array_at::<_, KEY_ID_LENGTH>(key.as_ref().as_bytes()) {
Some((id, _)) => self.keys.get(&rtxn, id).map_err(|e| e.into()),
None => Ok(None),
}
}
pub fn delete_api_key(&self, key: impl AsRef<str>) -> Result<bool> {
let mut wtxn = self.env.write_txn()?;
let existing = match try_split_array_at(key.as_ref().as_bytes()) {
Some((id, _)) => {
let existing = self.keys.delete(&mut wtxn, id)?;
self.delete_key_from_inverted_db(&mut wtxn, id)?;
existing
}
None => false,
};
wtxn.commit()?;
Ok(existing)
}
pub fn list_api_keys(&self) -> Result<Vec<Key>> {
let mut list = Vec::new();
let rtxn = self.env.read_txn()?;
for result in self.keys.remap_key_type::<DecodeIgnore>().iter(&rtxn)? {
let (_, content) = result?;
list.push(content);
}
list.sort_unstable_by_key(|k| Reverse(k.created_at));
Ok(list)
}
pub fn get_expiration_date(
&self,
key: &[u8],
action: Action,
index: Option<&[u8]>,
) -> Result<Option<(KeyId, Option<DateTime<Utc>>)>> {
let rtxn = self.env.read_txn()?;
match try_split_array_at::<_, KEY_ID_LENGTH>(key) {
Some((id, _)) => {
let tuple = (id, &action, index);
Ok(self
.action_keyid_index_expiration
.get(&rtxn, &tuple)?
.map(|expiration| (*id, expiration)))
}
None => Ok(None),
}
}
pub fn prefix_first_expiration_date(
&self,
key: &[u8],
action: Action,
) -> Result<Option<(KeyId, Option<DateTime<Utc>>)>> {
let rtxn = self.env.read_txn()?;
match try_split_array_at::<_, KEY_ID_LENGTH>(key) {
Some((id, _)) => {
let tuple = (id, &action, None);
Ok(self
.action_keyid_index_expiration
.prefix_iter(&rtxn, &tuple)?
.next()
.transpose()?
.map(|(_, expiration)| (*id, expiration)))
}
None => Ok(None),
}
}
fn delete_key_from_inverted_db(&self, wtxn: &mut RwTxn, key: &KeyId) -> Result<()> {
let mut iter = self
.action_keyid_index_expiration
.remap_types::<ByteSlice, DecodeIgnore>()
.prefix_iter_mut(wtxn, key)?;
while iter.next().transpose()?.is_some() {
// safety: we don't keep references from inside the LMDB database.
unsafe { iter.del_current()? };
}
Ok(())
}
}
/// Codec used to retrieve the expiration date of an action,
/// optionally on a specific index, for a given key.
pub struct KeyIdActionCodec;
impl<'a> heed::BytesDecode<'a> for KeyIdActionCodec {
type DItem = (KeyId, Action, Option<&'a [u8]>);
fn bytes_decode(bytes: &'a [u8]) -> Option<Self::DItem> {
let (key_id, action_bytes) = try_split_array_at(bytes)?;
let (action_bytes, index) = match try_split_array_at(action_bytes)? {
(action, []) => (action, None),
(action, index) => (action, Some(index)),
};
let action = Action::from_repr(u8::from_be_bytes(*action_bytes))?;
Some((*key_id, action, index))
}
}
impl<'a> heed::BytesEncode<'a> for KeyIdActionCodec {
type EItem = (&'a KeyId, &'a Action, Option<&'a [u8]>);
fn bytes_encode((key_id, action, index): &Self::EItem) -> Option<Cow<[u8]>> {
let mut bytes = Vec::new();
bytes.extend_from_slice(*key_id);
let action_bytes = u8::to_be_bytes(action.repr());
bytes.extend_from_slice(&action_bytes);
if let Some(index) = index {
bytes.extend_from_slice(index);
}
Some(Cow::Owned(bytes))
}
}
/// Divides one slice into two at an index, returns `None` if mid is out of bounds.
pub fn try_split_at<T>(slice: &[T], mid: usize) -> Option<(&[T], &[T])> {
if mid <= slice.len() {
Some(slice.split_at(mid))
} else {
None
}
}
/// Divides one slice into an array and the tail at an index,
/// returns `None` if `N` is out of bounds.
pub fn try_split_array_at<T, const N: usize>(slice: &[T]) -> Option<(&[T; N], &[T])>
where
[T; N]: for<'a> TryFrom<&'a [T]>,
{
let (head, tail) = try_split_at(slice, N)?;
let head = head.try_into().ok()?;
Some((head, tail))
}
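
For reference, a plain-slice sketch of the byte layout `KeyIdActionCodec` encodes, `key_id (8 bytes) ++ action (1 byte) ++ optional index bytes`, which is what the prefix iterations above rely on. The key id, action discriminant, and index name are invented, and no heed types are involved:

```rust
// Encode/decode round trip for the keyid-action-index key layout.
const KEY_ID_LENGTH: usize = 8;

fn encode(key_id: &[u8; KEY_ID_LENGTH], action: u8, index: Option<&[u8]>) -> Vec<u8> {
    let mut bytes = Vec::new();
    bytes.extend_from_slice(key_id);
    bytes.push(action);
    if let Some(index) = index {
        bytes.extend_from_slice(index);
    }
    bytes
}

fn decode(bytes: &[u8]) -> Option<([u8; KEY_ID_LENGTH], u8, Option<&[u8]>)> {
    if bytes.len() < KEY_ID_LENGTH + 1 {
        return None;
    }
    let (key_id, rest) = bytes.split_at(KEY_ID_LENGTH);
    let action = rest[0];
    let index = &rest[1..];
    let index = if index.is_empty() { None } else { Some(index) };
    Some((key_id.try_into().ok()?, action, index))
}

fn main() {
    let id = *b"a1b2c3d4"; // invented key id
    let action: u8 = 3;    // invented action discriminant

    let encoded = encode(&id, action, Some(b"movies"));
    assert_eq!(decode(&encoded), Some((id, action, Some(&b"movies"[..]))));

    // With no index restriction the index part is simply absent.
    let encoded = encode(&id, action, None);
    assert_eq!(decode(&encoded), Some((id, action, None)));
}
```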


@@ -1,8 +1,16 @@
[package]
name = "meilisearch-error"
version = "0.22.0"
version = "0.25.2"
authors = ["marin <postma.marin@protonmail.com>"]
edition = "2018"
[dependencies]
actix-http = "=3.0.0-beta.6"
actix-http = "=3.0.0-beta.10"
actix-web = "4.0.0-beta.9"
proptest = { version = "1.0.0", optional = true }
proptest-derive = { version = "0.3.0", optional = true }
serde = { version = "1.0.130", features = ["derive"] }
serde_json = "1.0.69"
[features]
test-traits = ["proptest", "proptest-derive"]


@@ -1,6 +1,75 @@
use std::fmt;
use actix_http::http::StatusCode;
use actix_http::{body::Body, http::StatusCode};
use actix_web::{self as aweb, HttpResponseBuilder};
use serde::{Deserialize, Serialize};
#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
#[serde(rename_all = "camelCase")]
#[cfg_attr(feature = "test-traits", derive(proptest_derive::Arbitrary))]
pub struct ResponseError {
#[serde(skip)]
#[cfg_attr(
feature = "test-traits",
proptest(strategy = "strategy::status_code_strategy()")
)]
code: StatusCode,
message: String,
#[serde(rename = "code")]
error_code: String,
#[serde(rename = "type")]
error_type: String,
#[serde(rename = "link")]
error_link: String,
}
impl ResponseError {
pub fn from_msg(message: String, code: Code) -> Self {
Self {
code: code.http(),
message,
error_code: code.err_code().error_name.to_string(),
error_type: code.type_(),
error_link: code.url(),
}
}
}
impl fmt::Display for ResponseError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
self.message.fmt(f)
}
}
impl std::error::Error for ResponseError {}
impl<T> From<T> for ResponseError
where
T: ErrorCode,
{
fn from(other: T) -> Self {
Self {
code: other.http_status(),
message: other.to_string(),
error_code: other.error_name(),
error_type: other.error_type(),
error_link: other.error_url(),
}
}
}
impl aweb::error::ResponseError for ResponseError {
fn error_response(&self) -> aweb::HttpResponse<Body> {
let json = serde_json::to_vec(self).unwrap();
HttpResponseBuilder::new(self.status_code())
.content_type("application/json")
.body(json)
}
fn status_code(&self) -> StatusCode {
self.code
}
}
pub trait ErrorCode: std::error::Error {
fn error_code(&self) -> Code;
@@ -38,20 +107,20 @@ impl fmt::Display for ErrorType {
use ErrorType::*;
match self {
InternalError => write!(f, "internal_error"),
InvalidRequestError => write!(f, "invalid_request_error"),
AuthenticationError => write!(f, "authentication_error"),
InternalError => write!(f, "internal"),
InvalidRequestError => write!(f, "invalid_request"),
AuthenticationError => write!(f, "auth"),
}
}
}
#[derive(Serialize, Deserialize, Debug, Clone, Copy)]
pub enum Code {
// index related error
CreateIndex,
IndexAlreadyExists,
IndexNotFound,
InvalidIndexUid,
OpenIndex,
// invalid state error
InvalidState,
@@ -60,18 +129,24 @@ pub enum Code {
MaxFieldsLimitExceeded,
MissingDocumentId,
InvalidDocumentId,
Facet,
Filter,
Sort,
BadParameter,
BadRequest,
DatabaseSizeLimitReached,
DocumentNotFound,
Internal,
InvalidGeoField,
InvalidRankingRule,
InvalidStore,
InvalidToken,
MissingAuthorizationHeader,
NotFound,
NoSpaceLeftOnDevice,
DumpNotFound,
TaskNotFound,
PayloadTooLarge,
RetrieveDocument,
SearchDocuments,
@@ -79,6 +154,18 @@ pub enum Code {
DumpAlreadyInProgress,
DumpProcessFailed,
InvalidContentType,
MissingContentType,
MalformedPayload,
MissingPayload,
ApiKeyNotFound,
MissingParameter,
InvalidApiKeyActions,
InvalidApiKeyIndexes,
InvalidApiKeyExpiresAt,
InvalidApiKeyDescription,
}
impl Code {
@@ -89,22 +176,30 @@ impl Code {
match self {
// index related errors
// create index is thrown on internal error while creating an index.
CreateIndex => ErrCode::internal("index_creation_failed", StatusCode::BAD_REQUEST),
IndexAlreadyExists => ErrCode::invalid("index_already_exists", StatusCode::BAD_REQUEST),
CreateIndex => {
ErrCode::internal("index_creation_failed", StatusCode::INTERNAL_SERVER_ERROR)
}
IndexAlreadyExists => ErrCode::invalid("index_already_exists", StatusCode::CONFLICT),
// thrown when requesting an index that does not exist
IndexNotFound => ErrCode::invalid("index_not_found", StatusCode::NOT_FOUND),
InvalidIndexUid => ErrCode::invalid("invalid_index_uid", StatusCode::BAD_REQUEST),
OpenIndex => {
ErrCode::internal("index_not_accessible", StatusCode::INTERNAL_SERVER_ERROR)
}
// invalid state error
InvalidState => ErrCode::internal("invalid_state", StatusCode::INTERNAL_SERVER_ERROR),
// thrown when no primary key has been set
MissingPrimaryKey => ErrCode::invalid("missing_primary_key", StatusCode::BAD_REQUEST),
MissingPrimaryKey => {
ErrCode::invalid("primary_key_inference_failed", StatusCode::BAD_REQUEST)
}
// error thrown when trying to set an already existing primary key
PrimaryKeyAlreadyPresent => {
ErrCode::invalid("primary_key_already_present", StatusCode::BAD_REQUEST)
ErrCode::invalid("index_primary_key_already_exists", StatusCode::BAD_REQUEST)
}
// invalid ranking rule
InvalidRankingRule => ErrCode::invalid("invalid_ranking_rule", StatusCode::BAD_REQUEST),
// invalid database
InvalidStore => {
ErrCode::internal("invalid_store_file", StatusCode::INTERNAL_SERVER_ERROR)
}
// invalid document
@@ -112,9 +207,8 @@ impl Code {
ErrCode::invalid("max_fields_limit_exceeded", StatusCode::BAD_REQUEST)
}
MissingDocumentId => ErrCode::invalid("missing_document_id", StatusCode::BAD_REQUEST),
InvalidDocumentId => ErrCode::invalid("invalid_document_id", StatusCode::BAD_REQUEST),
// error related to facets
Facet => ErrCode::invalid("invalid_facet", StatusCode::BAD_REQUEST),
// error related to filters
Filter => ErrCode::invalid("invalid_filter", StatusCode::BAD_REQUEST),
// error related to sorts
@@ -122,13 +216,22 @@ impl Code {
BadParameter => ErrCode::invalid("bad_parameter", StatusCode::BAD_REQUEST),
BadRequest => ErrCode::invalid("bad_request", StatusCode::BAD_REQUEST),
DatabaseSizeLimitReached => ErrCode::internal(
"database_size_limit_reached",
StatusCode::INTERNAL_SERVER_ERROR,
),
DocumentNotFound => ErrCode::invalid("document_not_found", StatusCode::NOT_FOUND),
Internal => ErrCode::internal("internal", StatusCode::INTERNAL_SERVER_ERROR),
InvalidToken => ErrCode::authentication("invalid_token", StatusCode::FORBIDDEN),
InvalidGeoField => ErrCode::invalid("invalid_geo_field", StatusCode::BAD_REQUEST),
InvalidToken => ErrCode::authentication("invalid_api_key", StatusCode::FORBIDDEN),
MissingAuthorizationHeader => {
ErrCode::authentication("missing_authorization_header", StatusCode::UNAUTHORIZED)
}
NotFound => ErrCode::invalid("not_found", StatusCode::NOT_FOUND),
TaskNotFound => ErrCode::invalid("task_not_found", StatusCode::NOT_FOUND),
DumpNotFound => ErrCode::invalid("dump_not_found", StatusCode::NOT_FOUND),
NoSpaceLeftOnDevice => {
ErrCode::internal("no_space_left_on_device", StatusCode::INTERNAL_SERVER_ERROR)
}
PayloadTooLarge => ErrCode::invalid("payload_too_large", StatusCode::PAYLOAD_TOO_LARGE),
RetrieveDocument => {
ErrCode::internal("unretrievable_document", StatusCode::BAD_REQUEST)
@@ -140,11 +243,35 @@ impl Code {
// error related to dump
DumpAlreadyInProgress => {
ErrCode::invalid("dump_already_in_progress", StatusCode::CONFLICT)
ErrCode::invalid("dump_already_processing", StatusCode::CONFLICT)
}
DumpProcessFailed => {
ErrCode::internal("dump_process_failed", StatusCode::INTERNAL_SERVER_ERROR)
}
MissingContentType => {
ErrCode::invalid("missing_content_type", StatusCode::UNSUPPORTED_MEDIA_TYPE)
}
MalformedPayload => ErrCode::invalid("malformed_payload", StatusCode::BAD_REQUEST),
InvalidContentType => {
ErrCode::invalid("invalid_content_type", StatusCode::UNSUPPORTED_MEDIA_TYPE)
}
MissingPayload => ErrCode::invalid("missing_payload", StatusCode::BAD_REQUEST),
// error related to keys
ApiKeyNotFound => ErrCode::invalid("api_key_not_found", StatusCode::NOT_FOUND),
MissingParameter => ErrCode::invalid("missing_parameter", StatusCode::BAD_REQUEST),
InvalidApiKeyActions => {
ErrCode::invalid("invalid_api_key_actions", StatusCode::BAD_REQUEST)
}
InvalidApiKeyIndexes => {
ErrCode::invalid("invalid_api_key_indexes", StatusCode::BAD_REQUEST)
}
InvalidApiKeyExpiresAt => {
ErrCode::invalid("invalid_api_key_expires_at", StatusCode::BAD_REQUEST)
}
InvalidApiKeyDescription => {
ErrCode::invalid("invalid_api_key_description", StatusCode::BAD_REQUEST)
}
}
}
@@ -201,3 +328,27 @@ impl ErrCode {
}
}
}
#[cfg(feature = "test-traits")]
mod strategy {
use proptest::strategy::Strategy;
use super::*;
pub(super) fn status_code_strategy() -> impl Strategy<Value = StatusCode> {
(100..999u16).prop_map(|i| StatusCode::from_u16(i).unwrap())
}
}
#[macro_export]
macro_rules! internal_error {
($target:ty : $($other:path), *) => {
$(
impl From<$other> for $target {
fn from(other: $other) -> Self {
Self::Internal(Box::new(other))
}
}
)*
}
}
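
A self-contained sketch of what the `internal_error!` macro above buys its callers (the `meilisearch-auth` crate invokes it the same way); `DemoControllerError` and the file path are invented for the example:

```rust
// Stand-in error type wired up with the same macro definition as above.
use std::error::Error;

#[derive(Debug)]
enum DemoControllerError {
    Internal(Box<dyn Error + Send + Sync + 'static>),
}

macro_rules! internal_error {
    ($target:ty : $($other:path), *) => {
        $(
            impl From<$other> for $target {
                fn from(other: $other) -> Self {
                    Self::Internal(Box::new(other))
                }
            }
        )*
    }
}

internal_error!(DemoControllerError: std::io::Error, std::str::Utf8Error);

fn read_file(path: &str) -> Result<Vec<u8>, DemoControllerError> {
    // `?` works because `From<std::io::Error>` was generated by the macro.
    Ok(std::fs::read(path)?)
}

fn main() {
    match read_file("/definitely/not/a/real/path") {
        Ok(_) => println!("read ok"),
        Err(DemoControllerError::Internal(e)) => println!("internal error: {}", e),
    }
}
```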


@@ -4,88 +4,86 @@ description = "MeiliSearch HTTP server"
edition = "2018"
license = "MIT"
name = "meilisearch-http"
version = "0.22.0"
version = "0.25.2"
[[bin]]
name = "meilisearch"
path = "src/main.rs"
[build-dependencies]
actix-web-static-files = { git = "https://github.com/MarinPostma/actix-web-static-files.git", rev = "6db8c3e", optional = true }
anyhow = { version = "*", optional = true }
cargo_toml = { version = "0.9.0", optional = true }
actix-web-static-files = { git = "https://github.com/MarinPostma/actix-web-static-files.git", rev = "39d8006", optional = true }
anyhow = { version = "1.0.43", optional = true }
cargo_toml = { version = "0.9", optional = true }
hex = { version = "0.4.3", optional = true }
reqwest = { version = "0.11.3", features = ["blocking", "rustls-tls"], default-features = false, optional = true }
sha-1 = { version = "0.9.4", optional = true }
tempfile = { version = "3.1.0", optional = true }
reqwest = { version = "0.11.4", features = ["blocking", "rustls-tls"], default-features = false, optional = true }
sha-1 = { version = "0.9.8", optional = true }
tempfile = { version = "3.2.0", optional = true }
vergen = { version = "5.1.15", default-features = false, features = ["git"] }
zip = { version = "0.5.12", optional = true }
zip = { version = "0.5.13", optional = true }
[dependencies]
actix-cors = { git = "https://github.com/MarinPostma/actix-extras.git", rev = "2dac1a4"}
actix-http = { version = "=3.0.0-beta.6" }
actix-service = "2.0.0"
actix-web = { version = "=4.0.0-beta.6", features = ["rustls"] }
actix-web-static-files = { git = "https://github.com/MarinPostma/actix-web-static-files.git", rev = "6db8c3e", optional = true }
anyhow = "1.0.36"
async-stream = "0.3.0"
async-trait = "0.1.42"
arc-swap = "1.2.0"
byte-unit = { version = "4.0.9", default-features = false, features = ["std"] }
bytes = "0.6.0"
actix-cors = { git = "https://github.com/MarinPostma/actix-extras.git", rev = "963ac94d" }
actix-web = { version = "4.0.0-beta.9", features = ["rustls"] }
actix-web-static-files = { git = "https://github.com/MarinPostma/actix-web-static-files.git", rev = "39d8006", optional = true }
# TODO: this dependency is pinned so semver doesn't bump to the next beta
actix-tls = "=3.0.0-beta.5"
anyhow = { version = "1.0.43", features = ["backtrace"] }
arc-swap = "1.3.2"
async-stream = "0.3.2"
async-trait = "0.1.51"
bstr = "0.2.17"
byte-unit = { version = "4.0.12", default-features = false, features = ["std"] }
bytes = "1.1.0"
chrono = { version = "0.4.19", features = ["serde"] }
crossbeam-channel = "0.5.0"
crossbeam-channel = "0.5.1"
either = "1.6.1"
env_logger = "0.8.2"
flate2 = "1.0.19"
fst = "0.4.5"
futures = "0.3.7"
futures-util = "0.3.8"
env_logger = "0.9.0"
flate2 = "1.0.21"
fst = "0.4.7"
futures = "0.3.17"
futures-util = "0.3.17"
heed = { git = "https://github.com/Kerollmops/heed", tag = "v0.12.1" }
http = "0.2.1"
indexmap = { version = "1.3.2", features = ["serde-1"] }
itertools = "0.10.0"
log = "0.4.8"
main_error = "0.1.0"
http = "0.2.4"
indexmap = { version = "1.7.0", features = ["serde-1"] }
itertools = "0.10.1"
log = "0.4.14"
meilisearch-auth = { path = "../meilisearch-auth" }
meilisearch-error = { path = "../meilisearch-error" }
meilisearch-tokenizer = { git = "https://github.com/meilisearch/tokenizer.git", tag = "v0.2.5" }
memmap = "0.7.0"
milli = { git = "https://github.com/meilisearch/milli.git", tag = "v0.13.1" }
meilisearch-lib = { path = "../meilisearch-lib" }
mime = "0.3.16"
num_cpus = "1.13.0"
once_cell = "1.5.2"
parking_lot = "0.11.1"
rand = "0.7.3"
rayon = "1.5.0"
regex = "1.4.2"
rustls = "0.19"
serde = { version = "1.0", features = ["derive"] }
serde_json = { version = "1.0.59", features = ["preserve_order"] }
sha2 = "0.9.1"
siphasher = "0.3.2"
slice-group-by = "0.2.6"
structopt = "0.3.20"
tar = "0.4.29"
tempfile = "3.1.0"
thiserror = "1.0.24"
tokio = { version = "1", features = ["full"] }
uuid = { version = "0.8.2", features = ["serde"] }
walkdir = "2.3.2"
obkv = "0.2.0"
pin-project = "1.0.7"
whoami = { version = "1.1.2", optional = true }
reqwest = { version = "0.11.3", features = ["json", "rustls-tls"], default-features = false, optional = true }
serdeval = "0.1.0"
sysinfo = "0.20.0"
once_cell = "1.8.0"
parking_lot = "0.11.2"
pin-project = "1.0.8"
platform-dirs = "0.3.0"
rand = "0.8.4"
rayon = "1.5.1"
regex = "1.5.4"
rustls = "0.19.1"
segment = { version = "0.1.2", optional = true }
serde = { version = "1.0.130", features = ["derive"] }
serde_json = { version = "1.0.67", features = ["preserve_order"] }
sha2 = "0.9.6"
siphasher = "0.3.7"
slice-group-by = "0.2.6"
structopt = "0.3.25"
sysinfo = "0.20.2"
tar = "0.4.37"
tempfile = "3.2.0"
thiserror = "1.0.28"
tokio = { version = "1.11.0", features = ["full"] }
tokio-stream = "0.1.7"
uuid = { version = "0.8.2", features = ["serde"] }
walkdir = "2.3.2"
[dev-dependencies]
actix-rt = "2.1.0"
assert-json-diff = { branch = "master", git = "https://github.com/qdequele/assert-json-diff" }
mockall = "0.9.1"
actix-rt = "2.2.0"
assert-json-diff = "2.0.1"
maplit = "1.0.2"
paste = "1.0.5"
serde_url_params = "0.2.1"
tempdir = "0.3.7"
urlencoding = "1.1.1"
urlencoding = "2.1.0"
[features]
mini-dashboard = [
@@ -98,12 +96,12 @@ mini-dashboard = [
"tempfile",
"zip",
]
analytics = ["whoami", "reqwest"]
analytics = ["segment"]
default = ["analytics", "mini-dashboard"]
[target.'cfg(target_os = "linux")'.dependencies]
jemallocator = "0.3.2"
tikv-jemallocator = "0.4.1"
[package.metadata.mini-dashboard]
assets-url = "https://github.com/meilisearch/mini-dashboard/releases/download/v0.1.4/build.zip"
sha1 = "750e8a8e56cfa61fbf9ead14b08a5f17ad3f3d37"
assets-url = "https://github.com/meilisearch/mini-dashboard/releases/download/v0.1.7/build.zip"
sha1 = "e2feedf271917c4b7b88998eff5aaaea1d3925b9"


@@ -1,126 +0,0 @@
use std::hash::{Hash, Hasher};
use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};
use log::debug;
use serde::Serialize;
use siphasher::sip::SipHasher;
use crate::Data;
use crate::Opt;
const AMPLITUDE_API_KEY: &str = "f7fba398780e06d8fe6666a9be7e3d47";
#[derive(Debug, Serialize)]
struct EventProperties {
database_size: u64,
last_update_timestamp: Option<i64>, //timestamp
number_of_documents: Vec<u64>,
}
impl EventProperties {
async fn from(data: Data) -> anyhow::Result<EventProperties> {
let stats = data.index_controller.get_all_stats().await?;
let database_size = stats.database_size;
let last_update_timestamp = stats.last_update.map(|u| u.timestamp());
let number_of_documents = stats
.indexes
.values()
.map(|index| index.number_of_documents)
.collect();
Ok(EventProperties {
database_size,
last_update_timestamp,
number_of_documents,
})
}
}
#[derive(Debug, Serialize)]
struct UserProperties<'a> {
env: &'a str,
start_since_days: u64,
user_email: Option<String>,
server_provider: Option<String>,
}
#[derive(Debug, Serialize)]
struct Event<'a> {
user_id: &'a str,
event_type: &'a str,
device_id: &'a str,
time: u64,
app_version: &'a str,
user_properties: UserProperties<'a>,
event_properties: Option<EventProperties>,
}
#[derive(Debug, Serialize)]
struct AmplitudeRequest<'a> {
api_key: &'a str,
events: Vec<Event<'a>>,
}
pub async fn analytics_sender(data: Data, opt: Opt) {
let username = whoami::username();
let hostname = whoami::hostname();
let platform = whoami::platform();
let uid = username + &hostname + &platform.to_string();
let mut hasher = SipHasher::new();
uid.hash(&mut hasher);
let hash = hasher.finish();
let uid = format!("{:X}", hash);
let platform = platform.to_string();
let first_start = Instant::now();
loop {
let n = SystemTime::now().duration_since(UNIX_EPOCH).unwrap();
let user_id = &uid;
let device_id = &platform;
let time = n.as_secs();
let event_type = "runtime_tick";
let elapsed_since_start = first_start.elapsed().as_secs() / 86_400; // One day
let event_properties = EventProperties::from(data.clone()).await.ok();
let app_version = env!("CARGO_PKG_VERSION").to_string();
let app_version = app_version.as_str();
let user_email = std::env::var("MEILI_USER_EMAIL").ok();
let server_provider = std::env::var("MEILI_SERVER_PROVIDER").ok();
let user_properties = UserProperties {
env: &opt.env,
start_since_days: elapsed_since_start,
user_email,
server_provider,
};
let event = Event {
user_id,
event_type,
device_id,
time,
app_version,
user_properties,
event_properties,
};
let request = AmplitudeRequest {
api_key: AMPLITUDE_API_KEY,
events: vec![event],
};
let response = reqwest::Client::new()
.post("https://api2.amplitude.com/2/httpapi")
.timeout(Duration::from_secs(60)) // 1 minute max
.json(&request)
.send()
.await;
if let Err(e) = response {
debug!("Unsuccessful call to Amplitude: {}", e);
}
tokio::time::sleep(Duration::from_secs(3600)).await;
}
}


@@ -0,0 +1,51 @@
use std::{any::Any, sync::Arc};
use actix_web::HttpRequest;
use serde_json::Value;
use crate::{routes::indexes::documents::UpdateDocumentsQuery, Opt};
use super::{find_user_id, Analytics};
pub struct MockAnalytics;
#[derive(Default)]
pub struct SearchAggregator {}
#[allow(dead_code)]
impl SearchAggregator {
pub fn from_query(_: &dyn Any, _: &dyn Any) -> Self {
Self::default()
}
pub fn succeed(&mut self, _: &dyn Any) {}
}
impl MockAnalytics {
#[allow(clippy::new_ret_no_self)]
pub fn new(opt: &Opt) -> (Arc<dyn Analytics>, String) {
let user = find_user_id(&opt.db_path).unwrap_or_default();
(Arc::new(Self), user)
}
}
impl Analytics for MockAnalytics {
// These methods are no-ops and should be optimized out
fn publish(&self, _event_name: String, _send: Value, _request: Option<&HttpRequest>) {}
fn get_search(&self, _aggregate: super::SearchAggregator) {}
fn post_search(&self, _aggregate: super::SearchAggregator) {}
fn add_documents(
&self,
_documents_query: &UpdateDocumentsQuery,
_index_creation: bool,
_request: &HttpRequest,
) {
}
fn update_documents(
&self,
_documents_query: &UpdateDocumentsQuery,
_index_creation: bool,
_request: &HttpRequest,
) {
}
}


@@ -0,0 +1,84 @@
mod mock_analytics;
// if we are in release mode and the analytics feature is enabled
#[cfg(all(not(debug_assertions), feature = "analytics"))]
mod segment_analytics;
use std::fs;
use std::path::{Path, PathBuf};
use actix_web::HttpRequest;
use once_cell::sync::Lazy;
use platform_dirs::AppDirs;
use serde_json::Value;
use crate::routes::indexes::documents::UpdateDocumentsQuery;
pub use mock_analytics::MockAnalytics;
// if we are in debug mode OR the analytics feature is disabled
// the `SegmentAnalytics` points to the mock instead of the real analytics
#[cfg(any(debug_assertions, not(feature = "analytics")))]
pub type SegmentAnalytics = mock_analytics::MockAnalytics;
#[cfg(any(debug_assertions, not(feature = "analytics")))]
pub type SearchAggregator = mock_analytics::SearchAggregator;
// if we are in release mode and the analytics feature is enabled
// we use the real analytics
#[cfg(all(not(debug_assertions), feature = "analytics"))]
pub type SegmentAnalytics = segment_analytics::SegmentAnalytics;
#[cfg(all(not(debug_assertions), feature = "analytics"))]
pub type SearchAggregator = segment_analytics::SearchAggregator;
/// The MeiliSearch config dir:
/// `~/.config/MeiliSearch` on *NIX or *BSD.
/// `~/Library/Application Support` on macOS.
/// `%APPDATA%` (= `C:\Users\%USERNAME%\AppData\Roaming`) on Windows.
static MEILISEARCH_CONFIG_PATH: Lazy<Option<PathBuf>> =
Lazy::new(|| AppDirs::new(Some("MeiliSearch"), false).map(|appdir| appdir.config_dir));
fn config_user_id_path(db_path: &Path) -> Option<PathBuf> {
db_path
.canonicalize()
.ok()
.map(|path| {
path.join("instance-uid")
.display()
.to_string()
.replace("/", "-")
})
.zip(MEILISEARCH_CONFIG_PATH.as_ref())
.map(|(filename, config_path)| config_path.join(filename.trim_start_matches('-')))
}
/// Look for the instance-uid in the `data.ms` or in `~/.config/MeiliSearch/path-to-db-instance-uid`
fn find_user_id(db_path: &Path) -> Option<String> {
fs::read_to_string(db_path.join("instance-uid"))
.ok()
.or_else(|| fs::read_to_string(&config_user_id_path(db_path)?).ok())
}
pub trait Analytics: Sync + Send {
/// The method used to publish most analytics that do not need to be batched every hour
fn publish(&self, event_name: String, send: Value, request: Option<&HttpRequest>);
/// This method should be called to aggregate a get search
fn get_search(&self, aggregate: SearchAggregator);
/// This method should be called to aggregate a post search
fn post_search(&self, aggregate: SearchAggregator);
// this method should be called to aggregate an add documents request
fn add_documents(
&self,
documents_query: &UpdateDocumentsQuery,
index_creation: bool,
request: &HttpRequest,
);
// this method should be called to batch an update documents request
fn update_documents(
&self,
documents_query: &UpdateDocumentsQuery,
index_creation: bool,
request: &HttpRequest,
);
}
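
As a concrete illustration of the path flattening performed by `config_user_id_path` above, here is a sketch that skips `canonicalize()` and the platform config-dir lookup and substitutes invented paths:

```rust
use std::path::{Path, PathBuf};

// Mirrors the transformation above, with fixed example paths instead of
// `canonicalize()` and the `platform_dirs` lookup.
fn config_user_id_path(db_path: &Path, config_dir: &Path) -> PathBuf {
    let filename = db_path
        .join("instance-uid")
        .display()
        .to_string()
        .replace("/", "-");
    config_dir.join(filename.trim_start_matches('-'))
}

fn main() {
    let path = config_user_id_path(
        Path::new("/var/lib/meilisearch/data.ms"),
        Path::new("/home/user/.config/MeiliSearch"),
    );
    // -> /home/user/.config/MeiliSearch/var-lib-meilisearch-data.ms-instance-uid
    println!("{}", path.display());
}
```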


@@ -0,0 +1,547 @@
use std::collections::{BinaryHeap, HashMap, HashSet};
use std::fs;
use std::path::Path;
use std::sync::Arc;
use std::time::{Duration, Instant};
use actix_web::http::header::USER_AGENT;
use actix_web::HttpRequest;
use http::header::CONTENT_TYPE;
use meilisearch_lib::index::{SearchQuery, SearchResult};
use meilisearch_lib::index_controller::Stats;
use meilisearch_lib::MeiliSearch;
use once_cell::sync::Lazy;
use regex::Regex;
use segment::message::{Identify, Track, User};
use segment::{AutoBatcher, Batcher, HttpClient};
use serde_json::{json, Value};
use sysinfo::{DiskExt, System, SystemExt};
use tokio::select;
use tokio::sync::mpsc::{self, Receiver, Sender};
use uuid::Uuid;
use crate::analytics::Analytics;
use crate::routes::indexes::documents::UpdateDocumentsQuery;
use crate::Opt;
use super::{config_user_id_path, MEILISEARCH_CONFIG_PATH};
/// Write the instance-uid in the `data.ms` and in `~/.config/MeiliSearch/path-to-db-instance-uid`. Ignore the errors.
fn write_user_id(db_path: &Path, user_id: &str) {
let _ = fs::write(db_path.join("instance-uid"), user_id.as_bytes());
if let Some((meilisearch_config_path, user_id_path)) = MEILISEARCH_CONFIG_PATH
.as_ref()
.zip(config_user_id_path(db_path))
{
let _ = fs::create_dir_all(&meilisearch_config_path);
let _ = fs::write(user_id_path, user_id.as_bytes());
}
}
const SEGMENT_API_KEY: &str = "P3FWhhEsJiEDCuEHpmcN9DHcK4hVfBvb";
pub fn extract_user_agents(request: &HttpRequest) -> Vec<String> {
request
.headers()
.get(USER_AGENT)
.map(|header| header.to_str().ok())
.flatten()
.unwrap_or("unknown")
.split(';')
.map(str::trim)
.map(ToString::to_string)
.collect()
}
pub enum AnalyticsMsg {
BatchMessage(Track),
AggregateGetSearch(SearchAggregator),
AggregatePostSearch(SearchAggregator),
AggregateAddDocuments(DocumentsAggregator),
AggregateUpdateDocuments(DocumentsAggregator),
}
pub struct SegmentAnalytics {
sender: Sender<AnalyticsMsg>,
user: User,
}
impl SegmentAnalytics {
pub async fn new(opt: &Opt, meilisearch: &MeiliSearch) -> (Arc<dyn Analytics>, String) {
let user_id = super::find_user_id(&opt.db_path);
let first_time_run = user_id.is_none();
let user_id = user_id.unwrap_or_else(|| Uuid::new_v4().to_string());
write_user_id(&opt.db_path, &user_id);
let client = HttpClient::default();
let user = User::UserId { user_id };
let mut batcher = AutoBatcher::new(client, Batcher::new(None), SEGMENT_API_KEY.to_string());
// If MeiliSearch is launched for the first time:
// 1. Send an event Launched associated with the user `total_launch`.
// 2. Batch an event Launched with the real instance-id and send it in one hour.
if first_time_run {
let _ = batcher
.push(Track {
user: User::UserId {
user_id: "total_launch".to_string(),
},
event: "Launched".to_string(),
..Default::default()
})
.await;
let _ = batcher.flush().await;
let _ = batcher
.push(Track {
user: user.clone(),
event: "Launched".to_string(),
..Default::default()
})
.await;
}
let (sender, inbox) = mpsc::channel(100); // how many analytics messages we can buffer
let segment = Box::new(Segment {
inbox,
user: user.clone(),
opt: opt.clone(),
batcher,
post_search_aggregator: SearchAggregator::default(),
get_search_aggregator: SearchAggregator::default(),
add_documents_aggregator: DocumentsAggregator::default(),
update_documents_aggregator: DocumentsAggregator::default(),
});
tokio::spawn(segment.run(meilisearch.clone()));
let this = Self {
sender,
user: user.clone(),
};
(Arc::new(this), user.to_string())
}
}
impl super::Analytics for SegmentAnalytics {
fn publish(&self, event_name: String, mut send: Value, request: Option<&HttpRequest>) {
let user_agent = request
.map(|req| req.headers().get(USER_AGENT))
.flatten()
.map(|header| header.to_str().unwrap_or("unknown"))
.map(|s| s.split(';').map(str::trim).collect::<Vec<&str>>());
send["user-agent"] = json!(user_agent);
let event = Track {
user: self.user.clone(),
event: event_name.clone(),
properties: send,
..Default::default()
};
let _ = self
.sender
.try_send(AnalyticsMsg::BatchMessage(event.into()));
}
fn get_search(&self, aggregate: SearchAggregator) {
let _ = self
.sender
.try_send(AnalyticsMsg::AggregateGetSearch(aggregate));
}
fn post_search(&self, aggregate: SearchAggregator) {
let _ = self
.sender
.try_send(AnalyticsMsg::AggregatePostSearch(aggregate));
}
fn add_documents(
&self,
documents_query: &UpdateDocumentsQuery,
index_creation: bool,
request: &HttpRequest,
) {
let aggregate = DocumentsAggregator::from_query(documents_query, index_creation, request);
let _ = self
.sender
.try_send(AnalyticsMsg::AggregateAddDocuments(aggregate));
}
fn update_documents(
&self,
documents_query: &UpdateDocumentsQuery,
index_creation: bool,
request: &HttpRequest,
) {
let aggregate = DocumentsAggregator::from_query(documents_query, index_creation, request);
let _ = self
.sender
.try_send(AnalyticsMsg::AggregateUpdateDocuments(aggregate));
}
}
pub struct Segment {
inbox: Receiver<AnalyticsMsg>,
user: User,
opt: Opt,
batcher: AutoBatcher,
get_search_aggregator: SearchAggregator,
post_search_aggregator: SearchAggregator,
add_documents_aggregator: DocumentsAggregator,
update_documents_aggregator: DocumentsAggregator,
}
impl Segment {
fn compute_traits(opt: &Opt, stats: Stats) -> Value {
static FIRST_START_TIMESTAMP: Lazy<Instant> = Lazy::new(Instant::now);
static SYSTEM: Lazy<Value> = Lazy::new(|| {
let mut sys = System::new_all();
sys.refresh_all();
let kernel_version = sys
.kernel_version()
.map(|k| k.split_once("-").map(|(k, _)| k.to_string()))
.flatten();
json!({
"distribution": sys.name(),
"kernel_version": kernel_version,
"cores": sys.processors().len(),
"ram_size": sys.total_memory(),
"disk_size": sys.disks().iter().map(|disk| disk.total_space()).max(),
"server_provider": std::env::var("MEILI_SERVER_PROVIDER").ok(),
})
});
let infos = json!({
"env": opt.env.clone(),
"has_snapshot": opt.schedule_snapshot,
});
let number_of_documents = stats
.indexes
.values()
.map(|index| index.number_of_documents)
.collect::<Vec<u64>>();
json!({
"start_since_days": FIRST_START_TIMESTAMP.elapsed().as_secs() / (60 * 60 * 24), // one day
"system": *SYSTEM,
"stats": {
"database_size": stats.database_size,
"indexes_number": stats.indexes.len(),
"documents_number": number_of_documents,
},
"infos": infos,
})
}
async fn run(mut self, meilisearch: MeiliSearch) {
const INTERVAL: Duration = Duration::from_secs(60 * 60); // one hour
// The first batch must be sent after one hour.
let mut interval =
tokio::time::interval_at(tokio::time::Instant::now() + INTERVAL, INTERVAL);
loop {
select! {
_ = interval.tick() => {
self.tick(meilisearch.clone()).await;
},
msg = self.inbox.recv() => {
match msg {
Some(AnalyticsMsg::BatchMessage(msg)) => drop(self.batcher.push(msg).await),
Some(AnalyticsMsg::AggregateGetSearch(agreg)) => self.get_search_aggregator.aggregate(agreg),
Some(AnalyticsMsg::AggregatePostSearch(agreg)) => self.post_search_aggregator.aggregate(agreg),
Some(AnalyticsMsg::AggregateAddDocuments(agreg)) => self.add_documents_aggregator.aggregate(agreg),
Some(AnalyticsMsg::AggregateUpdateDocuments(agreg)) => self.update_documents_aggregator.aggregate(agreg),
None => (),
}
}
}
}
}
async fn tick(&mut self, meilisearch: MeiliSearch) {
if let Ok(stats) = meilisearch.get_all_stats(&None).await {
let _ = self
.batcher
.push(Identify {
context: Some(json!({
"app": {
"version": env!("CARGO_PKG_VERSION").to_string(),
},
})),
user: self.user.clone(),
traits: Self::compute_traits(&self.opt, stats),
..Default::default()
})
.await;
}
let get_search = std::mem::take(&mut self.get_search_aggregator)
.into_event(&self.user, "Documents Searched GET");
let post_search = std::mem::take(&mut self.post_search_aggregator)
.into_event(&self.user, "Documents Searched POST");
let add_documents = std::mem::take(&mut self.add_documents_aggregator)
.into_event(&self.user, "Documents Added");
let update_documents = std::mem::take(&mut self.update_documents_aggregator)
.into_event(&self.user, "Documents Updated");
if let Some(get_search) = get_search {
let _ = self.batcher.push(get_search).await;
}
if let Some(post_search) = post_search {
let _ = self.batcher.push(post_search).await;
}
if let Some(add_documents) = add_documents {
let _ = self.batcher.push(add_documents).await;
}
if let Some(update_documents) = update_documents {
let _ = self.batcher.push(update_documents).await;
}
let _ = self.batcher.flush().await;
}
}
#[derive(Default)]
pub struct SearchAggregator {
// context
user_agents: HashSet<String>,
// requests
total_received: usize,
total_succeeded: usize,
time_spent: BinaryHeap<usize>,
// sort
sort_with_geo_point: bool,
// every time a request has a sort, this field must be incremented by the number of sort criteria it contains
sort_sum_of_criteria_terms: usize,
// every time a request has a sort, this field must be incremented by one
sort_total_number_of_criteria: usize,
// filter
filter_with_geo_radius: bool,
// every time a request has a filter, this field must be incremented by the number of terms it contains
filter_sum_of_criteria_terms: usize,
// every time a request has a filter, this field must be incremented by one
filter_total_number_of_criteria: usize,
used_syntax: HashMap<String, usize>,
// q
// The maximum number of terms in a q request
max_terms_number: usize,
// pagination
max_limit: usize,
max_offset: usize,
}
impl SearchAggregator {
pub fn from_query(query: &SearchQuery, request: &HttpRequest) -> Self {
let mut ret = Self::default();
ret.total_received = 1;
ret.user_agents = extract_user_agents(request).into_iter().collect();
if let Some(ref sort) = query.sort {
ret.sort_total_number_of_criteria = 1;
ret.sort_with_geo_point = sort.iter().any(|s| s.contains("_geoPoint("));
ret.sort_sum_of_criteria_terms = sort.len();
}
if let Some(ref filter) = query.filter {
static RE: Lazy<Regex> = Lazy::new(|| Regex::new("AND | OR").unwrap());
ret.filter_total_number_of_criteria = 1;
let syntax = match filter {
Value::String(_) => "string".to_string(),
Value::Array(values) => {
if values
.iter()
.map(|v| v.to_string())
.any(|s| RE.is_match(&s))
{
"mixed".to_string()
} else {
"array".to_string()
}
}
_ => "none".to_string(),
};
// convert the string to a HashMap
ret.used_syntax.insert(syntax, 1);
let stringified_filters = filter.to_string();
ret.filter_with_geo_radius = stringified_filters.contains("_geoRadius(");
ret.filter_sum_of_criteria_terms = RE.split(&stringified_filters).count();
}
if let Some(ref q) = query.q {
ret.max_terms_number = q.split_whitespace().count();
}
ret.max_limit = query.limit;
ret.max_offset = query.offset.unwrap_or_default();
ret
}
pub fn succeed(&mut self, result: &SearchResult) {
self.total_succeeded = self.total_succeeded.saturating_add(1);
self.time_spent.push(result.processing_time_ms as usize);
}
/// Aggregate one [SearchAggregator] into another.
pub fn aggregate(&mut self, mut other: Self) {
// context
for user_agent in other.user_agents.into_iter() {
self.user_agents.insert(user_agent);
}
// request
self.total_received = self.total_received.saturating_add(other.total_received);
self.total_succeeded = self.total_succeeded.saturating_add(other.total_succeeded);
self.time_spent.append(&mut other.time_spent);
// sort
self.sort_with_geo_point |= other.sort_with_geo_point;
self.sort_sum_of_criteria_terms = self
.sort_sum_of_criteria_terms
.saturating_add(other.sort_sum_of_criteria_terms);
self.sort_total_number_of_criteria = self
.sort_total_number_of_criteria
.saturating_add(other.sort_total_number_of_criteria);
// filter
self.filter_with_geo_radius |= other.filter_with_geo_radius;
self.filter_sum_of_criteria_terms = self
.filter_sum_of_criteria_terms
.saturating_add(other.filter_sum_of_criteria_terms);
self.filter_total_number_of_criteria = self
.filter_total_number_of_criteria
.saturating_add(other.filter_total_number_of_criteria);
for (key, value) in other.used_syntax.into_iter() {
let used_syntax = self.used_syntax.entry(key).or_insert(0);
*used_syntax = used_syntax.saturating_add(value);
}
// q
self.max_terms_number = self.max_terms_number.max(other.max_terms_number);
// pagination
self.max_limit = self.max_limit.max(other.max_limit);
self.max_offset = self.max_offset.max(other.max_offset);
}
pub fn into_event(self, user: &User, event_name: &str) -> Option<Track> {
if self.total_received == 0 {
None
} else {
// the index of the 99th percentile value
let percentile_99th = 0.99 * (self.total_succeeded as f64 - 1.) + 1.;
// we get all the values in a sorted manner
let time_spent = self.time_spent.into_sorted_vec();
// we are only interested in the slowest value among the 99% fastest results
let time_spent = time_spent.get(percentile_99th as usize);
let properties = json!({
"user-agent": self.user_agents,
"requests": {
"99th_response_time": time_spent.map(|t| format!("{:.2}", t)),
"total_succeeded": self.total_succeeded,
"total_failed": self.total_received.saturating_sub(self.total_succeeded), // just to be sure we never panics
"total_received": self.total_received,
},
"sort": {
"with_geoPoint": self.sort_with_geo_point,
"avg_criteria_number": format!("{:.2}", self.sort_sum_of_criteria_terms as f64 / self.sort_total_number_of_criteria as f64),
},
"filter": {
"with_geoRadius": self.filter_with_geo_radius,
"avg_criteria_number": format!("{:.2}", self.filter_sum_of_criteria_terms as f64 / self.filter_total_number_of_criteria as f64),
"most_used_syntax": self.used_syntax.iter().max_by_key(|(_, v)| *v).map(|(k, _)| json!(k)).unwrap_or_else(|| json!(null)),
},
"q": {
"max_terms_number": self.max_terms_number,
},
"pagination": {
"max_limit": self.max_limit,
"max_offset": self.max_offset,
},
});
Some(Track {
user: user.clone(),
event: event_name.to_string(),
properties,
..Default::default()
})
}
}
}
#[derive(Default)]
pub struct DocumentsAggregator {
// set to true when at least one request was received
updated: bool,
// context
user_agents: HashSet<String>,
content_types: HashSet<String>,
primary_keys: HashSet<String>,
index_creation: bool,
}
impl DocumentsAggregator {
pub fn from_query(
documents_query: &UpdateDocumentsQuery,
index_creation: bool,
request: &HttpRequest,
) -> Self {
let mut ret = Self::default();
ret.updated = true;
ret.user_agents = extract_user_agents(request).into_iter().collect();
if let Some(primary_key) = documents_query.primary_key.clone() {
ret.primary_keys.insert(primary_key);
}
let content_type = request
.headers()
.get(CONTENT_TYPE)
.map(|s| s.to_str().unwrap_or("unknown"))
.unwrap()
.to_string();
ret.content_types.insert(content_type);
ret.index_creation = index_creation;
ret
}
/// Aggregate one [DocumentsAggregator] into another.
pub fn aggregate(&mut self, other: Self) {
self.updated |= other.updated;
// we can't create a union because there is no `into_union` method
for user_agent in other.user_agents.into_iter() {
self.user_agents.insert(user_agent);
}
for primary_key in other.primary_keys.into_iter() {
self.primary_keys.insert(primary_key);
}
for content_type in other.content_types.into_iter() {
self.content_types.insert(content_type);
}
self.index_creation |= other.index_creation;
}
pub fn into_event(self, user: &User, event_name: &str) -> Option<Track> {
if !self.updated {
None
} else {
let properties = json!({
"user-agent": self.user_agents,
"payload_type": self.content_types,
"primary_key": self.primary_keys,
"index_creation": self.index_creation,
});
Some(Track {
user: user.clone(),
event: event_name.to_string(),
properties,
..Default::default()
})
}
}
}
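
A quick worked example of the 99th-percentile lookup in `SearchAggregator::into_event` above: with 200 succeeded requests the index is `0.99 * (200 - 1) + 1 = 198.01`, truncated to `198`, so the reported value is the 199th smallest response time. The timings below are invented for the example:

```rust
use std::collections::BinaryHeap;

fn main() {
    // Invented response times in milliseconds, as they would be pushed by `succeed`.
    let time_spent: BinaryHeap<usize> = (1..=200).map(|i| i * 3).collect();
    let total_succeeded = time_spent.len();

    // Same index computation as `into_event` above.
    let percentile_99th = 0.99 * (total_succeeded as f64 - 1.) + 1.; // = 198.01
    let sorted = time_spent.into_sorted_vec();
    let p99 = sorted.get(percentile_99th as usize); // sorted[198]

    println!("99th percentile response time: {:?} ms", p99); // Some(597)
}
```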


@@ -1,133 +0,0 @@
use std::ops::Deref;
use std::sync::Arc;
use sha2::Digest;
use crate::index::{Checked, Settings};
use crate::index_controller::{
error::Result, DumpInfo, IndexController, IndexMetadata, IndexSettings, IndexStats, Stats,
};
use crate::option::Opt;
pub mod search;
mod updates;
#[derive(Clone)]
pub struct Data {
inner: Arc<DataInner>,
}
impl Deref for Data {
type Target = DataInner;
fn deref(&self) -> &Self::Target {
&self.inner
}
}
pub struct DataInner {
pub index_controller: IndexController,
pub api_keys: ApiKeys,
options: Opt,
}
#[derive(Clone)]
pub struct ApiKeys {
pub public: Option<String>,
pub private: Option<String>,
pub master: Option<String>,
}
impl ApiKeys {
pub fn generate_missing_api_keys(&mut self) {
if let Some(master_key) = &self.master {
if self.private.is_none() {
let key = format!("{}-private", master_key);
let sha = sha2::Sha256::digest(key.as_bytes());
self.private = Some(format!("{:x}", sha));
}
if self.public.is_none() {
let key = format!("{}-public", master_key);
let sha = sha2::Sha256::digest(key.as_bytes());
self.public = Some(format!("{:x}", sha));
}
}
}
}
impl Data {
pub fn new(options: Opt) -> anyhow::Result<Data> {
let path = options.db_path.clone();
let index_controller = IndexController::new(&path, &options)?;
let mut api_keys = ApiKeys {
master: options.clone().master_key,
private: None,
public: None,
};
api_keys.generate_missing_api_keys();
let inner = DataInner {
index_controller,
api_keys,
options,
};
let inner = Arc::new(inner);
Ok(Data { inner })
}
pub async fn settings(&self, uid: String) -> Result<Settings<Checked>> {
self.index_controller.settings(uid).await
}
pub async fn list_indexes(&self) -> Result<Vec<IndexMetadata>> {
self.index_controller.list_indexes().await
}
pub async fn index(&self, uid: String) -> Result<IndexMetadata> {
self.index_controller.get_index(uid).await
}
pub async fn create_index(
&self,
uid: String,
primary_key: Option<String>,
) -> Result<IndexMetadata> {
let settings = IndexSettings {
uid: Some(uid),
primary_key,
};
let meta = self.index_controller.create_index(settings).await?;
Ok(meta)
}
pub async fn get_index_stats(&self, uid: String) -> Result<IndexStats> {
Ok(self.index_controller.get_index_stats(uid).await?)
}
pub async fn get_all_stats(&self) -> Result<Stats> {
Ok(self.index_controller.get_all_stats().await?)
}
pub async fn create_dump(&self) -> Result<DumpInfo> {
Ok(self.index_controller.create_dump().await?)
}
pub async fn dump_status(&self, uid: String) -> Result<DumpInfo> {
Ok(self.index_controller.dump_info(uid).await?)
}
#[inline]
pub fn http_payload_size_limit(&self) -> usize {
self.options.http_payload_size_limit.get_bytes() as usize
}
#[inline]
pub fn api_keys(&self) -> &ApiKeys {
&self.api_keys
}
}


@@ -1,34 +0,0 @@
use serde_json::{Map, Value};
use super::Data;
use crate::index::{SearchQuery, SearchResult};
use crate::index_controller::error::Result;
impl Data {
pub async fn search(&self, index: String, search_query: SearchQuery) -> Result<SearchResult> {
self.index_controller.search(index, search_query).await
}
pub async fn retrieve_documents(
&self,
index: String,
offset: usize,
limit: usize,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Vec<Map<String, Value>>> {
self.index_controller
.documents(index, offset, limit, attributes_to_retrieve)
.await
}
pub async fn retrieve_document(
&self,
index: String,
document_id: String,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Map<String, Value>> {
self.index_controller
.document(index, document_id, attributes_to_retrieve)
.await
}
}


@@ -1,80 +0,0 @@
use milli::update::{IndexDocumentsMethod, UpdateFormat};
use crate::extractors::payload::Payload;
use crate::index::{Checked, Settings};
use crate::index_controller::{error::Result, IndexMetadata, IndexSettings, UpdateStatus};
use crate::Data;
impl Data {
pub async fn add_documents(
&self,
index: String,
method: IndexDocumentsMethod,
format: UpdateFormat,
stream: Payload,
primary_key: Option<String>,
) -> Result<UpdateStatus> {
let update_status = self
.index_controller
.add_documents(index, method, format, stream, primary_key)
.await?;
Ok(update_status)
}
pub async fn update_settings(
&self,
index: String,
settings: Settings<Checked>,
create: bool,
) -> Result<UpdateStatus> {
let update = self
.index_controller
.update_settings(index, settings, create)
.await?;
Ok(update)
}
pub async fn clear_documents(&self, index: String) -> Result<UpdateStatus> {
let update = self.index_controller.clear_documents(index).await?;
Ok(update)
}
pub async fn delete_documents(
&self,
index: String,
document_ids: Vec<String>,
) -> Result<UpdateStatus> {
let update = self
.index_controller
.delete_documents(index, document_ids)
.await?;
Ok(update)
}
pub async fn delete_index(&self, index: String) -> Result<()> {
self.index_controller.delete_index(index).await?;
Ok(())
}
pub async fn get_update_status(&self, index: String, uid: u64) -> Result<UpdateStatus> {
self.index_controller.update_status(index, uid).await
}
pub async fn get_updates_status(&self, index: String) -> Result<Vec<UpdateStatus>> {
self.index_controller.all_update_status(index).await
}
pub async fn update_index(
&self,
uid: String,
primary_key: Option<String>,
new_uid: Option<String>,
) -> Result<IndexMetadata> {
let settings = IndexSettings {
uid: new_uid,
primary_key,
};
self.index_controller.update_index(uid, settings).await
}
}


@@ -1,145 +1,57 @@
use std::error::Error;
use std::fmt;
use actix_web as aweb;
use actix_web::body::Body;
use actix_web::dev::BaseHttpResponseBuilder;
use actix_web::http::StatusCode;
use aweb::error::{JsonPayloadError, QueryPayloadError};
use meilisearch_error::{Code, ErrorCode};
use milli::UserError;
use serde::{Deserialize, Serialize};
use meilisearch_error::{Code, ErrorCode, ResponseError};
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct ResponseError {
#[serde(skip)]
code: StatusCode,
message: String,
error_code: String,
error_type: String,
error_link: String,
#[derive(Debug, thiserror::Error)]
pub enum MeilisearchHttpError {
#[error("A Content-Type header is missing. Accepted values for the Content-Type header are: {}",
.0.iter().map(|s| format!("`{}`", s)).collect::<Vec<_>>().join(", "))]
MissingContentType(Vec<String>),
#[error(
"The Content-Type `{0}` is invalid. Accepted values for the Content-Type header are: {}",
.1.iter().map(|s| format!("`{}`", s)).collect::<Vec<_>>().join(", ")
)]
InvalidContentType(String, Vec<String>),
}
impl fmt::Display for ResponseError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
self.message.fmt(f)
}
}
impl<T> From<T> for ResponseError
where
T: ErrorCode,
{
fn from(other: T) -> Self {
Self {
code: other.http_status(),
message: other.to_string(),
error_code: other.error_name(),
error_type: other.error_type(),
error_link: other.error_url(),
}
}
}
impl aweb::error::ResponseError for ResponseError {
fn error_response(&self) -> aweb::BaseHttpResponse<Body> {
let json = serde_json::to_vec(self).unwrap();
BaseHttpResponseBuilder::new(self.status_code())
.content_type("application/json")
.body(json)
}
fn status_code(&self) -> StatusCode {
self.code
}
}
macro_rules! internal_error {
($target:ty : $($other:path), *) => {
$(
impl From<$other> for $target {
fn from(other: $other) -> Self {
Self::Internal(Box::new(other))
}
}
)*
}
}
#[derive(Debug)]
pub struct MilliError<'a>(pub &'a milli::Error);
impl Error for MilliError<'_> {}
impl fmt::Display for MilliError<'_> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
self.0.fmt(f)
}
}
impl ErrorCode for MilliError<'_> {
impl ErrorCode for MeilisearchHttpError {
fn error_code(&self) -> Code {
match self.0 {
milli::Error::InternalError(_) => Code::Internal,
milli::Error::IoError(_) => Code::Internal,
milli::Error::UserError(ref error) => {
match error {
// TODO: wait for spec for new error codes.
UserError::Csv(_)
| UserError::SerdeJson(_)
| UserError::MaxDatabaseSizeReached
| UserError::InvalidCriterionName { .. }
| UserError::InvalidDocumentId { .. }
| UserError::InvalidStoreFile
| UserError::NoSpaceLeftOnDevice
| UserError::InvalidAscDescSyntax { .. }
| UserError::DocumentLimitReached => Code::Internal,
UserError::AttributeLimitReached => Code::MaxFieldsLimitExceeded,
UserError::InvalidFilter(_) => Code::Filter,
UserError::InvalidFilterAttribute(_) => Code::Filter,
UserError::InvalidSortName { .. } => Code::Sort,
UserError::MissingDocumentId { .. } => Code::MissingDocumentId,
UserError::MissingPrimaryKey => Code::MissingPrimaryKey,
UserError::PrimaryKeyCannotBeChanged => Code::PrimaryKeyAlreadyPresent,
UserError::PrimaryKeyCannotBeReset => Code::PrimaryKeyAlreadyPresent,
UserError::SortRankingRuleMissing => Code::Sort,
UserError::UnknownInternalDocumentId { .. } => Code::DocumentNotFound,
UserError::InvalidFacetsDistribution { .. } => Code::BadRequest,
UserError::InvalidSortableAttribute { .. } => Code::Sort,
}
}
}
}
}
impl fmt::Display for PayloadError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
PayloadError::Json(e) => e.fmt(f),
PayloadError::Query(e) => e.fmt(f),
MeilisearchHttpError::MissingContentType(_) => Code::MissingContentType,
MeilisearchHttpError::InvalidContentType(_, _) => Code::InvalidContentType,
}
}
}
#[derive(Debug)]
pub enum PayloadError {
Json(JsonPayloadError),
Query(QueryPayloadError),
impl From<MeilisearchHttpError> for aweb::Error {
fn from(other: MeilisearchHttpError) -> Self {
aweb::Error::from(ResponseError::from(other))
}
}
impl Error for PayloadError {}
#[derive(Debug, thiserror::Error)]
pub enum PayloadError {
#[error("{0}")]
Json(JsonPayloadError),
#[error("{0}")]
Query(QueryPayloadError),
#[error("The json payload provided is malformed. `{0}`.")]
MalformedPayload(serde_json::error::Error),
#[error("A json payload is missing.")]
MissingPayload,
}
impl ErrorCode for PayloadError {
fn error_code(&self) -> Code {
match self {
PayloadError::Json(err) => match err {
JsonPayloadError::Overflow => Code::PayloadTooLarge,
JsonPayloadError::Overflow { .. } => Code::PayloadTooLarge,
JsonPayloadError::ContentType => Code::UnsupportedMediaType,
JsonPayloadError::Payload(aweb::error::PayloadError::Overflow) => {
Code::PayloadTooLarge
}
JsonPayloadError::Deserialize(_) | JsonPayloadError::Payload(_) => Code::BadRequest,
JsonPayloadError::Payload(_) => Code::BadRequest,
JsonPayloadError::Deserialize(_) => Code::BadRequest,
JsonPayloadError::Serialize(_) => Code::Internal,
_ => Code::Internal,
},
@@ -147,13 +59,29 @@ impl ErrorCode for PayloadError {
QueryPayloadError::Deserialize(_) => Code::BadRequest,
_ => Code::Internal,
},
PayloadError::MissingPayload => Code::MissingPayload,
PayloadError::MalformedPayload(_) => Code::MalformedPayload,
}
}
}
impl From<JsonPayloadError> for PayloadError {
fn from(other: JsonPayloadError) -> Self {
Self::Json(other)
match other {
JsonPayloadError::Deserialize(e)
if e.classify() == serde_json::error::Category::Eof
&& e.line() == 1
&& e.column() == 0 =>
{
Self::MissingPayload
}
JsonPayloadError::Deserialize(e)
if e.classify() != serde_json::error::Category::Data =>
{
Self::MalformedPayload(e)
}
_ => Self::Json(other),
}
}
}
@@ -163,9 +91,8 @@ impl From<QueryPayloadError> for PayloadError {
}
}
pub fn payload_error_handler<E>(err: E) -> ResponseError
where
E: Into<PayloadError>,
{
err.into().into()
impl From<PayloadError> for aweb::Error {
fn from(other: PayloadError) -> Self {
aweb::Error::from(ResponseError::from(other))
}
}
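
The `From<JsonPayloadError>` conversion above distinguishes an empty request body from a broken one by inspecting the underlying serde_json error. A minimal, self-contained sketch of that classification rule (it assumes only `serde_json`, not the surrounding actix types):

```rust
use serde_json::{error::Category, Value};

/// Mirrors the classification in `From<JsonPayloadError> for PayloadError` above:
/// an EOF error at line 1, column 0 means no payload was sent at all; any other
/// non-data error means the payload was malformed JSON.
fn classify(raw: &str) -> &'static str {
    match serde_json::from_str::<Value>(raw) {
        Ok(_) => "valid json",
        Err(e) if e.classify() == Category::Eof && e.line() == 1 && e.column() == 0 => {
            "missing payload"
        }
        Err(e) if e.classify() != Category::Data => "malformed payload",
        Err(_) => "data error (kept as a plain deserialize error)",
    }
}

fn main() {
    assert_eq!(classify(""), "missing payload");
    assert_eq!(classify("{ \"truncated\":"), "malformed payload");
    assert_eq!(classify("{}"), "valid json");
}
```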

View File

@@ -2,15 +2,13 @@ use meilisearch_error::{Code, ErrorCode};
#[derive(Debug, thiserror::Error)]
pub enum AuthenticationError {
#[error("You must have an authorization token")]
#[error("The Authorization header is missing. It must use the bearer authorization method.")]
MissingAuthorizationHeader,
#[error("Invalid API key")]
#[error("The provided API key is invalid.")]
InvalidToken(String),
// Triggered on configuration error.
#[error("Irretrievable state")]
#[error("An internal error has occurred. `Irretrievable state`.")]
IrretrievableState,
#[error("Unknown authentication policy")]
UnknownPolicy,
}
impl ErrorCode for AuthenticationError {
@@ -19,7 +17,6 @@ impl ErrorCode for AuthenticationError {
AuthenticationError::MissingAuthorizationHeader => Code::MissingAuthorizationHeader,
AuthenticationError::InvalidToken(_) => Code::InvalidToken,
AuthenticationError::IrretrievableState => Code::Internal,
AuthenticationError::UnknownPolicy => Code::Internal,
}
}
}

View File

@@ -1,83 +1,28 @@
mod error;
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::marker::PhantomData;
use std::ops::Deref;
use actix_web::FromRequest;
use futures::future::err;
use futures::future::{ok, Ready};
use meilisearch_error::ResponseError;
use crate::error::ResponseError;
use error::AuthenticationError;
macro_rules! create_policies {
($($name:ident), *) => {
pub mod policies {
use std::collections::HashSet;
use crate::extractors::authentication::Policy;
$(
#[derive(Debug, Default)]
pub struct $name {
inner: HashSet<Vec<u8>>
}
impl $name {
pub fn new() -> Self {
Self { inner: HashSet::new() }
}
pub fn add(&mut self, token: Vec<u8>) {
self.inner.insert(token);
}
}
impl Policy for $name {
fn authenticate(&self, token: &[u8]) -> bool {
self.inner.contains(token)
}
}
)*
}
};
}
create_policies!(Public, Private, Admin);
/// Instantiate a `Policies`, filled with the given policies.
macro_rules! init_policies {
($($name:ident), *) => {
{
let mut policies = crate::extractors::authentication::Policies::new();
$(
let policy = $name::new();
policies.insert(policy);
)*
policies
}
};
}
/// Adds user to all specified policies.
macro_rules! create_users {
($policies:ident, $($user:expr => { $($policy:ty), * }), *) => {
{
$(
$(
$policies.get_mut::<$policy>().map(|p| p.add($user.to_owned()));
)*
)*
}
};
}
use meilisearch_auth::{AuthController, AuthFilter};
pub struct GuardedData<T, D> {
data: D,
filters: AuthFilter,
_marker: PhantomData<T>,
}
impl<T, D> GuardedData<T, D> {
pub fn filters(&self) -> &AuthFilter {
&self.filters
}
}
impl<T, D> Deref for GuardedData<T, D> {
type Target = D;
@@ -86,58 +31,8 @@ impl<T, D> Deref for GuardedData<T, D> {
}
}
pub trait Policy {
fn authenticate(&self, token: &[u8]) -> bool;
}
#[derive(Debug)]
pub struct Policies {
inner: HashMap<TypeId, Box<dyn Any>>,
}
impl Policies {
pub fn new() -> Self {
Self {
inner: HashMap::new(),
}
}
pub fn insert<S: Policy + 'static>(&mut self, policy: S) {
self.inner.insert(TypeId::of::<S>(), Box::new(policy));
}
pub fn get<S: Policy + 'static>(&self) -> Option<&S> {
self.inner
.get(&TypeId::of::<S>())
.and_then(|p| p.downcast_ref::<S>())
}
pub fn get_mut<S: Policy + 'static>(&mut self) -> Option<&mut S> {
self.inner
.get_mut(&TypeId::of::<S>())
.and_then(|p| p.downcast_mut::<S>())
}
}
impl Default for Policies {
fn default() -> Self {
Self::new()
}
}
pub enum AuthConfig {
NoAuth,
Auth(Policies),
}
impl Default for AuthConfig {
fn default() -> Self {
Self::NoAuth
}
}
impl<P: Policy + 'static, D: 'static + Clone> FromRequest for GuardedData<P, D> {
type Config = AuthConfig;
type Config = ();
type Error = ResponseError;
@@ -145,38 +40,103 @@ impl<P: Policy + 'static, D: 'static + Clone> FromRequest for GuardedData<P, D>
fn from_request(
req: &actix_web::HttpRequest,
_payload: &mut actix_http::Payload,
_payload: &mut actix_web::dev::Payload,
) -> Self::Future {
match req.app_data::<Self::Config>() {
Some(config) => match config {
AuthConfig::NoAuth => match req.app_data::<D>().cloned() {
Some(data) => ok(Self {
data,
_marker: PhantomData,
}),
None => err(AuthenticationError::IrretrievableState.into()),
},
AuthConfig::Auth(policies) => match policies.get::<P>() {
Some(policy) => match req.headers().get("x-meili-api-key") {
Some(token) => {
if policy.authenticate(token.as_bytes()) {
match req.app_data::<D>().cloned() {
Some(data) => ok(Self {
data,
_marker: PhantomData,
}),
None => err(AuthenticationError::IrretrievableState.into()),
}
} else {
err(AuthenticationError::InvalidToken(String::from("hello")).into())
match req.app_data::<AuthController>().cloned() {
Some(auth) => match req
.headers()
.get("Authorization")
.map(|type_token| type_token.to_str().unwrap_or_default().splitn(2, ' '))
{
Some(mut type_token) => match type_token.next() {
Some("Bearer") => {
// TODO: find a less hardcoded way?
let index = req.match_info().get("index_uid");
let token = type_token.next().unwrap_or("unknown");
match P::authenticate(auth, token, index) {
Some(filters) => match req.app_data::<D>().cloned() {
Some(data) => ok(Self {
data,
filters,
_marker: PhantomData,
}),
None => err(AuthenticationError::IrretrievableState.into()),
},
None => {
let token = token.to_string();
err(AuthenticationError::InvalidToken(token).into())
}
}
None => err(AuthenticationError::MissingAuthorizationHeader.into()),
}
_otherwise => err(AuthenticationError::MissingAuthorizationHeader.into()),
},
None => match P::authenticate(auth, "", None) {
Some(filters) => match req.app_data::<D>().cloned() {
Some(data) => ok(Self {
data,
filters,
_marker: PhantomData,
}),
None => err(AuthenticationError::IrretrievableState.into()),
},
None => err(AuthenticationError::UnknownPolicy.into()),
None => err(AuthenticationError::MissingAuthorizationHeader.into()),
},
},
None => err(AuthenticationError::IrretrievableState.into()),
}
}
}
pub trait Policy {
fn authenticate(auth: AuthController, token: &str, index: Option<&str>) -> Option<AuthFilter>;
}
pub mod policies {
use crate::extractors::authentication::Policy;
use meilisearch_auth::{Action, AuthController, AuthFilter};
// Re-export actions in policies so they can be used in route configuration.
pub use meilisearch_auth::actions;
pub struct MasterPolicy;
impl Policy for MasterPolicy {
fn authenticate(
auth: AuthController,
token: &str,
_index: Option<&str>,
) -> Option<AuthFilter> {
if let Some(master_key) = auth.get_master_key() {
if master_key == token {
return Some(AuthFilter::default());
}
}
None
}
}
pub struct ActionPolicy<const A: u8>;
impl<const A: u8> Policy for ActionPolicy<A> {
fn authenticate(
auth: AuthController,
token: &str,
index: Option<&str>,
) -> Option<AuthFilter> {
// authenticate if the token is the master key, or if no master key is set.
if auth.get_master_key().map_or(true, |mk| mk == token) {
return Some(AuthFilter::default());
}
// authenticate if token is allowed.
if let Some(action) = Action::from_repr(A) {
let index = index.map(|i| i.as_bytes());
if let Ok(true) = auth.authenticate(token.as_bytes(), action, index) {
return auth.get_key_filters(token).ok();
}
}
None
}
}
}
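
For reference, the `Policy` contract above only asks an implementor to turn a token (and optional index uid) into `Some(AuthFilter)` on success and `None` on rejection. A hypothetical extra policy, purely to illustrate the shape of an implementation (`AcceptAllPolicy` is not part of the codebase, and the sketch assumes the imports already in scope in the module above):

```rust
/// Hypothetical policy that accepts every request. Real policies, like `ActionPolicy`
/// above, check the token against the master key or the stored API keys instead.
pub struct AcceptAllPolicy;

impl Policy for AcceptAllPolicy {
    fn authenticate(_auth: AuthController, _token: &str, _index: Option<&str>) -> Option<AuthFilter> {
        // `AuthFilter::default()` is the filter the policies above return for master-key access.
        Some(AuthFilter::default())
    }
}
```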

View File

@@ -1,7 +1,7 @@
use std::pin::Pin;
use std::task::{Context, Poll};
use actix_http::error::PayloadError;
use actix_web::error::PayloadError;
use actix_web::{dev, web, FromRequest, HttpRequest};
use futures::future::{ready, Ready};
use futures::Stream;

View File

@@ -1,4 +1,3 @@
pub mod compression;
mod env;
pub use env::EnvSizer;

View File

@@ -1,170 +0,0 @@
use std::fs::{create_dir_all, File};
use std::io::{BufRead, BufReader, Write};
use std::path::Path;
use std::sync::Arc;
use anyhow::{bail, Context};
use heed::RoTxn;
use indexmap::IndexMap;
use milli::update::{IndexDocumentsMethod, UpdateFormat::JsonStream};
use serde::{Deserialize, Serialize};
use serde_json::Value;
use crate::index_controller::{asc_ranking_rule, desc_ranking_rule};
use crate::option::IndexerOpts;
use super::error::Result;
use super::{update_handler::UpdateHandler, Index, Settings, Unchecked};
#[derive(Serialize, Deserialize)]
struct DumpMeta {
settings: Settings<Unchecked>,
primary_key: Option<String>,
}
const META_FILE_NAME: &str = "meta.json";
const DATA_FILE_NAME: &str = "documents.jsonl";
impl Index {
pub fn dump(&self, path: impl AsRef<Path>) -> Result<()> {
// acquire a write txn to make sure any ongoing write is finished before we start.
let txn = self.env.write_txn()?;
self.dump_documents(&txn, &path)?;
self.dump_meta(&txn, &path)?;
Ok(())
}
fn dump_documents(&self, txn: &RoTxn, path: impl AsRef<Path>) -> Result<()> {
let document_file_path = path.as_ref().join(DATA_FILE_NAME);
let mut document_file = File::create(&document_file_path)?;
let documents = self.all_documents(txn)?;
let fields_ids_map = self.fields_ids_map(txn)?;
// dump documents
let mut json_map = IndexMap::new();
for document in documents {
let (_, reader) = document?;
for (fid, bytes) in reader.iter() {
if let Some(name) = fields_ids_map.name(fid) {
json_map.insert(name, serde_json::from_slice::<serde_json::Value>(bytes)?);
}
}
serde_json::to_writer(&mut document_file, &json_map)?;
document_file.write_all(b"\n")?;
json_map.clear();
}
Ok(())
}
fn dump_meta(&self, txn: &RoTxn, path: impl AsRef<Path>) -> Result<()> {
let meta_file_path = path.as_ref().join(META_FILE_NAME);
let mut meta_file = File::create(&meta_file_path)?;
let settings = self.settings_txn(txn)?.into_unchecked();
let primary_key = self.primary_key(txn)?.map(String::from);
let meta = DumpMeta {
settings,
primary_key,
};
serde_json::to_writer(&mut meta_file, &meta)?;
Ok(())
}
pub fn load_dump(
src: impl AsRef<Path>,
dst: impl AsRef<Path>,
size: usize,
indexing_options: &IndexerOpts,
) -> anyhow::Result<()> {
let dir_name = src
.as_ref()
.file_name()
.with_context(|| format!("invalid dump index: {}", src.as_ref().display()))?;
let dst_dir_path = dst.as_ref().join("indexes").join(dir_name);
create_dir_all(&dst_dir_path)?;
let meta_path = src.as_ref().join(META_FILE_NAME);
let mut meta_file = File::open(meta_path)?;
// We first deserialize the dump meta into a serde_json::Value and change
// the custom ranking rules settings from the old format to the new format.
let mut meta: Value = serde_json::from_reader(&mut meta_file)?;
if let Some(ranking_rules) = meta.pointer_mut("/settings/rankingRules") {
convert_custom_ranking_rules(ranking_rules);
}
// Then we serialize it back into a vec to deserialize it
// into a `DumpMeta` struct with the newly patched `rankingRules` format.
let patched_meta = serde_json::to_vec(&meta)?;
let DumpMeta {
settings,
primary_key,
} = serde_json::from_slice(&patched_meta)?;
let settings = settings.check();
let index = Self::open(&dst_dir_path, size)?;
let mut txn = index.write_txn()?;
let handler = UpdateHandler::new(indexing_options)?;
index.update_settings_txn(&mut txn, &settings, handler.update_builder(0))?;
let document_file_path = src.as_ref().join(DATA_FILE_NAME);
let reader = File::open(&document_file_path)?;
let mut reader = BufReader::new(reader);
reader.fill_buf()?;
// If the document file is empty, we don't perform the document addition, to prevent
// a primary key error from being thrown.
if !reader.buffer().is_empty() {
index.update_documents_txn(
&mut txn,
JsonStream,
IndexDocumentsMethod::UpdateDocuments,
Some(reader),
handler.update_builder(0),
primary_key.as_deref(),
)?;
}
txn.commit()?;
match Arc::try_unwrap(index.0) {
Ok(inner) => inner.prepare_for_closing().wait(),
Err(_) => bail!("Could not close index properly."),
}
Ok(())
}
}
/// Converts the ranking rules from the format `asc(_)`, `desc(_)` to the format `_:asc`, `_:desc`.
///
/// This is done for compatibility reasons, and to avoid a new dump version,
/// since the new syntax was introduced soon after the new dump version.
fn convert_custom_ranking_rules(ranking_rules: &mut Value) {
*ranking_rules = match ranking_rules.take() {
Value::Array(values) => values
.into_iter()
.filter_map(|value| match value {
Value::String(s) if s.starts_with("asc") => asc_ranking_rule(&s)
.map(|f| format!("{}:asc", f))
.map(Value::String),
Value::String(s) if s.starts_with("desc") => desc_ranking_rule(&s)
.map(|f| format!("{}:desc", f))
.map(Value::String),
otherwise => Some(otherwise),
})
.collect(),
otherwise => otherwise,
}
}
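
Concretely, per the doc comment above, a dump whose settings contain a legacy custom rule is patched in place before being deserialized into `DumpMeta`. A hedged illustration as a test sketch (it assumes `asc_ranking_rule("asc(price)")` yields `Some("price")` and likewise for `desc_ranking_rule`, which is what the helpers' names suggest):

```rust
#[cfg(test)]
mod convert_ranking_rules_test {
    use super::convert_custom_ranking_rules;
    use serde_json::json;

    #[test]
    fn legacy_rules_are_rewritten() {
        // Assumes the asc/desc helpers extract the field name from `asc(field)` / `desc(field)`.
        let mut rules = json!(["words", "typo", "asc(price)", "desc(release_date)"]);
        convert_custom_ranking_rules(&mut rules);
        assert_eq!(rules, json!(["words", "typo", "price:asc", "release_date:desc"]));
    }
}
```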

View File

@@ -1,52 +0,0 @@
use meilisearch_error::{Code, ErrorCode};
use crate::index_controller::update_actor::error::UpdateActorError;
use crate::index_controller::uuid_resolver::error::UuidResolverError;
pub type Result<T> = std::result::Result<T, DumpActorError>;
#[derive(thiserror::Error, Debug)]
pub enum DumpActorError {
#[error("Another dump is already in progress")]
DumpAlreadyRunning,
#[error("Dump `{0}` not found")]
DumpDoesNotExist(String),
#[error("Internal error: {0}")]
Internal(Box<dyn std::error::Error + Send + Sync + 'static>),
#[error("{0}")]
UuidResolver(#[from] UuidResolverError),
#[error("{0}")]
UpdateActor(#[from] UpdateActorError),
}
macro_rules! internal_error {
($($other:path), *) => {
$(
impl From<$other> for DumpActorError {
fn from(other: $other) -> Self {
Self::Internal(Box::new(other))
}
}
)*
}
}
internal_error!(
heed::Error,
std::io::Error,
tokio::task::JoinError,
serde_json::error::Error,
tempfile::PersistError
);
impl ErrorCode for DumpActorError {
fn error_code(&self) -> Code {
match self {
DumpActorError::DumpAlreadyRunning => Code::DumpAlreadyInProgress,
DumpActorError::DumpDoesNotExist(_) => Code::NotFound,
DumpActorError::Internal(_) => Code::Internal,
DumpActorError::UuidResolver(e) => e.error_code(),
DumpActorError::UpdateActor(e) => e.error_code(),
}
}
}
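
For readers unfamiliar with the pattern, the `internal_error!` invocation above is shorthand for one `From` impl per listed type. Roughly, for `heed::Error` it expands to the following (shown only as an illustration; the macro generates it automatically):

```rust
impl From<heed::Error> for DumpActorError {
    fn from(other: heed::Error) -> Self {
        Self::Internal(Box::new(other))
    }
}
```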

View File

@@ -1,53 +0,0 @@
use std::path::Path;
use actix_web::web::Bytes;
use tokio::sync::{mpsc, oneshot};
use super::error::Result;
use super::{DumpActor, DumpActorHandle, DumpInfo, DumpMsg};
#[derive(Clone)]
pub struct DumpActorHandleImpl {
sender: mpsc::Sender<DumpMsg>,
}
#[async_trait::async_trait]
impl DumpActorHandle for DumpActorHandleImpl {
async fn create_dump(&self) -> Result<DumpInfo> {
let (ret, receiver) = oneshot::channel();
let msg = DumpMsg::CreateDump { ret };
let _ = self.sender.send(msg).await;
receiver.await.expect("IndexActor has been killed")
}
async fn dump_info(&self, uid: String) -> Result<DumpInfo> {
let (ret, receiver) = oneshot::channel();
let msg = DumpMsg::DumpInfo { ret, uid };
let _ = self.sender.send(msg).await;
receiver.await.expect("IndexActor has been killed")
}
}
impl DumpActorHandleImpl {
pub fn new(
path: impl AsRef<Path>,
uuid_resolver: crate::index_controller::uuid_resolver::UuidResolverHandleImpl,
update: crate::index_controller::update_actor::UpdateActorHandleImpl<Bytes>,
index_db_size: usize,
update_db_size: usize,
) -> anyhow::Result<Self> {
let (sender, receiver) = mpsc::channel(10);
let actor = DumpActor::new(
receiver,
uuid_resolver,
update,
path,
index_db_size,
update_db_size,
);
tokio::task::spawn(actor.run());
Ok(Self { sender })
}
}

View File

@@ -1,2 +0,0 @@
pub mod v1;
pub mod v2;

View File

@@ -1,228 +0,0 @@
use std::collections::{BTreeMap, BTreeSet};
use std::fs::{create_dir_all, File};
use std::io::BufRead;
use std::marker::PhantomData;
use std::path::Path;
use std::sync::Arc;
use heed::EnvOpenOptions;
use log::{error, info, warn};
use milli::update::{IndexDocumentsMethod, Setting, UpdateFormat};
use serde::{Deserialize, Deserializer, Serialize};
use uuid::Uuid;
use crate::index_controller::{self, uuid_resolver::HeedUuidStore, IndexMetadata};
use crate::index_controller::{asc_ranking_rule, desc_ranking_rule};
use crate::{
index::{update_handler::UpdateHandler, Index, Unchecked},
option::IndexerOpts,
};
#[derive(Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct MetadataV1 {
db_version: String,
indexes: Vec<IndexMetadata>,
}
impl MetadataV1 {
pub fn load_dump(
self,
src: impl AsRef<Path>,
dst: impl AsRef<Path>,
size: usize,
indexer_options: &IndexerOpts,
) -> anyhow::Result<()> {
info!(
"Loading dump, dump database version: {}, dump version: V1",
self.db_version
);
let uuid_store = HeedUuidStore::new(&dst)?;
for index in self.indexes {
let uuid = Uuid::new_v4();
uuid_store.insert(index.uid.clone(), uuid)?;
let src = src.as_ref().join(index.uid);
load_index(
&src,
&dst,
uuid,
index.meta.primary_key.as_deref(),
size,
indexer_options,
)?;
}
Ok(())
}
}
pub fn deserialize_some<'de, T, D>(deserializer: D) -> std::result::Result<Option<T>, D::Error>
where
T: Deserialize<'de>,
D: Deserializer<'de>,
{
Deserialize::deserialize(deserializer).map(Some)
}
// These are the settings used in legacy meilisearch (<v0.21.0).
#[derive(Default, Clone, Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
struct Settings {
#[serde(default, deserialize_with = "deserialize_some")]
pub ranking_rules: Option<Option<Vec<String>>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub distinct_attribute: Option<Option<String>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub searchable_attributes: Option<Option<Vec<String>>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub displayed_attributes: Option<Option<BTreeSet<String>>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub stop_words: Option<Option<BTreeSet<String>>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub synonyms: Option<Option<BTreeMap<String, Vec<String>>>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub attributes_for_faceting: Option<Option<Vec<String>>>,
}
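
The `deserialize_some` helper combined with `Option<Option<T>>` fields is what lets the legacy settings distinguish "field omitted" from "field explicitly set to null". A self-contained sketch of the pattern (standalone names mirroring the struct above; it only assumes `serde` and `serde_json`):

```rust
use serde::{Deserialize, Deserializer};

fn deserialize_some<'de, T, D>(deserializer: D) -> Result<Option<T>, D::Error>
where
    T: Deserialize<'de>,
    D: Deserializer<'de>,
{
    Deserialize::deserialize(deserializer).map(Some)
}

#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct LegacySettings {
    #[serde(default, deserialize_with = "deserialize_some")]
    ranking_rules: Option<Option<Vec<String>>>,
}

fn main() {
    // Omitted field: the serde default applies, the setting is left untouched (NotSet).
    let absent: LegacySettings = serde_json::from_str("{}").unwrap();
    assert_eq!(absent.ranking_rules, None);

    // Explicit null: deserialize_some wraps it, the setting is reset (Reset).
    let reset: LegacySettings = serde_json::from_str(r#"{"rankingRules":null}"#).unwrap();
    assert_eq!(reset.ranking_rules, Some(None));

    // Concrete value: the setting is replaced (Set).
    let set: LegacySettings = serde_json::from_str(r#"{"rankingRules":["words"]}"#).unwrap();
    assert_eq!(set.ranking_rules, Some(Some(vec!["words".to_string()])));
}
```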
fn load_index(
src: impl AsRef<Path>,
dst: impl AsRef<Path>,
uuid: Uuid,
primary_key: Option<&str>,
size: usize,
indexer_options: &IndexerOpts,
) -> anyhow::Result<()> {
let index_path = dst.as_ref().join(&format!("indexes/index-{}", uuid));
create_dir_all(&index_path)?;
let mut options = EnvOpenOptions::new();
options.map_size(size);
let index = milli::Index::new(options, index_path)?;
let index = Index(Arc::new(index));
// extract `settings.json` file and import content
let settings = import_settings(&src)?;
let settings: index_controller::Settings<Unchecked> = settings.into();
let mut txn = index.write_txn()?;
let handler = UpdateHandler::new(indexer_options)?;
index.update_settings_txn(&mut txn, &settings.check(), handler.update_builder(0))?;
let file = File::open(&src.as_ref().join("documents.jsonl"))?;
let mut reader = std::io::BufReader::new(file);
reader.fill_buf()?;
if !reader.buffer().is_empty() {
index.update_documents_txn(
&mut txn,
UpdateFormat::JsonStream,
IndexDocumentsMethod::ReplaceDocuments,
Some(reader),
handler.update_builder(0),
primary_key,
)?;
}
txn.commit()?;
// Finally, we extract the original milli::Index and close it
Arc::try_unwrap(index.0)
.map_err(|_e| "Couldn't close the index properly")
.unwrap()
.prepare_for_closing()
.wait();
// Updates are ignored in dumps V1.
Ok(())
}
/// We need to **always** be able to convert the old settings to the settings currently being used.
impl From<Settings> for index_controller::Settings<Unchecked> {
fn from(settings: Settings) -> Self {
Self {
distinct_attribute: match settings.distinct_attribute {
Some(Some(attr)) => Setting::Set(attr),
Some(None) => Setting::Reset,
None => Setting::NotSet
},
// we need to convert the old `Vec<String>` into a `BTreeSet<String>`
displayed_attributes: match settings.displayed_attributes {
Some(Some(attrs)) => Setting::Set(attrs.into_iter().collect()),
Some(None) => Setting::Reset,
None => Setting::NotSet
},
searchable_attributes: match settings.searchable_attributes {
Some(Some(attrs)) => Setting::Set(attrs),
Some(None) => Setting::Reset,
None => Setting::NotSet
},
filterable_attributes: match settings.attributes_for_faceting {
Some(Some(attrs)) => Setting::Set(attrs.into_iter().collect()),
Some(None) => Setting::Reset,
None => Setting::NotSet
},
sortable_attributes: Setting::NotSet,
ranking_rules: match settings.ranking_rules {
Some(Some(ranking_rules)) => Setting::Set(ranking_rules.into_iter().filter_map(|criterion| {
match criterion.as_str() {
"words" | "typo" | "proximity" | "attribute" | "exactness" => Some(criterion),
s if s.starts_with("asc") => asc_ranking_rule(s).map(|f| format!("{}:asc", f)),
s if s.starts_with("desc") => desc_ranking_rule(s).map(|f| format!("{}:desc", f)),
"wordsPosition" => {
warn!("The criteria `attribute` and `wordsPosition` have been merged \
into a single criterion `attribute` so `wordsPosition` will be \
ignored");
None
}
s => {
error!("Unknown criterion found in the dump: `{}`, it will be ignored", s);
None
}
}
}).collect()),
Some(None) => Setting::Reset,
None => Setting::NotSet
},
// we need to convert the old `Vec<String>` into a `BTreeSet<String>`
stop_words: match settings.stop_words {
Some(Some(stop_words)) => Setting::Set(stop_words.into_iter().collect()),
Some(None) => Setting::Reset,
None => Setting::NotSet
},
// we need to convert the old `Vec<String>` into a `BTreeMap<String>`
synonyms: match settings.synonyms {
Some(Some(synonyms)) => Setting::Set(synonyms.into_iter().collect()),
Some(None) => Setting::Reset,
None => Setting::NotSet
},
_kind: PhantomData,
}
}
}
/// Extract Settings from `settings.json` file present at provided `dir_path`
fn import_settings(dir_path: impl AsRef<Path>) -> anyhow::Result<Settings> {
let path = dir_path.as_ref().join("settings.json");
let file = File::open(path)?;
let reader = std::io::BufReader::new(file);
let metadata = serde_json::from_reader(reader)?;
Ok(metadata)
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn settings_format_regression() {
let settings = Settings::default();
assert_eq!(
r##"{"rankingRules":null,"distinctAttribute":null,"searchableAttributes":null,"displayedAttributes":null,"stopWords":null,"synonyms":null,"attributesForFaceting":null}"##,
serde_json::to_string(&settings).unwrap()
);
}
}

View File

@@ -1,59 +0,0 @@
use std::path::Path;
use chrono::{DateTime, Utc};
use log::info;
use serde::{Deserialize, Serialize};
use crate::index::Index;
use crate::index_controller::{update_actor::UpdateStore, uuid_resolver::HeedUuidStore};
use crate::option::IndexerOpts;
#[derive(Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct MetadataV2 {
db_version: String,
index_db_size: usize,
update_db_size: usize,
dump_date: DateTime<Utc>,
}
impl MetadataV2 {
pub fn new(index_db_size: usize, update_db_size: usize) -> Self {
Self {
db_version: env!("CARGO_PKG_VERSION").to_string(),
index_db_size,
update_db_size,
dump_date: Utc::now(),
}
}
pub fn load_dump(
self,
src: impl AsRef<Path>,
dst: impl AsRef<Path>,
index_db_size: usize,
update_db_size: usize,
indexing_options: &IndexerOpts,
) -> anyhow::Result<()> {
info!(
"Loading dump from {}, dump database version: {}, dump version: V2",
self.dump_date, self.db_version
);
info!("Loading index database.");
HeedUuidStore::load_dump(src.as_ref(), &dst)?;
info!("Loading updates.");
UpdateStore::load_dump(&src, &dst, update_db_size)?;
info!("Loading indexes.");
let indexes_path = src.as_ref().join("indexes");
let indexes = indexes_path.read_dir()?;
for index in indexes {
let index = index?;
Index::load_dump(&index.path(), &dst, index_db_size, indexing_options)?;
}
Ok(())
}
}

View File

@@ -1,203 +0,0 @@
use std::fs::File;
use std::path::{Path, PathBuf};
use anyhow::Context;
use chrono::{DateTime, Utc};
use log::{info, trace, warn};
#[cfg(test)]
use mockall::automock;
use serde::{Deserialize, Serialize};
use tokio::fs::create_dir_all;
use loaders::v1::MetadataV1;
use loaders::v2::MetadataV2;
pub use actor::DumpActor;
pub use handle_impl::*;
pub use message::DumpMsg;
use super::{update_actor::UpdateActorHandle, uuid_resolver::UuidResolverHandle};
use crate::index_controller::dump_actor::error::DumpActorError;
use crate::{helpers::compression, option::IndexerOpts};
use error::Result;
mod actor;
pub mod error;
mod handle_impl;
mod loaders;
mod message;
const META_FILE_NAME: &str = "metadata.json";
#[async_trait::async_trait]
#[cfg_attr(test, automock)]
pub trait DumpActorHandle {
/// Start the creation of a dump
/// Implementation: [handle_impl::DumpActorHandleImpl::create_dump]
async fn create_dump(&self) -> Result<DumpInfo>;
/// Return the status of an already created dump
/// Implementation: [handle_impl::DumpActorHandleImpl::dump_info]
async fn dump_info(&self, uid: String) -> Result<DumpInfo>;
}
#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "dumpVersion")]
pub enum Metadata {
V1(MetadataV1),
V2(MetadataV2),
}
impl Metadata {
pub fn new_v2(index_db_size: usize, update_db_size: usize) -> Self {
let meta = MetadataV2::new(index_db_size, update_db_size);
Self::V2(meta)
}
}
#[derive(Debug, Serialize, Deserialize, PartialEq, Clone)]
#[serde(rename_all = "snake_case")]
pub enum DumpStatus {
Done,
InProgress,
Failed,
}
#[derive(Debug, Serialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct DumpInfo {
pub uid: String,
pub status: DumpStatus,
#[serde(skip_serializing_if = "Option::is_none")]
pub error: Option<String>,
started_at: DateTime<Utc>,
#[serde(skip_serializing_if = "Option::is_none")]
finished_at: Option<DateTime<Utc>>,
}
impl DumpInfo {
pub fn new(uid: String, status: DumpStatus) -> Self {
Self {
uid,
status,
error: None,
started_at: Utc::now(),
finished_at: None,
}
}
pub fn with_error(&mut self, error: String) {
self.status = DumpStatus::Failed;
self.finished_at = Some(Utc::now());
self.error = Some(error);
}
pub fn done(&mut self) {
self.finished_at = Some(Utc::now());
self.status = DumpStatus::Done;
}
pub fn dump_already_in_progress(&self) -> bool {
self.status == DumpStatus::InProgress
}
}
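
As the methods above suggest, a `DumpInfo` goes through a small state machine: created as `InProgress`, then finished via either `done()` or `with_error(..)`. A short test sketch of that lifecycle (the uid string is a placeholder; real uids are generated by the dump route):

```rust
#[cfg(test)]
mod dump_info_lifecycle {
    use super::{DumpInfo, DumpStatus};

    #[test]
    fn lifecycle() {
        // Placeholder uid, only for illustration.
        let mut info = DumpInfo::new("dump-uid-placeholder".to_string(), DumpStatus::InProgress);
        assert!(info.dump_already_in_progress());

        // On success the actor calls `done()`...
        info.done();
        assert_eq!(info.status, DumpStatus::Done);

        // ...and on failure it would call `with_error(msg)`, flipping the status to Failed
        // and recording the error message.
    }
}
```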
pub fn load_dump(
dst_path: impl AsRef<Path>,
src_path: impl AsRef<Path>,
index_db_size: usize,
update_db_size: usize,
indexer_opts: &IndexerOpts,
) -> anyhow::Result<()> {
let tmp_src = tempfile::tempdir_in(".")?;
let tmp_src_path = tmp_src.path();
compression::from_tar_gz(&src_path, tmp_src_path)?;
let meta_path = tmp_src_path.join(META_FILE_NAME);
let mut meta_file = File::open(&meta_path)?;
let meta: Metadata = serde_json::from_reader(&mut meta_file)?;
let dst_dir = dst_path
.as_ref()
.parent()
.with_context(|| format!("Invalid db path: {}", dst_path.as_ref().display()))?;
let tmp_dst = tempfile::tempdir_in(dst_dir)?;
match meta {
Metadata::V1(meta) => {
meta.load_dump(&tmp_src_path, tmp_dst.path(), index_db_size, indexer_opts)?
}
Metadata::V2(meta) => meta.load_dump(
&tmp_src_path,
tmp_dst.path(),
index_db_size,
update_db_size,
indexer_opts,
)?,
}
// Persist and atomically rename the db
let persisted_dump = tmp_dst.into_path();
if dst_path.as_ref().exists() {
warn!("Overwriting database at {}", dst_path.as_ref().display());
std::fs::remove_dir_all(&dst_path)?;
}
std::fs::rename(&persisted_dump, &dst_path)?;
Ok(())
}
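
The tail of `load_dump` above follows a common "build in a sibling temp dir, then rename into place" pattern, so a half-loaded dump never ends up at the final db path. A stripped-down sketch of just that pattern (paths and the elided restore step are placeholders; it assumes only `tempfile`, `anyhow`, and the standard library):

```rust
use std::fs;
use std::path::Path;

fn install_atomically(dst_path: &Path) -> anyhow::Result<()> {
    // Create the temp dir next to the destination so the final rename stays on one filesystem.
    let dst_dir = dst_path.parent().expect("destination must have a parent");
    let tmp_dst = tempfile::tempdir_in(dst_dir)?;

    // ... unpack and load the dump into `tmp_dst.path()` here ...

    // Keep the directory when the TempDir guard is consumed, then swap it into place.
    let persisted = tmp_dst.into_path();
    if dst_path.exists() {
        fs::remove_dir_all(dst_path)?;
    }
    fs::rename(persisted, dst_path)?;
    Ok(())
}
```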
struct DumpTask<U, P> {
path: PathBuf,
uuid_resolver: U,
update_handle: P,
uid: String,
update_db_size: usize,
index_db_size: usize,
}
impl<U, P> DumpTask<U, P>
where
U: UuidResolverHandle + Send + Sync + Clone + 'static,
P: UpdateActorHandle + Send + Sync + Clone + 'static,
{
async fn run(self) -> Result<()> {
trace!("Performing dump.");
create_dir_all(&self.path).await?;
let path_clone = self.path.clone();
let temp_dump_dir =
tokio::task::spawn_blocking(|| tempfile::TempDir::new_in(path_clone)).await??;
let temp_dump_path = temp_dump_dir.path().to_owned();
let meta = Metadata::new_v2(self.index_db_size, self.update_db_size);
let meta_path = temp_dump_path.join(META_FILE_NAME);
let mut meta_file = File::create(&meta_path)?;
serde_json::to_writer(&mut meta_file, &meta)?;
let uuids = self.uuid_resolver.dump(temp_dump_path.clone()).await?;
self.update_handle
.dump(uuids, temp_dump_path.clone())
.await?;
let dump_path = tokio::task::spawn_blocking(move || -> Result<PathBuf> {
let temp_dump_file = tempfile::NamedTempFile::new_in(&self.path)?;
compression::to_tar_gz(temp_dump_path, temp_dump_file.path())
.map_err(|e| DumpActorError::Internal(e.into()))?;
let dump_path = self.path.join(self.uid).with_extension("dump");
temp_dump_file.persist(&dump_path)?;
Ok(dump_path)
})
.await??;
info!("Created dump in {:?}.", dump_path);
Ok(())
}
}

View File

@@ -1,40 +0,0 @@
use meilisearch_error::Code;
use meilisearch_error::ErrorCode;
use crate::index::error::IndexError;
use super::dump_actor::error::DumpActorError;
use super::index_actor::error::IndexActorError;
use super::update_actor::error::UpdateActorError;
use super::uuid_resolver::error::UuidResolverError;
pub type Result<T> = std::result::Result<T, IndexControllerError>;
#[derive(Debug, thiserror::Error)]
pub enum IndexControllerError {
#[error("Index creation must have an uid")]
MissingUid,
#[error("{0}")]
Uuid(#[from] UuidResolverError),
#[error("{0}")]
IndexActor(#[from] IndexActorError),
#[error("{0}")]
UpdateActor(#[from] UpdateActorError),
#[error("{0}")]
DumpActor(#[from] DumpActorError),
#[error("{0}")]
IndexError(#[from] IndexError),
}
impl ErrorCode for IndexControllerError {
fn error_code(&self) -> Code {
match self {
IndexControllerError::MissingUid => Code::BadRequest,
IndexControllerError::Uuid(e) => e.error_code(),
IndexControllerError::IndexActor(e) => e.error_code(),
IndexControllerError::UpdateActor(e) => e.error_code(),
IndexControllerError::DumpActor(e) => e.error_code(),
IndexControllerError::IndexError(e) => e.error_code(),
}
}
}

View File

@@ -1,351 +0,0 @@
use std::fs::File;
use std::path::PathBuf;
use std::sync::Arc;
use async_stream::stream;
use futures::stream::StreamExt;
use heed::CompactionOption;
use log::debug;
use milli::update::UpdateBuilder;
use tokio::task::spawn_blocking;
use tokio::{fs, sync::mpsc};
use uuid::Uuid;
use crate::index::{
update_handler::UpdateHandler, Checked, Document, SearchQuery, SearchResult, Settings,
};
use crate::index_controller::{
get_arc_ownership_blocking, Failed, IndexStats, Processed, Processing,
};
use crate::option::IndexerOpts;
use super::error::{IndexActorError, Result};
use super::{IndexMeta, IndexMsg, IndexSettings, IndexStore};
pub const CONCURRENT_INDEX_MSG: usize = 10;
pub struct IndexActor<S> {
receiver: Option<mpsc::Receiver<IndexMsg>>,
update_handler: Arc<UpdateHandler>,
store: S,
}
impl<S: IndexStore + Sync + Send> IndexActor<S> {
pub fn new(
receiver: mpsc::Receiver<IndexMsg>,
store: S,
options: &IndexerOpts,
) -> anyhow::Result<Self> {
let update_handler = UpdateHandler::new(options)?;
let update_handler = Arc::new(update_handler);
let receiver = Some(receiver);
Ok(Self {
receiver,
update_handler,
store,
})
}
/// `run` polls the message receiver and processes up to `CONCURRENT_INDEX_MSG` messages concurrently.
pub async fn run(mut self) {
let mut receiver = self
.receiver
.take()
.expect("Index Actor must have a inbox at this point.");
let stream = stream! {
loop {
match receiver.recv().await {
Some(msg) => yield msg,
None => break,
}
}
};
stream
.for_each_concurrent(Some(CONCURRENT_INDEX_MSG), |msg| self.handle_message(msg))
.await;
}
async fn handle_message(&self, msg: IndexMsg) {
use IndexMsg::*;
match msg {
CreateIndex {
uuid,
primary_key,
ret,
} => {
let _ = ret.send(self.handle_create_index(uuid, primary_key).await);
}
Update {
ret,
meta,
data,
uuid,
} => {
let _ = ret.send(self.handle_update(uuid, meta, data).await);
}
Search { ret, query, uuid } => {
let _ = ret.send(self.handle_search(uuid, query).await);
}
Settings { ret, uuid } => {
let _ = ret.send(self.handle_settings(uuid).await);
}
Documents {
ret,
uuid,
attributes_to_retrieve,
offset,
limit,
} => {
let _ = ret.send(
self.handle_fetch_documents(uuid, offset, limit, attributes_to_retrieve)
.await,
);
}
Document {
uuid,
attributes_to_retrieve,
doc_id,
ret,
} => {
let _ = ret.send(
self.handle_fetch_document(uuid, doc_id, attributes_to_retrieve)
.await,
);
}
Delete { uuid, ret } => {
let _ = ret.send(self.handle_delete(uuid).await);
}
GetMeta { uuid, ret } => {
let _ = ret.send(self.handle_get_meta(uuid).await);
}
UpdateIndex {
uuid,
index_settings,
ret,
} => {
let _ = ret.send(self.handle_update_index(uuid, index_settings).await);
}
Snapshot { uuid, path, ret } => {
let _ = ret.send(self.handle_snapshot(uuid, path).await);
}
Dump { uuid, path, ret } => {
let _ = ret.send(self.handle_dump(uuid, path).await);
}
GetStats { uuid, ret } => {
let _ = ret.send(self.handle_get_stats(uuid).await);
}
}
}
async fn handle_search(&self, uuid: Uuid, query: SearchQuery) -> Result<SearchResult> {
let index = self
.store
.get(uuid)
.await?
.ok_or(IndexActorError::UnexistingIndex)?;
let result = spawn_blocking(move || index.perform_search(query)).await??;
Ok(result)
}
async fn handle_create_index(
&self,
uuid: Uuid,
primary_key: Option<String>,
) -> Result<IndexMeta> {
let index = self.store.create(uuid, primary_key).await?;
let meta = spawn_blocking(move || IndexMeta::new(&index)).await??;
Ok(meta)
}
async fn handle_update(
&self,
uuid: Uuid,
meta: Processing,
data: Option<File>,
) -> Result<std::result::Result<Processed, Failed>> {
debug!("Processing update {}", meta.id());
let update_handler = self.update_handler.clone();
let index = match self.store.get(uuid).await? {
Some(index) => index,
None => self.store.create(uuid, None).await?,
};
Ok(spawn_blocking(move || update_handler.handle_update(meta, data, index)).await?)
}
async fn handle_settings(&self, uuid: Uuid) -> Result<Settings<Checked>> {
let index = self
.store
.get(uuid)
.await?
.ok_or(IndexActorError::UnexistingIndex)?;
let result = spawn_blocking(move || index.settings()).await??;
Ok(result)
}
async fn handle_fetch_documents(
&self,
uuid: Uuid,
offset: usize,
limit: usize,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Vec<Document>> {
let index = self
.store
.get(uuid)
.await?
.ok_or(IndexActorError::UnexistingIndex)?;
let result =
spawn_blocking(move || index.retrieve_documents(offset, limit, attributes_to_retrieve))
.await??;
Ok(result)
}
async fn handle_fetch_document(
&self,
uuid: Uuid,
doc_id: String,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Document> {
let index = self
.store
.get(uuid)
.await?
.ok_or(IndexActorError::UnexistingIndex)?;
let result =
spawn_blocking(move || index.retrieve_document(doc_id, attributes_to_retrieve))
.await??;
Ok(result)
}
async fn handle_delete(&self, uuid: Uuid) -> Result<()> {
let index = self.store.delete(uuid).await?;
if let Some(index) = index {
tokio::task::spawn(async move {
let index = index.0;
let store = get_arc_ownership_blocking(index).await;
spawn_blocking(move || {
store.prepare_for_closing().wait();
debug!("Index closed");
});
});
}
Ok(())
}
async fn handle_get_meta(&self, uuid: Uuid) -> Result<IndexMeta> {
match self.store.get(uuid).await? {
Some(index) => {
let meta = spawn_blocking(move || IndexMeta::new(&index)).await??;
Ok(meta)
}
None => Err(IndexActorError::UnexistingIndex),
}
}
async fn handle_update_index(
&self,
uuid: Uuid,
index_settings: IndexSettings,
) -> Result<IndexMeta> {
let index = self
.store
.get(uuid)
.await?
.ok_or(IndexActorError::UnexistingIndex)?;
let result = spawn_blocking(move || match index_settings.primary_key {
Some(primary_key) => {
let mut txn = index.write_txn()?;
if index.primary_key(&txn)?.is_some() {
return Err(IndexActorError::ExistingPrimaryKey);
}
let mut builder = UpdateBuilder::new(0).settings(&mut txn, &index);
builder.set_primary_key(primary_key);
builder.execute(|_, _| ())?;
let meta = IndexMeta::new_txn(&index, &txn)?;
txn.commit()?;
Ok(meta)
}
None => {
let meta = IndexMeta::new(&index)?;
Ok(meta)
}
})
.await??;
Ok(result)
}
async fn handle_snapshot(&self, uuid: Uuid, mut path: PathBuf) -> Result<()> {
use tokio::fs::create_dir_all;
path.push("indexes");
create_dir_all(&path).await?;
if let Some(index) = self.store.get(uuid).await? {
let mut index_path = path.join(format!("index-{}", uuid));
create_dir_all(&index_path).await?;
index_path.push("data.mdb");
spawn_blocking(move || -> Result<()> {
// Get a write txn to wait for any ongoing write transaction to finish before taking the snapshot.
let _txn = index.write_txn()?;
index
.env
.copy_to_path(index_path, CompactionOption::Enabled)?;
Ok(())
})
.await??;
}
Ok(())
}
/// Create a `documents.jsonl` and a `meta.json` in `path/indexes/index-{uuid}/` with a dump of all the
/// documents and all the settings.
async fn handle_dump(&self, uuid: Uuid, path: PathBuf) -> Result<()> {
let index = self
.store
.get(uuid)
.await?
.ok_or(IndexActorError::UnexistingIndex)?;
let path = path.join(format!("indexes/index-{}/", uuid));
fs::create_dir_all(&path).await?;
tokio::task::spawn_blocking(move || index.dump(path)).await??;
Ok(())
}
async fn handle_get_stats(&self, uuid: Uuid) -> Result<IndexStats> {
let index = self
.store
.get(uuid)
.await?
.ok_or(IndexActorError::UnexistingIndex)?;
spawn_blocking(move || {
let rtxn = index.read_txn()?;
Ok(IndexStats {
size: index.size(),
number_of_documents: index.number_of_documents(&rtxn)?,
is_indexing: None,
field_distribution: index.field_distribution(&rtxn)?,
})
})
.await?
}
}

View File

@@ -1,48 +0,0 @@
use meilisearch_error::{Code, ErrorCode};
use crate::{error::MilliError, index::error::IndexError};
pub type Result<T> = std::result::Result<T, IndexActorError>;
#[derive(thiserror::Error, Debug)]
pub enum IndexActorError {
#[error("{0}")]
IndexError(#[from] IndexError),
#[error("Index already exists")]
IndexAlreadyExists,
#[error("Index not found")]
UnexistingIndex,
#[error("A primary key is already present. It's impossible to update it")]
ExistingPrimaryKey,
#[error("Internal Error: {0}")]
Internal(Box<dyn std::error::Error + Send + Sync + 'static>),
#[error("{0}")]
Milli(#[from] milli::Error),
}
macro_rules! internal_error {
($($other:path), *) => {
$(
impl From<$other> for IndexActorError {
fn from(other: $other) -> Self {
Self::Internal(Box::new(other))
}
}
)*
}
}
internal_error!(heed::Error, tokio::task::JoinError, std::io::Error);
impl ErrorCode for IndexActorError {
fn error_code(&self) -> Code {
match self {
IndexActorError::IndexError(e) => e.error_code(),
IndexActorError::IndexAlreadyExists => Code::IndexAlreadyExists,
IndexActorError::UnexistingIndex => Code::IndexNotFound,
IndexActorError::ExistingPrimaryKey => Code::PrimaryKeyAlreadyPresent,
IndexActorError::Internal(_) => Code::Internal,
IndexActorError::Milli(e) => MilliError(e).error_code(),
}
}
}

View File

@@ -1,164 +0,0 @@
use crate::option::IndexerOpts;
use std::path::{Path, PathBuf};
use tokio::sync::{mpsc, oneshot};
use uuid::Uuid;
use crate::{
index::Checked,
index_controller::{IndexSettings, IndexStats, Processing},
};
use crate::{
index::{Document, SearchQuery, SearchResult, Settings},
index_controller::{Failed, Processed},
};
use super::error::Result;
use super::{IndexActor, IndexActorHandle, IndexMeta, IndexMsg, MapIndexStore};
#[derive(Clone)]
pub struct IndexActorHandleImpl {
sender: mpsc::Sender<IndexMsg>,
}
#[async_trait::async_trait]
impl IndexActorHandle for IndexActorHandleImpl {
async fn create_index(&self, uuid: Uuid, primary_key: Option<String>) -> Result<IndexMeta> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::CreateIndex {
ret,
uuid,
primary_key,
};
let _ = self.sender.send(msg).await;
receiver.await.expect("IndexActor has been killed")
}
async fn update(
&self,
uuid: Uuid,
meta: Processing,
data: Option<std::fs::File>,
) -> Result<std::result::Result<Processed, Failed>> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Update {
ret,
meta,
data,
uuid,
};
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn search(&self, uuid: Uuid, query: SearchQuery) -> Result<SearchResult> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Search { uuid, query, ret };
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn settings(&self, uuid: Uuid) -> Result<Settings<Checked>> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Settings { uuid, ret };
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn documents(
&self,
uuid: Uuid,
offset: usize,
limit: usize,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Vec<Document>> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Documents {
uuid,
ret,
offset,
attributes_to_retrieve,
limit,
};
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn document(
&self,
uuid: Uuid,
doc_id: String,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Document> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Document {
uuid,
ret,
doc_id,
attributes_to_retrieve,
};
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn delete(&self, uuid: Uuid) -> Result<()> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Delete { uuid, ret };
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn get_index_meta(&self, uuid: Uuid) -> Result<IndexMeta> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::GetMeta { uuid, ret };
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn update_index(&self, uuid: Uuid, index_settings: IndexSettings) -> Result<IndexMeta> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::UpdateIndex {
uuid,
index_settings,
ret,
};
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn snapshot(&self, uuid: Uuid, path: PathBuf) -> Result<()> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Snapshot { uuid, path, ret };
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn dump(&self, uuid: Uuid, path: PathBuf) -> Result<()> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Dump { uuid, path, ret };
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn get_index_stats(&self, uuid: Uuid) -> Result<IndexStats> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::GetStats { uuid, ret };
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
}
impl IndexActorHandleImpl {
pub fn new(
path: impl AsRef<Path>,
index_size: usize,
options: &IndexerOpts,
) -> anyhow::Result<Self> {
let (sender, receiver) = mpsc::channel(100);
let store = MapIndexStore::new(path, index_size);
let actor = IndexActor::new(receiver, store, options)?;
tokio::task::spawn(actor.run());
Ok(Self { sender })
}
}

View File

@@ -1,74 +0,0 @@
use std::path::PathBuf;
use tokio::sync::oneshot;
use uuid::Uuid;
use super::error::Result as IndexResult;
use crate::index::{Checked, Document, SearchQuery, SearchResult, Settings};
use crate::index_controller::{Failed, IndexStats, Processed, Processing};
use super::{IndexMeta, IndexSettings};
#[allow(clippy::large_enum_variant)]
pub enum IndexMsg {
CreateIndex {
uuid: Uuid,
primary_key: Option<String>,
ret: oneshot::Sender<IndexResult<IndexMeta>>,
},
Update {
uuid: Uuid,
meta: Processing,
data: Option<std::fs::File>,
ret: oneshot::Sender<IndexResult<Result<Processed, Failed>>>,
},
Search {
uuid: Uuid,
query: SearchQuery,
ret: oneshot::Sender<IndexResult<SearchResult>>,
},
Settings {
uuid: Uuid,
ret: oneshot::Sender<IndexResult<Settings<Checked>>>,
},
Documents {
uuid: Uuid,
attributes_to_retrieve: Option<Vec<String>>,
offset: usize,
limit: usize,
ret: oneshot::Sender<IndexResult<Vec<Document>>>,
},
Document {
uuid: Uuid,
attributes_to_retrieve: Option<Vec<String>>,
doc_id: String,
ret: oneshot::Sender<IndexResult<Document>>,
},
Delete {
uuid: Uuid,
ret: oneshot::Sender<IndexResult<()>>,
},
GetMeta {
uuid: Uuid,
ret: oneshot::Sender<IndexResult<IndexMeta>>,
},
UpdateIndex {
uuid: Uuid,
index_settings: IndexSettings,
ret: oneshot::Sender<IndexResult<IndexMeta>>,
},
Snapshot {
uuid: Uuid,
path: PathBuf,
ret: oneshot::Sender<IndexResult<()>>,
},
Dump {
uuid: Uuid,
path: PathBuf,
ret: oneshot::Sender<IndexResult<()>>,
},
GetStats {
uuid: Uuid,
ret: oneshot::Sender<IndexResult<IndexStats>>,
},
}

View File

@@ -1,169 +0,0 @@
use std::fs::File;
use std::path::PathBuf;
use chrono::{DateTime, Utc};
#[cfg(test)]
use mockall::automock;
use serde::{Deserialize, Serialize};
use uuid::Uuid;
use actor::IndexActor;
pub use actor::CONCURRENT_INDEX_MSG;
pub use handle_impl::IndexActorHandleImpl;
use message::IndexMsg;
use store::{IndexStore, MapIndexStore};
use crate::index::{Checked, Document, Index, SearchQuery, SearchResult, Settings};
use crate::index_controller::{Failed, IndexStats, Processed, Processing};
use error::Result;
use super::IndexSettings;
mod actor;
pub mod error;
mod handle_impl;
mod message;
mod store;
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct IndexMeta {
created_at: DateTime<Utc>,
pub updated_at: DateTime<Utc>,
pub primary_key: Option<String>,
}
impl IndexMeta {
fn new(index: &Index) -> Result<Self> {
let txn = index.read_txn()?;
Self::new_txn(index, &txn)
}
fn new_txn(index: &Index, txn: &heed::RoTxn) -> Result<Self> {
let created_at = index.created_at(txn)?;
let updated_at = index.updated_at(txn)?;
let primary_key = index.primary_key(txn)?.map(String::from);
Ok(Self {
created_at,
updated_at,
primary_key,
})
}
}
#[async_trait::async_trait]
#[cfg_attr(test, automock)]
pub trait IndexActorHandle {
async fn create_index(&self, uuid: Uuid, primary_key: Option<String>) -> Result<IndexMeta>;
async fn update(
&self,
uuid: Uuid,
meta: Processing,
data: Option<File>,
) -> Result<std::result::Result<Processed, Failed>>;
async fn search(&self, uuid: Uuid, query: SearchQuery) -> Result<SearchResult>;
async fn settings(&self, uuid: Uuid) -> Result<Settings<Checked>>;
async fn documents(
&self,
uuid: Uuid,
offset: usize,
limit: usize,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Vec<Document>>;
async fn document(
&self,
uuid: Uuid,
doc_id: String,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Document>;
async fn delete(&self, uuid: Uuid) -> Result<()>;
async fn get_index_meta(&self, uuid: Uuid) -> Result<IndexMeta>;
async fn update_index(&self, uuid: Uuid, index_settings: IndexSettings) -> Result<IndexMeta>;
async fn snapshot(&self, uuid: Uuid, path: PathBuf) -> Result<()>;
async fn dump(&self, uuid: Uuid, path: PathBuf) -> Result<()>;
async fn get_index_stats(&self, uuid: Uuid) -> Result<IndexStats>;
}
#[cfg(test)]
mod test {
use std::sync::Arc;
use super::*;
#[async_trait::async_trait]
/// Useful for passing around an `Arc<MockIndexActorHandle>` in tests.
impl IndexActorHandle for Arc<MockIndexActorHandle> {
async fn create_index(&self, uuid: Uuid, primary_key: Option<String>) -> Result<IndexMeta> {
self.as_ref().create_index(uuid, primary_key).await
}
async fn update(
&self,
uuid: Uuid,
meta: Processing,
data: Option<std::fs::File>,
) -> Result<std::result::Result<Processed, Failed>> {
self.as_ref().update(uuid, meta, data).await
}
async fn search(&self, uuid: Uuid, query: SearchQuery) -> Result<SearchResult> {
self.as_ref().search(uuid, query).await
}
async fn settings(&self, uuid: Uuid) -> Result<Settings<Checked>> {
self.as_ref().settings(uuid).await
}
async fn documents(
&self,
uuid: Uuid,
offset: usize,
limit: usize,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Vec<Document>> {
self.as_ref()
.documents(uuid, offset, limit, attributes_to_retrieve)
.await
}
async fn document(
&self,
uuid: Uuid,
doc_id: String,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Document> {
self.as_ref()
.document(uuid, doc_id, attributes_to_retrieve)
.await
}
async fn delete(&self, uuid: Uuid) -> Result<()> {
self.as_ref().delete(uuid).await
}
async fn get_index_meta(&self, uuid: Uuid) -> Result<IndexMeta> {
self.as_ref().get_index_meta(uuid).await
}
async fn update_index(
&self,
uuid: Uuid,
index_settings: IndexSettings,
) -> Result<IndexMeta> {
self.as_ref().update_index(uuid, index_settings).await
}
async fn snapshot(&self, uuid: Uuid, path: PathBuf) -> Result<()> {
self.as_ref().snapshot(uuid, path).await
}
async fn dump(&self, uuid: Uuid, path: PathBuf) -> Result<()> {
self.as_ref().dump(uuid, path).await
}
async fn get_index_stats(&self, uuid: Uuid) -> Result<IndexStats> {
self.as_ref().get_index_stats(uuid).await
}
}
}

View File

@@ -1,456 +0,0 @@
use std::collections::BTreeMap;
use std::path::Path;
use std::sync::Arc;
use std::time::Duration;
use actix_web::web::Bytes;
use chrono::{DateTime, Utc};
use futures::stream::StreamExt;
use log::error;
use log::info;
use milli::FieldDistribution;
use serde::{Deserialize, Serialize};
use tokio::sync::mpsc;
use tokio::time::sleep;
use uuid::Uuid;
use dump_actor::DumpActorHandle;
pub use dump_actor::{DumpInfo, DumpStatus};
use index_actor::IndexActorHandle;
use snapshot::{load_snapshot, SnapshotService};
use update_actor::UpdateActorHandle;
pub use updates::*;
use uuid_resolver::{error::UuidResolverError, UuidResolverHandle};
use crate::extractors::payload::Payload;
use crate::index::{Checked, Document, SearchQuery, SearchResult, Settings};
use crate::option::Opt;
use error::Result;
use self::dump_actor::load_dump;
use self::error::IndexControllerError;
mod dump_actor;
pub mod error;
pub mod index_actor;
mod snapshot;
mod update_actor;
mod updates;
mod uuid_resolver;
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct IndexMetadata {
#[serde(skip)]
pub uuid: Uuid,
pub uid: String,
name: String,
#[serde(flatten)]
pub meta: index_actor::IndexMeta,
}
#[derive(Clone, Debug)]
pub struct IndexSettings {
pub uid: Option<String>,
pub primary_key: Option<String>,
}
#[derive(Serialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct IndexStats {
#[serde(skip)]
pub size: u64,
pub number_of_documents: u64,
/// Whether the current index is performing an update. It is initially `None` when the
/// index returns it, since it is the `UpdateStore` that knows which index is currently indexing. It is
/// later set to either `true` or `false` once we retrieve the information from the `UpdateStore`.
pub is_indexing: Option<bool>,
pub field_distribution: FieldDistribution,
}
#[derive(Clone)]
pub struct IndexController {
uuid_resolver: uuid_resolver::UuidResolverHandleImpl,
index_handle: index_actor::IndexActorHandleImpl,
update_handle: update_actor::UpdateActorHandleImpl<Bytes>,
dump_handle: dump_actor::DumpActorHandleImpl,
}
#[derive(Serialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct Stats {
pub database_size: u64,
pub last_update: Option<DateTime<Utc>>,
pub indexes: BTreeMap<String, IndexStats>,
}
impl IndexController {
pub fn new(path: impl AsRef<Path>, options: &Opt) -> anyhow::Result<Self> {
let index_size = options.max_index_size.get_bytes() as usize;
let update_store_size = options.max_udb_size.get_bytes() as usize;
if let Some(ref path) = options.import_snapshot {
info!("Loading from snapshot {:?}", path);
load_snapshot(
&options.db_path,
path,
options.ignore_snapshot_if_db_exists,
options.ignore_missing_snapshot,
)?;
} else if let Some(ref src_path) = options.import_dump {
load_dump(
&options.db_path,
src_path,
options.max_index_size.get_bytes() as usize,
options.max_udb_size.get_bytes() as usize,
&options.indexer_options,
)?;
}
std::fs::create_dir_all(&path)?;
let uuid_resolver = uuid_resolver::UuidResolverHandleImpl::new(&path)?;
let index_handle =
index_actor::IndexActorHandleImpl::new(&path, index_size, &options.indexer_options)?;
let update_handle = update_actor::UpdateActorHandleImpl::new(
index_handle.clone(),
&path,
update_store_size,
)?;
let dump_handle = dump_actor::DumpActorHandleImpl::new(
&options.dumps_dir,
uuid_resolver.clone(),
update_handle.clone(),
options.max_index_size.get_bytes() as usize,
options.max_udb_size.get_bytes() as usize,
)?;
if options.schedule_snapshot {
let snapshot_service = SnapshotService::new(
uuid_resolver.clone(),
update_handle.clone(),
Duration::from_secs(options.snapshot_interval_sec),
options.snapshot_dir.clone(),
options
.db_path
.file_name()
.map(|n| n.to_owned().into_string().expect("invalid path"))
.unwrap_or_else(|| String::from("data.ms")),
);
tokio::task::spawn(snapshot_service.run());
}
Ok(Self {
uuid_resolver,
index_handle,
update_handle,
dump_handle,
})
}
pub async fn add_documents(
&self,
uid: String,
method: milli::update::IndexDocumentsMethod,
format: milli::update::UpdateFormat,
payload: Payload,
primary_key: Option<String>,
) -> Result<UpdateStatus> {
let perform_update = |uuid| async move {
let meta = UpdateMeta::DocumentsAddition {
method,
format,
primary_key,
};
let (sender, receiver) = mpsc::channel(10);
// It is necessary to spawn a local task to send the payload to the update handle to
// prevent deadlocking between the update_handle::update that waits for the update to be
// registered and the update_actor that waits for the payload to be sent to it.
tokio::task::spawn_local(async move {
payload
.for_each(|r| async {
let _ = sender.send(r).await;
})
.await
});
// This must be done *AFTER* spawning the task.
self.update_handle.update(meta, receiver, uuid).await
};
match self.uuid_resolver.get(uid).await {
Ok(uuid) => Ok(perform_update(uuid).await?),
Err(UuidResolverError::UnexistingIndex(name)) => {
let uuid = Uuid::new_v4();
let status = perform_update(uuid).await?;
// ignore if index creation fails now, since it may already have been created
let _ = self.index_handle.create_index(uuid, None).await;
self.uuid_resolver.insert(name, uuid).await?;
Ok(status)
}
Err(e) => Err(e.into()),
}
}
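
The comment inside `add_documents` explains why the payload-forwarding task must be spawned before awaiting `update_handle.update`: both sides communicate over a bounded channel, so spawning the producer first prevents a deadlock. A generic tokio sketch of that pattern (it uses `spawn` on a `Send` stream of numbers instead of the non-`Send` payload and `spawn_local` used above):

```rust
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    let (sender, mut receiver) = mpsc::channel::<u32>(10);

    // Spawn the producer first; with a bounded channel it pauses once 10 items are
    // buffered and resumes as the consumer drains them.
    tokio::spawn(async move {
        for chunk in 0..100u32 {
            let _ = sender.send(chunk).await;
        }
    });

    // Only then await the consumer side, mirroring "This must be done *AFTER* spawning the task."
    let mut total = 0u32;
    while let Some(chunk) = receiver.recv().await {
        total += chunk;
    }
    assert_eq!(total, (0..100u32).sum());
}
```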
pub async fn clear_documents(&self, uid: String) -> Result<UpdateStatus> {
let uuid = self.uuid_resolver.get(uid).await?;
let meta = UpdateMeta::ClearDocuments;
let (_, receiver) = mpsc::channel(1);
let status = self.update_handle.update(meta, receiver, uuid).await?;
Ok(status)
}
pub async fn delete_documents(
&self,
uid: String,
documents: Vec<String>,
) -> Result<UpdateStatus> {
let uuid = self.uuid_resolver.get(uid).await?;
let meta = UpdateMeta::DeleteDocuments { ids: documents };
let (_, receiver) = mpsc::channel(1);
let status = self.update_handle.update(meta, receiver, uuid).await?;
Ok(status)
}
pub async fn update_settings(
&self,
uid: String,
settings: Settings<Checked>,
create: bool,
) -> Result<UpdateStatus> {
let perform_update = |uuid| async move {
let meta = UpdateMeta::Settings(settings.into_unchecked());
// Nothing to send; drop the sender right away so as not to block the update actor.
let (_, receiver) = mpsc::channel(1);
self.update_handle.update(meta, receiver, uuid).await
};
match self.uuid_resolver.get(uid).await {
Ok(uuid) => Ok(perform_update(uuid).await?),
Err(UuidResolverError::UnexistingIndex(name)) if create => {
let uuid = Uuid::new_v4();
let status = perform_update(uuid).await?;
// ignore if index creation fails now, since it may already have been created
let _ = self.index_handle.create_index(uuid, None).await;
self.uuid_resolver.insert(name, uuid).await?;
Ok(status)
}
Err(e) => Err(e.into()),
}
}
pub async fn create_index(&self, index_settings: IndexSettings) -> Result<IndexMetadata> {
let IndexSettings { uid, primary_key } = index_settings;
let uid = uid.ok_or(IndexControllerError::MissingUid)?;
let uuid = Uuid::new_v4();
let meta = self.index_handle.create_index(uuid, primary_key).await?;
self.uuid_resolver.insert(uid.clone(), uuid).await?;
let meta = IndexMetadata {
uuid,
name: uid.clone(),
uid,
meta,
};
Ok(meta)
}
pub async fn delete_index(&self, uid: String) -> Result<()> {
let uuid = self.uuid_resolver.delete(uid).await?;
// We remove the index from the resolver synchronously, and effectively perform the index
// deletion as a background task.
let update_handle = self.update_handle.clone();
let index_handle = self.index_handle.clone();
tokio::spawn(async move {
if let Err(e) = update_handle.delete(uuid).await {
error!("Error while deleting index: {}", e);
}
if let Err(e) = index_handle.delete(uuid).await {
error!("Error while deleting index: {}", e);
}
});
Ok(())
}
pub async fn update_status(&self, uid: String, id: u64) -> Result<UpdateStatus> {
let uuid = self.uuid_resolver.get(uid).await?;
let result = self.update_handle.update_status(uuid, id).await?;
Ok(result)
}
pub async fn all_update_status(&self, uid: String) -> Result<Vec<UpdateStatus>> {
let uuid = self.uuid_resolver.get(uid).await?;
let result = self.update_handle.get_all_updates_status(uuid).await?;
Ok(result)
}
pub async fn list_indexes(&self) -> Result<Vec<IndexMetadata>> {
let uuids = self.uuid_resolver.list().await?;
let mut ret = Vec::new();
for (uid, uuid) in uuids {
let meta = self.index_handle.get_index_meta(uuid).await?;
let meta = IndexMetadata {
uuid,
name: uid.clone(),
uid,
meta,
};
ret.push(meta);
}
Ok(ret)
}
pub async fn settings(&self, uid: String) -> Result<Settings<Checked>> {
let uuid = self.uuid_resolver.get(uid.clone()).await?;
let settings = self.index_handle.settings(uuid).await?;
Ok(settings)
}
pub async fn documents(
&self,
uid: String,
offset: usize,
limit: usize,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Vec<Document>> {
let uuid = self.uuid_resolver.get(uid.clone()).await?;
let documents = self
.index_handle
.documents(uuid, offset, limit, attributes_to_retrieve)
.await?;
Ok(documents)
}
pub async fn document(
&self,
uid: String,
doc_id: String,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Document> {
let uuid = self.uuid_resolver.get(uid.clone()).await?;
let document = self
.index_handle
.document(uuid, doc_id, attributes_to_retrieve)
.await?;
Ok(document)
}
pub async fn update_index(
&self,
uid: String,
mut index_settings: IndexSettings,
) -> Result<IndexMetadata> {
if index_settings.uid.is_some() {
index_settings.uid.take();
}
let uuid = self.uuid_resolver.get(uid.clone()).await?;
let meta = self.index_handle.update_index(uuid, index_settings).await?;
let meta = IndexMetadata {
uuid,
name: uid.clone(),
uid,
meta,
};
Ok(meta)
}
pub async fn search(&self, uid: String, query: SearchQuery) -> Result<SearchResult> {
let uuid = self.uuid_resolver.get(uid).await?;
let result = self.index_handle.search(uuid, query).await?;
Ok(result)
}
pub async fn get_index(&self, uid: String) -> Result<IndexMetadata> {
let uuid = self.uuid_resolver.get(uid.clone()).await?;
let meta = self.index_handle.get_index_meta(uuid).await?;
let meta = IndexMetadata {
uuid,
name: uid.clone(),
uid,
meta,
};
Ok(meta)
}
pub async fn get_uuids_size(&self) -> Result<u64> {
Ok(self.uuid_resolver.get_size().await?)
}
pub async fn get_index_stats(&self, uid: String) -> Result<IndexStats> {
let uuid = self.uuid_resolver.get(uid).await?;
let update_infos = self.update_handle.get_info().await?;
let mut stats = self.index_handle.get_index_stats(uuid).await?;
// Check if the currently indexing update is from our index.
stats.is_indexing = Some(Some(uuid) == update_infos.processing);
Ok(stats)
}
pub async fn get_all_stats(&self) -> Result<Stats> {
let update_infos = self.update_handle.get_info().await?;
let mut database_size = self.get_uuids_size().await? + update_infos.size;
let mut last_update: Option<DateTime<_>> = None;
let mut indexes = BTreeMap::new();
for index in self.list_indexes().await? {
let mut index_stats = self.index_handle.get_index_stats(index.uuid).await?;
database_size += index_stats.size;
last_update = last_update.map_or(Some(index.meta.updated_at), |last| {
Some(last.max(index.meta.updated_at))
});
index_stats.is_indexing = Some(Some(index.uuid) == update_infos.processing);
indexes.insert(index.uid, index_stats);
}
Ok(Stats {
database_size,
last_update,
indexes,
})
}
pub async fn create_dump(&self) -> Result<DumpInfo> {
Ok(self.dump_handle.create_dump().await?)
}
pub async fn dump_info(&self, uid: String) -> Result<DumpInfo> {
Ok(self.dump_handle.dump_info(uid).await?)
}
}
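/// Waits until `item` is the last strong reference to the `Arc`, then returns the inner value.
/// Polls `Arc::try_unwrap` every 100 ms until all other holders have dropped their clones.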
pub async fn get_arc_ownership_blocking<T>(mut item: Arc<T>) -> T {
loop {
match Arc::try_unwrap(item) {
Ok(item) => return item,
Err(item_arc) => {
item = item_arc;
sleep(Duration::from_millis(100)).await;
continue;
}
}
}
}
/// Parses the v1 version of the Asc ranking rule `asc(price)` and returns the field name.
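/// A hypothetical usage sketch (illustration only, not from the original source):
///
/// ```ignore
/// assert_eq!(asc_ranking_rule("asc(price)"), Some("price"));
/// assert_eq!(asc_ranking_rule("desc(price)"), None);
/// ```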
pub fn asc_ranking_rule(text: &str) -> Option<&str> {
text.split_once("asc(")
.and_then(|(_, tail)| tail.rsplit_once(")"))
.map(|(field, _)| field)
}
/// Parses the v1 version of the Desc ranking rule `desc(price)` and returns the field name.
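/// A hypothetical usage sketch (illustration only, not from the original source):
///
/// ```ignore
/// assert_eq!(desc_ranking_rule("desc(price)"), Some("price"));
/// ```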
pub fn desc_ranking_rule(text: &str) -> Option<&str> {
text.split_once("desc(")
.and_then(|(_, tail)| tail.rsplit_once(")"))
.map(|(field, _)| field)
}

View File

@@ -1,268 +0,0 @@
use std::path::{Path, PathBuf};
use std::time::Duration;
use anyhow::bail;
use log::{error, info, trace};
use tokio::fs;
use tokio::task::spawn_blocking;
use tokio::time::sleep;
use super::update_actor::UpdateActorHandle;
use super::uuid_resolver::UuidResolverHandle;
use crate::helpers::compression;
pub struct SnapshotService<U, R> {
uuid_resolver_handle: R,
update_handle: U,
snapshot_period: Duration,
snapshot_path: PathBuf,
db_name: String,
}
impl<U, R> SnapshotService<U, R>
where
U: UpdateActorHandle,
R: UuidResolverHandle,
{
pub fn new(
uuid_resolver_handle: R,
update_handle: U,
snapshot_period: Duration,
snapshot_path: PathBuf,
db_name: String,
) -> Self {
Self {
uuid_resolver_handle,
update_handle,
snapshot_period,
snapshot_path,
db_name,
}
}
pub async fn run(self) {
info!(
"Snapshot scheduled every {}s.",
self.snapshot_period.as_secs()
);
loop {
if let Err(e) = self.perform_snapshot().await {
error!("Error while performing snapshot: {}", e);
}
sleep(self.snapshot_period).await;
}
}
async fn perform_snapshot(&self) -> anyhow::Result<()> {
trace!("Performing snapshot.");
let snapshot_dir = self.snapshot_path.clone();
fs::create_dir_all(&snapshot_dir).await?;
let temp_snapshot_dir =
spawn_blocking(move || tempfile::tempdir_in(snapshot_dir)).await??;
let temp_snapshot_path = temp_snapshot_dir.path().to_owned();
let uuids = self
.uuid_resolver_handle
.snapshot(temp_snapshot_path.clone())
.await?;
if uuids.is_empty() {
return Ok(());
}
self.update_handle
.snapshot(uuids, temp_snapshot_path.clone())
.await?;
let snapshot_dir = self.snapshot_path.clone();
let snapshot_path = self
.snapshot_path
.join(format!("{}.snapshot", self.db_name));
let snapshot_path = spawn_blocking(move || -> anyhow::Result<PathBuf> {
let temp_snapshot_file = tempfile::NamedTempFile::new_in(snapshot_dir)?;
let temp_snapshot_file_path = temp_snapshot_file.path().to_owned();
compression::to_tar_gz(temp_snapshot_path, temp_snapshot_file_path)?;
temp_snapshot_file.persist(&snapshot_path)?;
Ok(snapshot_path)
})
.await??;
trace!("Created snapshot in {:?}.", snapshot_path);
Ok(())
}
}
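/// Restores a database from `snapshot_path` into `db_path`:
/// - if the database does not exist yet and the snapshot does, the snapshot archive is
///   extracted (the partially created db folder is removed if extraction fails);
/// - if the database already exists, this is an error unless `ignore_snapshot_if_db_exists` is set;
/// - if the snapshot is missing, this is an error unless `ignore_missing_snapshot` is set.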
pub fn load_snapshot(
db_path: impl AsRef<Path>,
snapshot_path: impl AsRef<Path>,
ignore_snapshot_if_db_exists: bool,
ignore_missing_snapshot: bool,
) -> anyhow::Result<()> {
if !db_path.as_ref().exists() && snapshot_path.as_ref().exists() {
match compression::from_tar_gz(snapshot_path, &db_path) {
Ok(()) => Ok(()),
Err(e) => {
// clean created db folder
std::fs::remove_dir_all(&db_path)?;
Err(e)
}
}
} else if db_path.as_ref().exists() && !ignore_snapshot_if_db_exists {
bail!(
"database already exists at {:?}, try to delete it or rename it",
db_path
.as_ref()
.canonicalize()
.unwrap_or_else(|_| db_path.as_ref().to_owned())
)
} else if !snapshot_path.as_ref().exists() && !ignore_missing_snapshot {
bail!(
"snapshot doesn't exist at {:?}",
snapshot_path
.as_ref()
.canonicalize()
.unwrap_or_else(|_| snapshot_path.as_ref().to_owned())
)
} else {
Ok(())
}
}
#[cfg(test)]
mod test {
use std::iter::FromIterator;
use std::{collections::HashSet, sync::Arc};
use futures::future::{err, ok};
use rand::Rng;
use tokio::time::timeout;
use uuid::Uuid;
use super::*;
use crate::index_controller::index_actor::MockIndexActorHandle;
use crate::index_controller::update_actor::{
error::UpdateActorError, MockUpdateActorHandle, UpdateActorHandleImpl,
};
use crate::index_controller::uuid_resolver::{
error::UuidResolverError, MockUuidResolverHandle,
};
#[actix_rt::test]
async fn test_normal() {
let mut rng = rand::thread_rng();
let uuids_num: usize = rng.gen_range(5, 10);
let uuids = (0..uuids_num)
.map(|_| Uuid::new_v4())
.collect::<HashSet<_>>();
let mut uuid_resolver = MockUuidResolverHandle::new();
let uuids_clone = uuids.clone();
uuid_resolver
.expect_snapshot()
.times(1)
.returning(move |_| Box::pin(ok(uuids_clone.clone())));
let uuids_clone = uuids.clone();
let mut index_handle = MockIndexActorHandle::new();
index_handle
.expect_snapshot()
.withf(move |uuid, _path| uuids_clone.contains(uuid))
.times(uuids_num)
.returning(move |_, _| Box::pin(ok(())));
let dir = tempfile::tempdir_in(".").unwrap();
let handle = Arc::new(index_handle);
let update_handle =
UpdateActorHandleImpl::<Vec<u8>>::new(handle.clone(), dir.path(), 4096 * 100).unwrap();
let snapshot_path = tempfile::tempdir_in(".").unwrap();
let snapshot_service = SnapshotService::new(
uuid_resolver,
update_handle,
Duration::from_millis(100),
snapshot_path.path().to_owned(),
"data.ms".to_string(),
);
snapshot_service.perform_snapshot().await.unwrap();
}
#[actix_rt::test]
async fn error_performing_uuid_snapshot() {
let mut uuid_resolver = MockUuidResolverHandle::new();
uuid_resolver
.expect_snapshot()
.times(1)
// arbitrary error
.returning(|_| Box::pin(err(UuidResolverError::NameAlreadyExist)));
let update_handle = MockUpdateActorHandle::new();
let snapshot_path = tempfile::tempdir_in(".").unwrap();
let snapshot_service = SnapshotService::new(
uuid_resolver,
update_handle,
Duration::from_millis(100),
snapshot_path.path().to_owned(),
"data.ms".to_string(),
);
assert!(snapshot_service.perform_snapshot().await.is_err());
// Nothing was written to the file
assert!(!snapshot_path.path().join("data.ms.snapshot").exists());
}
#[actix_rt::test]
async fn error_performing_index_snapshot() {
let uuid = Uuid::new_v4();
let mut uuid_resolver = MockUuidResolverHandle::new();
uuid_resolver
.expect_snapshot()
.times(1)
.returning(move |_| Box::pin(ok(HashSet::from_iter(Some(uuid)))));
let mut update_handle = MockUpdateActorHandle::new();
update_handle
.expect_snapshot()
// arbitrary error
.returning(|_, _| Box::pin(err(UpdateActorError::UnexistingUpdate(0))));
let snapshot_path = tempfile::tempdir_in(".").unwrap();
let snapshot_service = SnapshotService::new(
uuid_resolver,
update_handle,
Duration::from_millis(100),
snapshot_path.path().to_owned(),
"data.ms".to_string(),
);
assert!(snapshot_service.perform_snapshot().await.is_err());
// Nothing was written to the file
assert!(!snapshot_path.path().join("data.ms.snapshot").exists());
}
#[actix_rt::test]
async fn test_loop() {
let mut uuid_resolver = MockUuidResolverHandle::new();
uuid_resolver
.expect_snapshot()
// We expect the function to be called between 2 and 3 times in the given interval.
.times(2..4)
// arbitrary error, to short-circuit the function
.returning(move |_| Box::pin(err(UuidResolverError::NameAlreadyExist)));
let update_handle = MockUpdateActorHandle::new();
let snapshot_path = tempfile::tempdir_in(".").unwrap();
let snapshot_service = SnapshotService::new(
uuid_resolver,
update_handle,
Duration::from_millis(100),
snapshot_path.path().to_owned(),
"data.ms".to_string(),
);
let _ = timeout(Duration::from_millis(300), snapshot_service.run()).await;
}
}

View File

@@ -1,270 +0,0 @@
use std::collections::HashSet;
use std::io::SeekFrom;
use std::path::{Path, PathBuf};
use std::sync::atomic::AtomicBool;
use std::sync::Arc;
use async_stream::stream;
use futures::StreamExt;
use log::trace;
use serdeval::*;
use tokio::fs;
use tokio::io::AsyncWriteExt;
use tokio::sync::mpsc;
use uuid::Uuid;
use super::error::{Result, UpdateActorError};
use super::{PayloadData, UpdateMsg, UpdateStore, UpdateStoreInfo};
use crate::index_controller::index_actor::IndexActorHandle;
use crate::index_controller::{UpdateMeta, UpdateStatus};
pub struct UpdateActor<D, I> {
path: PathBuf,
store: Arc<UpdateStore>,
inbox: Option<mpsc::Receiver<UpdateMsg<D>>>,
index_handle: I,
must_exit: Arc<AtomicBool>,
}
impl<D, I> UpdateActor<D, I>
where
D: AsRef<[u8]> + Sized + 'static,
I: IndexActorHandle + Clone + Send + Sync + 'static,
{
pub fn new(
update_db_size: usize,
inbox: mpsc::Receiver<UpdateMsg<D>>,
path: impl AsRef<Path>,
index_handle: I,
) -> anyhow::Result<Self> {
let path = path.as_ref().join("updates");
std::fs::create_dir_all(&path)?;
let mut options = heed::EnvOpenOptions::new();
options.map_size(update_db_size);
let must_exit = Arc::new(AtomicBool::new(false));
let store = UpdateStore::open(options, &path, index_handle.clone(), must_exit.clone())?;
std::fs::create_dir_all(path.join("update_files"))?;
let inbox = Some(inbox);
Ok(Self {
path,
store,
inbox,
index_handle,
must_exit,
})
}
pub async fn run(mut self) {
use UpdateMsg::*;
trace!("Started update actor.");
let mut inbox = self
.inbox
.take()
.expect("A receiver should be present by now.");
let must_exit = self.must_exit.clone();
let stream = stream! {
loop {
let msg = inbox.recv().await;
if must_exit.load(std::sync::atomic::Ordering::Relaxed) {
break;
}
match msg {
Some(msg) => yield msg,
None => break,
}
}
};
stream
.for_each_concurrent(Some(10), |msg| async {
match msg {
Update {
uuid,
meta,
data,
ret,
} => {
let _ = ret.send(self.handle_update(uuid, meta, data).await);
}
ListUpdates { uuid, ret } => {
let _ = ret.send(self.handle_list_updates(uuid).await);
}
GetUpdate { uuid, ret, id } => {
let _ = ret.send(self.handle_get_update(uuid, id).await);
}
Delete { uuid, ret } => {
let _ = ret.send(self.handle_delete(uuid).await);
}
Snapshot { uuids, path, ret } => {
let _ = ret.send(self.handle_snapshot(uuids, path).await);
}
GetInfo { ret } => {
let _ = ret.send(self.handle_get_info().await);
}
Dump { uuids, path, ret } => {
let _ = ret.send(self.handle_dump(uuids, path).await);
}
}
})
.await;
}
async fn handle_update(
&self,
uuid: Uuid,
meta: UpdateMeta,
payload: mpsc::Receiver<PayloadData<D>>,
) -> Result<UpdateStatus> {
let file_path = match meta {
UpdateMeta::DocumentsAddition { .. } => {
let update_file_id = uuid::Uuid::new_v4();
let path = self
.path
.join(format!("update_files/update_{}", update_file_id));
let mut file = fs::OpenOptions::new()
.read(true)
.write(true)
.create(true)
.open(&path)
.await?;
async fn write_to_file<D>(
file: &mut fs::File,
mut payload: mpsc::Receiver<PayloadData<D>>,
) -> Result<usize>
where
D: AsRef<[u8]> + Sized + 'static,
{
let mut file_len = 0;
while let Some(bytes) = payload.recv().await {
let bytes = bytes?;
file_len += bytes.as_ref().len();
file.write_all(bytes.as_ref()).await?;
}
file.flush().await?;
Ok(file_len)
}
let file_len = write_to_file(&mut file, payload).await;
match file_len {
Ok(len) if len > 0 => {
let file = file.into_std().await;
Some((file, update_file_id))
}
Err(e) => {
fs::remove_file(&path).await?;
return Err(e);
}
_ => {
fs::remove_file(&path).await?;
None
}
}
}
_ => None,
};
let update_store = self.store.clone();
tokio::task::spawn_blocking(move || {
use std::io::{BufReader, Seek};
// If the payload is empty, ignore the check.
let update_uuid = if let Some((mut file, uuid)) = file_path {
// set the file back to the beginning
file.seek(SeekFrom::Start(0))?;
// Check that the json payload is valid:
let reader = BufReader::new(&mut file);
// Validate that the payload is in the correct format.
let _: Seq<Map<Str, Any>> = serde_json::from_reader(reader)
.map_err(|e| UpdateActorError::InvalidPayload(Box::new(e)))?;
Some(uuid)
} else {
None
};
// The payload is valid, we can register it to the update store.
let status = update_store
.register_update(meta, update_uuid, uuid)
.map(UpdateStatus::Enqueued)?;
Ok(status)
})
.await?
}
async fn handle_list_updates(&self, uuid: Uuid) -> Result<Vec<UpdateStatus>> {
let update_store = self.store.clone();
tokio::task::spawn_blocking(move || {
let result = update_store.list(uuid)?;
Ok(result)
})
.await?
}
async fn handle_get_update(&self, uuid: Uuid, id: u64) -> Result<UpdateStatus> {
let store = self.store.clone();
tokio::task::spawn_blocking(move || {
let result = store
.meta(uuid, id)?
.ok_or(UpdateActorError::UnexistingUpdate(id))?;
Ok(result)
})
.await?
}
async fn handle_delete(&self, uuid: Uuid) -> Result<()> {
let store = self.store.clone();
tokio::task::spawn_blocking(move || store.delete_all(uuid)).await??;
Ok(())
}
async fn handle_snapshot(&self, uuids: HashSet<Uuid>, path: PathBuf) -> Result<()> {
let index_handle = self.index_handle.clone();
let update_store = self.store.clone();
tokio::task::spawn_blocking(move || update_store.snapshot(&uuids, &path, index_handle))
.await??;
Ok(())
}
async fn handle_dump(&self, uuids: HashSet<Uuid>, path: PathBuf) -> Result<()> {
let index_handle = self.index_handle.clone();
let update_store = self.store.clone();
tokio::task::spawn_blocking(move || -> Result<()> {
update_store.dump(&uuids, path.to_path_buf(), index_handle)?;
Ok(())
})
.await??;
Ok(())
}
async fn handle_get_info(&self) -> Result<UpdateStoreInfo> {
let update_store = self.store.clone();
let info = tokio::task::spawn_blocking(move || -> Result<UpdateStoreInfo> {
let info = update_store.get_info()?;
Ok(info)
})
.await??;
Ok(info)
}
}

View File

@@ -1,61 +0,0 @@
use std::error::Error;
use meilisearch_error::{Code, ErrorCode};
use crate::index_controller::index_actor::error::IndexActorError;
pub type Result<T> = std::result::Result<T, UpdateActorError>;
#[derive(Debug, thiserror::Error)]
#[allow(clippy::large_enum_variant)]
pub enum UpdateActorError {
#[error("Update {0} not found.")]
UnexistingUpdate(u64),
#[error("Internal error: {0}")]
Internal(Box<dyn Error + Send + Sync + 'static>),
#[error("{0}")]
IndexActor(#[from] IndexActorError),
#[error(
"update store was shut down due to a fatal error, please check your logs for more info."
)]
FatalUpdateStoreError,
#[error("{0}")]
InvalidPayload(Box<dyn Error + Send + Sync + 'static>),
#[error("{0}")]
PayloadError(#[from] actix_web::error::PayloadError),
}
impl<T> From<tokio::sync::mpsc::error::SendError<T>> for UpdateActorError {
fn from(_: tokio::sync::mpsc::error::SendError<T>) -> Self {
Self::FatalUpdateStoreError
}
}
impl From<tokio::sync::oneshot::error::RecvError> for UpdateActorError {
fn from(_: tokio::sync::oneshot::error::RecvError) -> Self {
Self::FatalUpdateStoreError
}
}
internal_error!(
UpdateActorError: heed::Error,
std::io::Error,
serde_json::Error,
tokio::task::JoinError
);
impl ErrorCode for UpdateActorError {
fn error_code(&self) -> Code {
match self {
UpdateActorError::UnexistingUpdate(_) => Code::NotFound,
UpdateActorError::Internal(_) => Code::Internal,
UpdateActorError::IndexActor(e) => e.error_code(),
UpdateActorError::FatalUpdateStoreError => Code::Internal,
UpdateActorError::InvalidPayload(_) => Code::BadRequest,
UpdateActorError::PayloadError(error) => match error {
actix_http::error::PayloadError::Overflow => Code::PayloadTooLarge,
_ => Code::Internal,
},
}
}
}

View File

@@ -1,103 +0,0 @@
use std::collections::HashSet;
use std::path::{Path, PathBuf};
use tokio::sync::{mpsc, oneshot};
use uuid::Uuid;
use crate::index_controller::{IndexActorHandle, UpdateStatus};
use super::error::Result;
use super::{PayloadData, UpdateActor, UpdateActorHandle, UpdateMeta, UpdateMsg, UpdateStoreInfo};
#[derive(Clone)]
pub struct UpdateActorHandleImpl<D> {
sender: mpsc::Sender<UpdateMsg<D>>,
}
impl<D> UpdateActorHandleImpl<D>
where
D: AsRef<[u8]> + Sized + 'static + Sync + Send,
{
pub fn new<I>(
index_handle: I,
path: impl AsRef<Path>,
update_store_size: usize,
) -> anyhow::Result<Self>
where
I: IndexActorHandle + Clone + Send + Sync + 'static,
{
let path = path.as_ref().to_owned();
let (sender, receiver) = mpsc::channel(100);
let actor = UpdateActor::new(update_store_size, receiver, path, index_handle)?;
tokio::task::spawn(actor.run());
Ok(Self { sender })
}
}
#[async_trait::async_trait]
impl<D> UpdateActorHandle for UpdateActorHandleImpl<D>
where
D: AsRef<[u8]> + Sized + 'static + Sync + Send,
{
type Data = D;
async fn get_all_updates_status(&self, uuid: Uuid) -> Result<Vec<UpdateStatus>> {
let (ret, receiver) = oneshot::channel();
let msg = UpdateMsg::ListUpdates { uuid, ret };
self.sender.send(msg).await?;
receiver.await?
}
async fn update_status(&self, uuid: Uuid, id: u64) -> Result<UpdateStatus> {
let (ret, receiver) = oneshot::channel();
let msg = UpdateMsg::GetUpdate { uuid, id, ret };
self.sender.send(msg).await?;
receiver.await?
}
async fn delete(&self, uuid: Uuid) -> Result<()> {
let (ret, receiver) = oneshot::channel();
let msg = UpdateMsg::Delete { uuid, ret };
self.sender.send(msg).await?;
receiver.await?
}
async fn snapshot(&self, uuids: HashSet<Uuid>, path: PathBuf) -> Result<()> {
let (ret, receiver) = oneshot::channel();
let msg = UpdateMsg::Snapshot { uuids, path, ret };
self.sender.send(msg).await?;
receiver.await?
}
async fn dump(&self, uuids: HashSet<Uuid>, path: PathBuf) -> Result<()> {
let (ret, receiver) = oneshot::channel();
let msg = UpdateMsg::Dump { uuids, path, ret };
self.sender.send(msg).await?;
receiver.await?
}
async fn get_info(&self) -> Result<UpdateStoreInfo> {
let (ret, receiver) = oneshot::channel();
let msg = UpdateMsg::GetInfo { ret };
self.sender.send(msg).await?;
receiver.await?
}
async fn update(
&self,
meta: UpdateMeta,
data: mpsc::Receiver<PayloadData<Self::Data>>,
uuid: Uuid,
) -> Result<UpdateStatus> {
let (ret, receiver) = oneshot::channel();
let msg = UpdateMsg::Update {
uuid,
data,
meta,
ret,
};
self.sender.send(msg).await?;
receiver.await?
}
}

View File

@@ -1,43 +0,0 @@
use std::collections::HashSet;
use std::path::PathBuf;
use tokio::sync::{mpsc, oneshot};
use uuid::Uuid;
use super::error::Result;
use super::{PayloadData, UpdateMeta, UpdateStatus, UpdateStoreInfo};
pub enum UpdateMsg<D> {
Update {
uuid: Uuid,
meta: UpdateMeta,
data: mpsc::Receiver<PayloadData<D>>,
ret: oneshot::Sender<Result<UpdateStatus>>,
},
ListUpdates {
uuid: Uuid,
ret: oneshot::Sender<Result<Vec<UpdateStatus>>>,
},
GetUpdate {
uuid: Uuid,
ret: oneshot::Sender<Result<UpdateStatus>>,
id: u64,
},
Delete {
uuid: Uuid,
ret: oneshot::Sender<Result<()>>,
},
Snapshot {
uuids: HashSet<Uuid>,
path: PathBuf,
ret: oneshot::Sender<Result<()>>,
},
Dump {
uuids: HashSet<Uuid>,
path: PathBuf,
ret: oneshot::Sender<Result<()>>,
},
GetInfo {
ret: oneshot::Sender<Result<UpdateStoreInfo>>,
},
}

View File

@@ -1,44 +0,0 @@
use std::{collections::HashSet, path::PathBuf};
use actix_http::error::PayloadError;
use tokio::sync::mpsc;
use uuid::Uuid;
use crate::index_controller::{UpdateMeta, UpdateStatus};
use actor::UpdateActor;
use error::Result;
use message::UpdateMsg;
pub use handle_impl::UpdateActorHandleImpl;
pub use store::{UpdateStore, UpdateStoreInfo};
mod actor;
pub mod error;
mod handle_impl;
mod message;
pub mod store;
type PayloadData<D> = std::result::Result<D, PayloadError>;
#[cfg(test)]
use mockall::automock;
#[async_trait::async_trait]
#[cfg_attr(test, automock(type Data=Vec<u8>;))]
pub trait UpdateActorHandle {
type Data: AsRef<[u8]> + Sized + 'static + Sync + Send;
async fn get_all_updates_status(&self, uuid: Uuid) -> Result<Vec<UpdateStatus>>;
async fn update_status(&self, uuid: Uuid, id: u64) -> Result<UpdateStatus>;
async fn delete(&self, uuid: Uuid) -> Result<()>;
async fn snapshot(&self, uuid: HashSet<Uuid>, path: PathBuf) -> Result<()>;
async fn dump(&self, uuids: HashSet<Uuid>, path: PathBuf) -> Result<()>;
async fn get_info(&self) -> Result<UpdateStoreInfo>;
async fn update(
&self,
meta: UpdateMeta,
data: mpsc::Receiver<PayloadData<Self::Data>>,
uuid: Uuid,
) -> Result<UpdateStatus>;
}

View File

@@ -1,86 +0,0 @@
use std::{borrow::Cow, convert::TryInto, mem::size_of};
use heed::{BytesDecode, BytesEncode};
use uuid::Uuid;
pub struct NextIdCodec;
pub enum NextIdKey {
Global,
Index(Uuid),
}
impl<'a> BytesEncode<'a> for NextIdCodec {
type EItem = NextIdKey;
fn bytes_encode(item: &'a Self::EItem) -> Option<Cow<'a, [u8]>> {
match item {
NextIdKey::Global => Some(Cow::Borrowed(b"__global__")),
NextIdKey::Index(ref uuid) => Some(Cow::Borrowed(uuid.as_bytes())),
}
}
}
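// Pending-queue keys encode (global_id, index_uuid, update_id) with the integers in
// big-endian form, so that entries sort by global arrival order in the LMDB database.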
pub struct PendingKeyCodec;
impl<'a> BytesEncode<'a> for PendingKeyCodec {
type EItem = (u64, Uuid, u64);
fn bytes_encode((global_id, uuid, update_id): &'a Self::EItem) -> Option<Cow<'a, [u8]>> {
let mut bytes = Vec::with_capacity(size_of::<Self::EItem>());
bytes.extend_from_slice(&global_id.to_be_bytes());
bytes.extend_from_slice(uuid.as_bytes());
bytes.extend_from_slice(&update_id.to_be_bytes());
Some(Cow::Owned(bytes))
}
}
impl<'a> BytesDecode<'a> for PendingKeyCodec {
type DItem = (u64, Uuid, u64);
fn bytes_decode(bytes: &'a [u8]) -> Option<Self::DItem> {
let global_id_bytes = bytes.get(0..size_of::<u64>())?.try_into().ok()?;
let global_id = u64::from_be_bytes(global_id_bytes);
let uuid_bytes = bytes
.get(size_of::<u64>()..(size_of::<u64>() + size_of::<Uuid>()))?
.try_into()
.ok()?;
let uuid = Uuid::from_bytes(uuid_bytes);
let update_id_bytes = bytes
.get((size_of::<u64>() + size_of::<Uuid>())..)?
.try_into()
.ok()?;
let update_id = u64::from_be_bytes(update_id_bytes);
Some((global_id, uuid, update_id))
}
}
pub struct UpdateKeyCodec;
impl<'a> BytesEncode<'a> for UpdateKeyCodec {
type EItem = (Uuid, u64);
fn bytes_encode((uuid, update_id): &'a Self::EItem) -> Option<Cow<'a, [u8]>> {
let mut bytes = Vec::with_capacity(size_of::<Self::EItem>());
bytes.extend_from_slice(uuid.as_bytes());
bytes.extend_from_slice(&update_id.to_be_bytes());
Some(Cow::Owned(bytes))
}
}
impl<'a> BytesDecode<'a> for UpdateKeyCodec {
type DItem = (Uuid, u64);
fn bytes_decode(bytes: &'a [u8]) -> Option<Self::DItem> {
let uuid_bytes = bytes.get(0..size_of::<Uuid>())?.try_into().ok()?;
let uuid = Uuid::from_bytes(uuid_bytes);
let update_id_bytes = bytes.get(size_of::<Uuid>()..)?.try_into().ok()?;
let update_id = u64::from_be_bytes(update_id_bytes);
Some((uuid, update_id))
}
}

View File

@@ -1,184 +0,0 @@
use std::{
collections::HashSet,
fs::{create_dir_all, File},
io::{BufRead, BufReader, Write},
path::{Path, PathBuf},
};
use heed::{EnvOpenOptions, RoTxn};
use serde::{Deserialize, Serialize};
use uuid::Uuid;
use super::{Result, State, UpdateStore};
use crate::index_controller::{
index_actor::IndexActorHandle, update_actor::store::update_uuid_to_file_path, Enqueued,
UpdateStatus,
};
#[derive(Serialize, Deserialize)]
struct UpdateEntry {
uuid: Uuid,
update: UpdateStatus,
}
impl UpdateStore {
pub fn dump(
&self,
uuids: &HashSet<Uuid>,
path: PathBuf,
handle: impl IndexActorHandle,
) -> Result<()> {
let state_lock = self.state.write();
state_lock.swap(State::Dumping);
// txn must *always* be acquired after the state lock, or it will deadlock.
let txn = self.env.write_txn()?;
let dump_path = path.join("updates");
create_dir_all(&dump_path)?;
self.dump_updates(&txn, uuids, &dump_path)?;
let fut = dump_indexes(uuids, handle, &path);
tokio::runtime::Handle::current().block_on(fut)?;
state_lock.swap(State::Idle);
Ok(())
}
fn dump_updates(
&self,
txn: &RoTxn,
uuids: &HashSet<Uuid>,
path: impl AsRef<Path>,
) -> Result<()> {
let dump_data_path = path.as_ref().join("data.jsonl");
let mut dump_data_file = File::create(dump_data_path)?;
let update_files_path = path.as_ref().join(super::UPDATE_DIR);
create_dir_all(&update_files_path)?;
self.dump_pending(txn, uuids, &mut dump_data_file, &path)?;
self.dump_completed(txn, uuids, &mut dump_data_file)?;
Ok(())
}
fn dump_pending(
&self,
txn: &RoTxn,
uuids: &HashSet<Uuid>,
mut file: &mut File,
dst_path: impl AsRef<Path>,
) -> Result<()> {
let pendings = self.pending_queue.iter(txn)?.lazily_decode_data();
for pending in pendings {
let ((_, uuid, _), data) = pending?;
if uuids.contains(&uuid) {
let update = data.decode()?;
if let Some(ref update_uuid) = update.content {
let src = super::update_uuid_to_file_path(&self.path, *update_uuid);
let dst = super::update_uuid_to_file_path(&dst_path, *update_uuid);
std::fs::copy(src, dst)?;
}
let update_json = UpdateEntry {
uuid,
update: update.into(),
};
serde_json::to_writer(&mut file, &update_json)?;
file.write_all(b"\n")?;
}
}
Ok(())
}
fn dump_completed(
&self,
txn: &RoTxn,
uuids: &HashSet<Uuid>,
mut file: &mut File,
) -> Result<()> {
let updates = self.updates.iter(txn)?.lazily_decode_data();
for update in updates {
let ((uuid, _), data) = update?;
if uuids.contains(&uuid) {
let update = data.decode()?;
let update_json = UpdateEntry { uuid, update };
serde_json::to_writer(&mut file, &update_json)?;
file.write_all(b"\n")?;
}
}
Ok(())
}
pub fn load_dump(
src: impl AsRef<Path>,
dst: impl AsRef<Path>,
db_size: usize,
) -> anyhow::Result<()> {
let dst_update_path = dst.as_ref().join("updates/");
create_dir_all(&dst_update_path)?;
let mut options = EnvOpenOptions::new();
options.map_size(db_size as usize);
let (store, _) = UpdateStore::new(options, &dst_update_path)?;
let src_update_path = src.as_ref().join("updates");
let update_data = File::open(&src_update_path.join("data.jsonl"))?;
let mut update_data = BufReader::new(update_data);
std::fs::create_dir_all(dst_update_path.join("update_files/"))?;
let mut wtxn = store.env.write_txn()?;
let mut line = String::new();
loop {
match update_data.read_line(&mut line) {
Ok(0) => break,
Ok(_) => {
let UpdateEntry { uuid, update } = serde_json::from_str(&line)?;
store.register_raw_updates(&mut wtxn, &update, uuid)?;
// Copy the associated update file if it exists
if let UpdateStatus::Enqueued(Enqueued {
content: Some(uuid),
..
}) = update
{
let src = update_uuid_to_file_path(&src_update_path, uuid);
let dst = update_uuid_to_file_path(&dst_update_path, uuid);
std::fs::copy(src, dst)?;
}
}
_ => break,
}
line.clear();
}
wtxn.commit()?;
Ok(())
}
}
async fn dump_indexes(
uuids: &HashSet<Uuid>,
handle: impl IndexActorHandle,
path: impl AsRef<Path>,
) -> Result<()> {
for uuid in uuids {
handle.dump(*uuid, path.as_ref().to_owned()).await?;
}
Ok(())
}

View File

@@ -1,729 +0,0 @@
mod codec;
pub mod dump;
use std::fs::{copy, create_dir_all, remove_file, File};
use std::path::Path;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::{
collections::{BTreeMap, HashSet},
path::PathBuf,
time::Duration,
};
use arc_swap::ArcSwap;
use futures::StreamExt;
use heed::types::{ByteSlice, OwnedType, SerdeJson};
use heed::zerocopy::U64;
use heed::{CompactionOption, Database, Env, EnvOpenOptions};
use log::error;
use parking_lot::{Mutex, MutexGuard};
use tokio::runtime::Handle;
use tokio::sync::mpsc;
use tokio::sync::mpsc::error::TrySendError;
use tokio::time::timeout;
use uuid::Uuid;
use codec::*;
use super::error::Result;
use super::UpdateMeta;
use crate::helpers::EnvSizer;
use crate::index_controller::{index_actor::CONCURRENT_INDEX_MSG, updates::*, IndexActorHandle};
#[allow(clippy::upper_case_acronyms)]
type BEU64 = U64<heed::byteorder::BE>;
const UPDATE_DIR: &str = "update_files";
pub struct UpdateStoreInfo {
/// Size of the update store in bytes.
pub size: u64,
/// Uuid of the index whose update is currently being processed, if any.
pub processing: Option<Uuid>,
}
/// A data structure that allows concurrent reads AND exactly one writer.
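/// Readers call `read()` and get the current `State` without blocking; `write()` takes the
/// internal mutex so at most one writer can `swap` the state at a time.
///
/// A hypothetical usage sketch (illustration only, not from the original source):
///
/// ```ignore
/// let state = StateLock::from_state(State::Idle);
/// let guard = state.write();          // exclusive writer
/// guard.swap(State::Dumping);         // readers now observe `State::Dumping`
/// assert!(matches!(*state.read(), State::Dumping));
/// ```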
pub struct StateLock {
lock: Mutex<()>,
data: ArcSwap<State>,
}
pub struct StateLockGuard<'a> {
_lock: MutexGuard<'a, ()>,
state: &'a StateLock,
}
impl StateLockGuard<'_> {
pub fn swap(&self, state: State) -> Arc<State> {
self.state.data.swap(Arc::new(state))
}
}
impl StateLock {
fn from_state(state: State) -> Self {
let lock = Mutex::new(());
let data = ArcSwap::from(Arc::new(state));
Self { lock, data }
}
pub fn read(&self) -> Arc<State> {
self.data.load().clone()
}
pub fn write(&self) -> StateLockGuard {
let _lock = self.lock.lock();
let state = &self;
StateLockGuard { _lock, state }
}
}
#[allow(clippy::large_enum_variant)]
pub enum State {
Idle,
Processing(Uuid, Processing),
Snapshoting,
Dumping,
}
#[derive(Clone)]
pub struct UpdateStore {
pub env: Env,
/// A queue containing the updates to process, ordered by arrival.
/// The keys are built as follows:
/// | global_update_id | index_uuid | update_id |
/// | 8-bytes | 16-bytes | 8-bytes |
pending_queue: Database<PendingKeyCodec, SerdeJson<Enqueued>>,
/// Maps each index to its next available update id. If `NextIdKey::Global` is queried, the
/// next global update id is returned instead.
next_update_id: Database<NextIdCodec, OwnedType<BEU64>>,
/// Contains the metadata of all performed updates, whether they failed, were aborted, or were processed.
/// The keys are built as follows:
/// | Uuid | id |
/// | 16-bytes | 8-bytes |
updates: Database<UpdateKeyCodec, SerdeJson<UpdateStatus>>,
/// Indicates the current state of the update store.
state: Arc<StateLock>,
/// Wake up the loop when a new event occurs.
notification_sender: mpsc::Sender<()>,
path: PathBuf,
}
impl UpdateStore {
fn new(
mut options: EnvOpenOptions,
path: impl AsRef<Path>,
) -> anyhow::Result<(Self, mpsc::Receiver<()>)> {
options.max_dbs(5);
let env = options.open(&path)?;
let pending_queue = env.create_database(Some("pending-queue"))?;
let next_update_id = env.create_database(Some("next-update-id"))?;
let updates = env.create_database(Some("updates"))?;
let state = Arc::new(StateLock::from_state(State::Idle));
let (notification_sender, notification_receiver) = mpsc::channel(1);
Ok((
Self {
env,
pending_queue,
next_update_id,
updates,
state,
notification_sender,
path: path.as_ref().to_owned(),
},
notification_receiver,
))
}
pub fn open(
options: EnvOpenOptions,
path: impl AsRef<Path>,
index_handle: impl IndexActorHandle + Clone + Sync + Send + 'static,
must_exit: Arc<AtomicBool>,
) -> anyhow::Result<Arc<Self>> {
let (update_store, mut notification_receiver) = Self::new(options, path)?;
let update_store = Arc::new(update_store);
// Send a first notification to trigger the process.
if let Err(TrySendError::Closed(())) = update_store.notification_sender.try_send(()) {
panic!("Failed to init update store");
}
// We need a weak reference so we can take ownership of the arc later when we
// want to close the index.
let duration = Duration::from_secs(10 * 60); // 10 minutes
let update_store_weak = Arc::downgrade(&update_store);
tokio::task::spawn(async move {
// Block and wait for something to process, with a timeout. If the timeout elapses we
// simply loop again; we only exit the loop once the notification channel is closed.
'outer: while timeout(duration, notification_receiver.recv())
.await
.map_or(true, |o| o.is_some())
{
loop {
match update_store_weak.upgrade() {
Some(update_store) => {
let handler = index_handle.clone();
let res = tokio::task::spawn_blocking(move || {
update_store.process_pending_update(handler)
})
.await
.expect("Fatal error processing update.");
match res {
Ok(Some(_)) => (),
Ok(None) => break,
Err(e) => {
error!("Fatal error while processing an update that requires the update store to shutdown: {}", e);
must_exit.store(true, Ordering::SeqCst);
break 'outer;
}
}
}
// Ownership of the arc has been taken elsewhere; we need to exit.
None => break 'outer,
}
}
}
error!("Update store loop exited.");
});
Ok(update_store)
}
/// Returns the next global update id and the next update id for a given `index_uuid`.
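/// For illustration (not part of the original doc): starting from an empty store, registering
/// updates for index A, then index B, then index A again yields the pairs (0, 0), (1, 0)
/// and (2, 1), as exercised by `test_next_id` below.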
fn next_update_id(&self, txn: &mut heed::RwTxn, index_uuid: Uuid) -> heed::Result<(u64, u64)> {
let global_id = self
.next_update_id
.get(txn, &NextIdKey::Global)?
.map(U64::get)
.unwrap_or_default();
self.next_update_id
.put(txn, &NextIdKey::Global, &BEU64::new(global_id + 1))?;
let update_id = self.next_update_id_raw(txn, index_uuid)?;
Ok((global_id, update_id))
}
/// Returns the next update id for a given `index_uuid` without
/// incrementing the global update id. This is useful for the dumps.
fn next_update_id_raw(&self, txn: &mut heed::RwTxn, index_uuid: Uuid) -> heed::Result<u64> {
let update_id = self
.next_update_id
.get(txn, &NextIdKey::Index(index_uuid))?
.map(U64::get)
.unwrap_or_default();
self.next_update_id.put(
txn,
&NextIdKey::Index(index_uuid),
&BEU64::new(update_id + 1),
)?;
Ok(update_id)
}
/// Registers the update content in the pending store and the meta
/// into the pending-meta store. Returns the newly enqueued update.
pub fn register_update(
&self,
meta: UpdateMeta,
content: Option<Uuid>,
index_uuid: Uuid,
) -> heed::Result<Enqueued> {
let mut txn = self.env.write_txn()?;
let (global_id, update_id) = self.next_update_id(&mut txn, index_uuid)?;
let meta = Enqueued::new(meta, update_id, content);
self.pending_queue
.put(&mut txn, &(global_id, index_uuid, update_id), &meta)?;
txn.commit()?;
if let Err(TrySendError::Closed(())) = self.notification_sender.try_send(()) {
panic!("Update store loop exited");
}
Ok(meta)
}
/// Pushes an already-processed update into the UpdateStore without triggering the
/// notification process. This is useful for the dumps.
pub fn register_raw_updates(
&self,
wtxn: &mut heed::RwTxn,
update: &UpdateStatus,
index_uuid: Uuid,
) -> heed::Result<()> {
match update {
UpdateStatus::Enqueued(enqueued) => {
let (global_id, _update_id) = self.next_update_id(wtxn, index_uuid)?;
self.pending_queue.remap_key_type::<PendingKeyCodec>().put(
wtxn,
&(global_id, index_uuid, enqueued.id()),
enqueued,
)?;
}
_ => {
let _update_id = self.next_update_id_raw(wtxn, index_uuid)?;
self.updates.put(wtxn, &(index_uuid, update.id()), update)?;
}
}
Ok(())
}
/// Executes the user-provided function on the next pending update (the one with the lowest id).
/// This is asynchronous, as it lets the user process the update with a read-only txn,
/// only writing the resulting meta to the processed-meta store *after* it has been processed.
fn process_pending_update(&self, index_handle: impl IndexActorHandle) -> Result<Option<()>> {
// Create a read transaction to be able to retrieve the pending update in order.
let rtxn = self.env.read_txn()?;
let first_meta = self.pending_queue.first(&rtxn)?;
drop(rtxn);
// If there is a pending update, we process it while holding only
// a reader, not a writer.
match first_meta {
Some(((global_id, index_uuid, _), mut pending)) => {
let content = pending.content.take();
let processing = pending.processing();
// Acquire the state lock and set the current state to processing.
// txn must *always* be acquired after the state lock, or it will deadlock.
let state = self.state.write();
state.swap(State::Processing(index_uuid, processing.clone()));
let result =
self.perform_update(content, processing, index_handle, index_uuid, global_id);
state.swap(State::Idle);
result
}
None => Ok(None),
}
}
fn perform_update(
&self,
content: Option<Uuid>,
processing: Processing,
index_handle: impl IndexActorHandle,
index_uuid: Uuid,
global_id: u64,
) -> Result<Option<()>> {
let content_path = content.map(|uuid| update_uuid_to_file_path(&self.path, uuid));
let update_id = processing.id();
let file = match content_path {
Some(ref path) => {
let file = File::open(path)?;
Some(file)
}
None => None,
};
// Process the pending update using the provided user function.
let handle = Handle::current();
let result =
match handle.block_on(index_handle.update(index_uuid, processing.clone(), file)) {
Ok(result) => result,
Err(e) => Err(processing.fail(e.into())),
};
// Once the pending update has been successfully processed,
// we must remove its content from the pending and processing stores and
// write the *new* meta to the processed-meta store, then commit.
let mut wtxn = self.env.write_txn()?;
self.pending_queue
.delete(&mut wtxn, &(global_id, index_uuid, update_id))?;
let result = match result {
Ok(res) => res.into(),
Err(res) => res.into(),
};
self.updates
.put(&mut wtxn, &(index_uuid, update_id), &result)?;
wtxn.commit()?;
if let Some(ref path) = content_path {
remove_file(&path)?;
}
Ok(Some(()))
}
/// List the updates for `index_uuid`.
pub fn list(&self, index_uuid: Uuid) -> Result<Vec<UpdateStatus>> {
let mut update_list = BTreeMap::<u64, UpdateStatus>::new();
let txn = self.env.read_txn()?;
let pendings = self.pending_queue.iter(&txn)?.lazily_decode_data();
for entry in pendings {
let ((_, uuid, id), pending) = entry?;
if uuid == index_uuid {
update_list.insert(id, pending.decode()?.into());
}
}
let updates = self
.updates
.remap_key_type::<ByteSlice>()
.prefix_iter(&txn, index_uuid.as_bytes())?;
for entry in updates {
let (_, update) = entry?;
update_list.insert(update.id(), update);
}
// If the currently processing update is from this index, replace the corresponding pending update with this one.
match *self.state.read() {
State::Processing(uuid, ref processing) if uuid == index_uuid => {
update_list.insert(processing.id(), processing.clone().into());
}
_ => (),
}
Ok(update_list.into_iter().map(|(_, v)| v).collect())
}
/// Returns the meta associated with the update, or `None` if the update doesn't exist.
pub fn meta(&self, index_uuid: Uuid, update_id: u64) -> heed::Result<Option<UpdateStatus>> {
// Check whether the update is the one currently being processed.
match *self.state.read() {
State::Processing(uuid, ref processing)
if uuid == index_uuid && processing.id() == update_id =>
{
return Ok(Some(processing.clone().into()));
}
_ => (),
}
let txn = self.env.read_txn()?;
// Else, check if it is in the updates database:
let update = self.updates.get(&txn, &(index_uuid, update_id))?;
if let Some(update) = update {
return Ok(Some(update));
}
// If nothing was found yet, fall back to iterating over the pending queue.
let pendings = self.pending_queue.iter(&txn)?.lazily_decode_data();
for entry in pendings {
let ((_, uuid, id), pending) = entry?;
if uuid == index_uuid && id == update_id {
return Ok(Some(pending.decode()?.into()));
}
}
// No update was found.
Ok(None)
}
/// Delete all updates for an index from the update store. If the currently processing update
/// is for `index_uuid`, the call will block until the update is terminated.
pub fn delete_all(&self, index_uuid: Uuid) -> Result<()> {
let mut txn = self.env.write_txn()?;
// Collects the uuids of the content files that must be removed if the deletion succeeds.
let mut uuids_to_remove = Vec::new();
let mut pendings = self.pending_queue.iter_mut(&mut txn)?.lazily_decode_data();
while let Some(Ok(((_, uuid, _), pending))) = pendings.next() {
if uuid == index_uuid {
let mut pending = pending.decode()?;
if let Some(update_uuid) = pending.content.take() {
uuids_to_remove.push(update_uuid);
}
// Invariant check: we can only delete the current entry when we don't hold
// references to it anymore. This must be done after we have retrieved its content.
unsafe {
pendings.del_current()?;
}
}
}
drop(pendings);
let mut updates = self
.updates
.remap_key_type::<ByteSlice>()
.prefix_iter_mut(&mut txn, index_uuid.as_bytes())?
.lazily_decode_data();
while let Some(_) = updates.next() {
unsafe {
updates.del_current()?;
}
}
drop(updates);
txn.commit()?;
// If the currently processing update is from our index, we wait until it is
// finished before returning. This ensures that no write to the index occurs after we delete it.
if let State::Processing(uuid, _) = *self.state.read() {
if uuid == index_uuid {
// wait for a write lock, do nothing with it.
self.state.write();
}
}
// Finally, remove any outstanding update files. This must be done after waiting for the
// last update to ensure that the update files are not deleted before the update needs
// them.
uuids_to_remove
.iter()
.map(|uuid| update_uuid_to_file_path(&self.path, *uuid))
.for_each(|path| {
let _ = remove_file(path);
});
Ok(())
}
pub fn snapshot(
&self,
uuids: &HashSet<Uuid>,
path: impl AsRef<Path>,
handle: impl IndexActorHandle + Clone,
) -> Result<()> {
let state_lock = self.state.write();
state_lock.swap(State::Snapshoting);
let txn = self.env.write_txn()?;
let update_path = path.as_ref().join("updates");
create_dir_all(&update_path)?;
let db_path = update_path.join("data.mdb");
// create db snapshot
self.env.copy_to_path(&db_path, CompactionOption::Enabled)?;
let update_files_path = update_path.join(UPDATE_DIR);
create_dir_all(&update_files_path)?;
let pendings = self.pending_queue.iter(&txn)?.lazily_decode_data();
for entry in pendings {
let ((_, uuid, _), pending) = entry?;
if uuids.contains(&uuid) {
if let Enqueued {
content: Some(uuid),
..
} = pending.decode()?
{
let path = update_uuid_to_file_path(&self.path, uuid);
copy(path, &update_files_path)?;
}
}
}
let path = &path.as_ref().to_path_buf();
let handle = &handle;
// Perform the snapshot of each index concurrently. Only use a third of the index actor's
// message capacity at a time, so as not to put too much pressure on the index actor.
let mut stream = futures::stream::iter(uuids.iter())
.map(move |uuid| handle.snapshot(*uuid, path.clone()))
.buffer_unordered(CONCURRENT_INDEX_MSG / 3);
Handle::current().block_on(async {
while let Some(res) = stream.next().await {
res?;
}
Ok(()) as Result<()>
})?;
Ok(())
}
pub fn get_info(&self) -> Result<UpdateStoreInfo> {
let mut size = self.env.size();
let txn = self.env.read_txn()?;
for entry in self.pending_queue.iter(&txn)? {
let (_, pending) = entry?;
if let Enqueued {
content: Some(uuid),
..
} = pending
{
let path = update_uuid_to_file_path(&self.path, uuid);
size += File::open(path)?.metadata()?.len();
}
}
let processing = match *self.state.read() {
State::Processing(uuid, _) => Some(uuid),
_ => None,
};
Ok(UpdateStoreInfo { size, processing })
}
}
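/// Builds the on-disk path of an update payload file: `<root>/update_files/update_<uuid>`.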
fn update_uuid_to_file_path(root: impl AsRef<Path>, uuid: Uuid) -> PathBuf {
root.as_ref()
.join(UPDATE_DIR)
.join(format!("update_{}", uuid))
}
#[cfg(test)]
mod test {
use super::*;
use crate::index_controller::{
index_actor::{error::IndexActorError, MockIndexActorHandle},
UpdateResult,
};
use futures::future::ok;
#[actix_rt::test]
async fn test_next_id() {
let dir = tempfile::tempdir_in(".").unwrap();
let mut options = EnvOpenOptions::new();
let handle = Arc::new(MockIndexActorHandle::new());
options.map_size(4096 * 100);
let update_store = UpdateStore::open(
options,
dir.path(),
handle,
Arc::new(AtomicBool::new(false)),
)
.unwrap();
let index1_uuid = Uuid::new_v4();
let index2_uuid = Uuid::new_v4();
let mut txn = update_store.env.write_txn().unwrap();
let ids = update_store.next_update_id(&mut txn, index1_uuid).unwrap();
txn.commit().unwrap();
assert_eq!((0, 0), ids);
let mut txn = update_store.env.write_txn().unwrap();
let ids = update_store.next_update_id(&mut txn, index2_uuid).unwrap();
txn.commit().unwrap();
assert_eq!((1, 0), ids);
let mut txn = update_store.env.write_txn().unwrap();
let ids = update_store.next_update_id(&mut txn, index1_uuid).unwrap();
txn.commit().unwrap();
assert_eq!((2, 1), ids);
}
#[actix_rt::test]
async fn test_register_update() {
let dir = tempfile::tempdir_in(".").unwrap();
let mut options = EnvOpenOptions::new();
let handle = Arc::new(MockIndexActorHandle::new());
options.map_size(4096 * 100);
let update_store = UpdateStore::open(
options,
dir.path(),
handle,
Arc::new(AtomicBool::new(false)),
)
.unwrap();
let meta = UpdateMeta::ClearDocuments;
let uuid = Uuid::new_v4();
let store_clone = update_store.clone();
tokio::task::spawn_blocking(move || {
store_clone.register_update(meta, None, uuid).unwrap();
})
.await
.unwrap();
let txn = update_store.env.read_txn().unwrap();
assert!(update_store
.pending_queue
.get(&txn, &(0, uuid, 0))
.unwrap()
.is_some());
}
#[actix_rt::test]
async fn test_process_update() {
let dir = tempfile::tempdir_in(".").unwrap();
let mut handle = MockIndexActorHandle::new();
handle
.expect_update()
.times(2)
.returning(|_index_uuid, processing, _file| {
if processing.id() == 0 {
Box::pin(ok(Ok(processing.process(UpdateResult::Other))))
} else {
Box::pin(ok(Err(
processing.fail(IndexActorError::ExistingPrimaryKey.into())
)))
}
});
let handle = Arc::new(handle);
let mut options = EnvOpenOptions::new();
options.map_size(4096 * 100);
let store = UpdateStore::open(
options,
dir.path(),
handle.clone(),
Arc::new(AtomicBool::new(false)),
)
.unwrap();
// wait a bit for the event loop to exit.
tokio::time::sleep(std::time::Duration::from_millis(50)).await;
let mut txn = store.env.write_txn().unwrap();
let update = Enqueued::new(UpdateMeta::ClearDocuments, 0, None);
let uuid = Uuid::new_v4();
store
.pending_queue
.put(&mut txn, &(0, uuid, 0), &update)
.unwrap();
let update = Enqueued::new(UpdateMeta::ClearDocuments, 1, None);
store
.pending_queue
.put(&mut txn, &(1, uuid, 1), &update)
.unwrap();
txn.commit().unwrap();
// Process the pending updates, and check that they have been moved to the updates
// database and removed from the pending queue.
let store_clone = store.clone();
tokio::task::spawn_blocking(move || {
store_clone.process_pending_update(handle.clone()).unwrap();
store_clone.process_pending_update(handle).unwrap();
})
.await
.unwrap();
let txn = store.env.read_txn().unwrap();
assert!(store.pending_queue.first(&txn).unwrap().is_none());
let update = store.updates.get(&txn, &(uuid, 0)).unwrap().unwrap();
assert!(matches!(update, UpdateStatus::Processed(_)));
let update = store.updates.get(&txn, &(uuid, 1)).unwrap().unwrap();
assert!(matches!(update, UpdateStatus::Failed(_)));
}
}

View File

@@ -1,233 +0,0 @@
use chrono::{DateTime, Utc};
use milli::update::{DocumentAdditionResult, IndexDocumentsMethod, UpdateFormat};
use serde::{Deserialize, Serialize};
use uuid::Uuid;
use crate::{
error::ResponseError,
index::{Settings, Unchecked},
};
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum UpdateResult {
DocumentsAddition(DocumentAdditionResult),
DocumentDeletion { deleted: u64 },
Other,
}
#[allow(clippy::large_enum_variant)]
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "type")]
pub enum UpdateMeta {
DocumentsAddition {
method: IndexDocumentsMethod,
format: UpdateFormat,
primary_key: Option<String>,
},
ClearDocuments,
DeleteDocuments {
ids: Vec<String>,
},
Settings(Settings<Unchecked>),
}
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct Enqueued {
pub update_id: u64,
pub meta: UpdateMeta,
pub enqueued_at: DateTime<Utc>,
pub content: Option<Uuid>,
}
impl Enqueued {
pub fn new(meta: UpdateMeta, update_id: u64, content: Option<Uuid>) -> Self {
Self {
enqueued_at: Utc::now(),
meta,
update_id,
content,
}
}
pub fn processing(self) -> Processing {
Processing {
from: self,
started_processing_at: Utc::now(),
}
}
pub fn abort(self) -> Aborted {
Aborted {
from: self,
aborted_at: Utc::now(),
}
}
pub fn meta(&self) -> &UpdateMeta {
&self.meta
}
pub fn id(&self) -> u64 {
self.update_id
}
}
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct Processed {
pub success: UpdateResult,
pub processed_at: DateTime<Utc>,
#[serde(flatten)]
pub from: Processing,
}
impl Processed {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &UpdateMeta {
self.from.meta()
}
}
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct Processing {
#[serde(flatten)]
pub from: Enqueued,
pub started_processing_at: DateTime<Utc>,
}
impl Processing {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &UpdateMeta {
self.from.meta()
}
pub fn process(self, success: UpdateResult) -> Processed {
Processed {
success,
from: self,
processed_at: Utc::now(),
}
}
pub fn fail(self, error: ResponseError) -> Failed {
Failed {
from: self,
error,
failed_at: Utc::now(),
}
}
}
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct Aborted {
#[serde(flatten)]
from: Enqueued,
aborted_at: DateTime<Utc>,
}
impl Aborted {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &UpdateMeta {
self.from.meta()
}
}
#[derive(Debug, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct Failed {
#[serde(flatten)]
pub from: Processing,
pub error: ResponseError,
pub failed_at: DateTime<Utc>,
}
impl Failed {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &UpdateMeta {
self.from.meta()
}
}
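/// Lifecycle of an update: `Enqueued` -> `Processing` -> `Processed` or `Failed`;
/// `Aborted` is reached directly from `Enqueued`.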
#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "status", rename_all = "camelCase")]
pub enum UpdateStatus {
Processing(Processing),
Enqueued(Enqueued),
Processed(Processed),
Aborted(Aborted),
Failed(Failed),
}
impl UpdateStatus {
pub fn id(&self) -> u64 {
match self {
UpdateStatus::Processing(u) => u.id(),
UpdateStatus::Enqueued(u) => u.id(),
UpdateStatus::Processed(u) => u.id(),
UpdateStatus::Aborted(u) => u.id(),
UpdateStatus::Failed(u) => u.id(),
}
}
pub fn meta(&self) -> &UpdateMeta {
match self {
UpdateStatus::Processing(u) => u.meta(),
UpdateStatus::Enqueued(u) => u.meta(),
UpdateStatus::Processed(u) => u.meta(),
UpdateStatus::Aborted(u) => u.meta(),
UpdateStatus::Failed(u) => u.meta(),
}
}
pub fn processed(&self) -> Option<&Processed> {
match self {
UpdateStatus::Processed(p) => Some(p),
_ => None,
}
}
}
impl From<Enqueued> for UpdateStatus {
fn from(other: Enqueued) -> Self {
Self::Enqueued(other)
}
}
impl From<Aborted> for UpdateStatus {
fn from(other: Aborted) -> Self {
Self::Aborted(other)
}
}
impl From<Processed> for UpdateStatus {
fn from(other: Processed) -> Self {
Self::Processed(other)
}
}
impl From<Processing> for UpdateStatus {
fn from(other: Processing) -> Self {
Self::Processing(other)
}
}
impl From<Failed> for UpdateStatus {
fn from(other: Failed) -> Self {
Self::Failed(other)
}
}

View File

@@ -1,98 +0,0 @@
use std::{collections::HashSet, path::PathBuf};
use log::{trace, warn};
use tokio::sync::mpsc;
use uuid::Uuid;
use super::{error::UuidResolverError, Result, UuidResolveMsg, UuidStore};
pub struct UuidResolverActor<S> {
inbox: mpsc::Receiver<UuidResolveMsg>,
store: S,
}
impl<S: UuidStore> UuidResolverActor<S> {
pub fn new(inbox: mpsc::Receiver<UuidResolveMsg>, store: S) -> Self {
Self { inbox, store }
}
pub async fn run(mut self) {
use UuidResolveMsg::*;
trace!("uuid resolver started");
loop {
match self.inbox.recv().await {
Some(Get { uid: name, ret }) => {
let _ = ret.send(self.handle_get(name).await);
}
Some(Delete { uid: name, ret }) => {
let _ = ret.send(self.handle_delete(name).await);
}
Some(List { ret }) => {
let _ = ret.send(self.handle_list().await);
}
Some(Insert { ret, uuid, name }) => {
let _ = ret.send(self.handle_insert(name, uuid).await);
}
Some(SnapshotRequest { path, ret }) => {
let _ = ret.send(self.handle_snapshot(path).await);
}
Some(GetSize { ret }) => {
let _ = ret.send(self.handle_get_size().await);
}
Some(DumpRequest { path, ret }) => {
let _ = ret.send(self.handle_dump(path).await);
}
// all senders have been dropped, need to quit.
None => break,
}
}
warn!("exiting uuid resolver loop");
}
async fn handle_get(&self, uid: String) -> Result<Uuid> {
self.store
.get_uuid(uid.clone())
.await?
.ok_or(UuidResolverError::UnexistingIndex(uid))
}
async fn handle_delete(&self, uid: String) -> Result<Uuid> {
self.store
.delete(uid.clone())
.await?
.ok_or(UuidResolverError::UnexistingIndex(uid))
}
async fn handle_list(&self) -> Result<Vec<(String, Uuid)>> {
let result = self.store.list().await?;
Ok(result)
}
async fn handle_snapshot(&self, path: PathBuf) -> Result<HashSet<Uuid>> {
self.store.snapshot(path).await
}
async fn handle_dump(&self, path: PathBuf) -> Result<HashSet<Uuid>> {
self.store.dump(path).await
}
async fn handle_insert(&self, uid: String, uuid: Uuid) -> Result<()> {
if !is_index_uid_valid(&uid) {
return Err(UuidResolverError::BadlyFormatted(uid));
}
self.store.insert(uid, uuid).await?;
Ok(())
}
async fn handle_get_size(&self) -> Result<u64> {
self.store.get_size().await
}
}
fn is_index_uid_valid(uid: &str) -> bool {
uid.chars()
.all(|x| x.is_ascii_alphanumeric() || x == '-' || x == '_')
}
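
As a quick illustration of the validation rule above (editor's example with invented inputs, not a test from the repository):

```rust
// Same rule as `is_index_uid_valid` above: ASCII alphanumerics, `-` and `_` only.
fn is_index_uid_valid(uid: &str) -> bool {
    uid.chars()
        .all(|x| x.is_ascii_alphanumeric() || x == '-' || x == '_')
}

fn main() {
    assert!(is_index_uid_valid("movies_2021"));
    assert!(is_index_uid_valid("books-fr"));
    assert!(!is_index_uid_valid("my index")); // spaces are rejected
    assert!(!is_index_uid_valid("café")); // non-ASCII characters are rejected
}
```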


@@ -1,34 +0,0 @@
use meilisearch_error::{Code, ErrorCode};
pub type Result<T> = std::result::Result<T, UuidResolverError>;
#[derive(Debug, thiserror::Error)]
pub enum UuidResolverError {
#[error("Index already exists.")]
NameAlreadyExist,
#[error("Index \"{0}\" not found.")]
UnexistingIndex(String),
#[error("Index must have a valid uid; Index uid can be of type integer or string only composed of alphanumeric characters, hyphens (-) and underscores (_).")]
BadlyFormatted(String),
#[error("Internal error: {0}")]
Internal(Box<dyn std::error::Error + Sync + Send + 'static>),
}
internal_error!(
UuidResolverError: heed::Error,
uuid::Error,
std::io::Error,
tokio::task::JoinError,
serde_json::Error
);
impl ErrorCode for UuidResolverError {
fn error_code(&self) -> Code {
match self {
UuidResolverError::NameAlreadyExist => Code::IndexAlreadyExists,
UuidResolverError::UnexistingIndex(_) => Code::IndexNotFound,
UuidResolverError::BadlyFormatted(_) => Code::InvalidIndexUid,
UuidResolverError::Internal(_) => Code::Internal,
}
}
}


@@ -1,87 +0,0 @@
use std::collections::HashSet;
use std::path::{Path, PathBuf};
use tokio::sync::{mpsc, oneshot};
use uuid::Uuid;
use super::{HeedUuidStore, Result, UuidResolveMsg, UuidResolverActor, UuidResolverHandle};
#[derive(Clone)]
pub struct UuidResolverHandleImpl {
sender: mpsc::Sender<UuidResolveMsg>,
}
impl UuidResolverHandleImpl {
pub fn new(path: impl AsRef<Path>) -> Result<Self> {
let (sender, receiver) = mpsc::channel(100);
let store = HeedUuidStore::new(path)?;
let actor = UuidResolverActor::new(receiver, store);
tokio::spawn(actor.run());
Ok(Self { sender })
}
}
#[async_trait::async_trait]
impl UuidResolverHandle for UuidResolverHandleImpl {
async fn get(&self, name: String) -> Result<Uuid> {
let (ret, receiver) = oneshot::channel();
let msg = UuidResolveMsg::Get { uid: name, ret };
let _ = self.sender.send(msg).await;
Ok(receiver
.await
.expect("Uuid resolver actor has been killed")?)
}
async fn delete(&self, name: String) -> Result<Uuid> {
let (ret, receiver) = oneshot::channel();
let msg = UuidResolveMsg::Delete { uid: name, ret };
let _ = self.sender.send(msg).await;
Ok(receiver
.await
.expect("Uuid resolver actor has been killed")?)
}
async fn list(&self) -> Result<Vec<(String, Uuid)>> {
let (ret, receiver) = oneshot::channel();
let msg = UuidResolveMsg::List { ret };
let _ = self.sender.send(msg).await;
Ok(receiver
.await
.expect("Uuid resolver actor has been killed")?)
}
async fn insert(&self, name: String, uuid: Uuid) -> Result<()> {
let (ret, receiver) = oneshot::channel();
let msg = UuidResolveMsg::Insert { ret, name, uuid };
let _ = self.sender.send(msg).await;
Ok(receiver
.await
.expect("Uuid resolver actor has been killed")?)
}
async fn snapshot(&self, path: PathBuf) -> Result<HashSet<Uuid>> {
let (ret, receiver) = oneshot::channel();
let msg = UuidResolveMsg::SnapshotRequest { path, ret };
let _ = self.sender.send(msg).await;
Ok(receiver
.await
.expect("Uuid resolver actor has been killed")?)
}
async fn get_size(&self) -> Result<u64> {
let (ret, receiver) = oneshot::channel();
let msg = UuidResolveMsg::GetSize { ret };
let _ = self.sender.send(msg).await;
Ok(receiver
.await
.expect("Uuid resolver actor has been killed")?)
}
async fn dump(&self, path: PathBuf) -> Result<HashSet<Uuid>> {
let (ret, receiver) = oneshot::channel();
let msg = UuidResolveMsg::DumpRequest { ret, path };
let _ = self.sender.send(msg).await;
Ok(receiver
.await
.expect("Uuid resolver actor has been killed")?)
}
}


@@ -1,35 +0,0 @@
mod actor;
pub mod error;
mod handle_impl;
mod message;
pub mod store;
use std::collections::HashSet;
use std::path::PathBuf;
use uuid::Uuid;
use actor::UuidResolverActor;
use error::Result;
use message::UuidResolveMsg;
use store::UuidStore;
#[cfg(test)]
use mockall::automock;
pub use handle_impl::UuidResolverHandleImpl;
pub use store::HeedUuidStore;
const UUID_STORE_SIZE: usize = 1_073_741_824; //1GiB
#[async_trait::async_trait]
#[cfg_attr(test, automock)]
pub trait UuidResolverHandle {
async fn get(&self, name: String) -> Result<Uuid>;
async fn insert(&self, name: String, uuid: Uuid) -> Result<()>;
async fn delete(&self, name: String) -> Result<Uuid>;
async fn list(&self) -> Result<Vec<(String, Uuid)>>;
async fn snapshot(&self, path: PathBuf) -> Result<HashSet<Uuid>>;
async fn get_size(&self) -> Result<u64>;
async fn dump(&self, path: PathBuf) -> Result<HashSet<Uuid>>;
}


@@ -1,224 +0,0 @@
use std::collections::HashSet;
use std::fs::{create_dir_all, File};
use std::io::{BufRead, BufReader, Write};
use std::path::{Path, PathBuf};
use heed::types::{ByteSlice, Str};
use heed::{CompactionOption, Database, Env, EnvOpenOptions};
use serde::{Deserialize, Serialize};
use uuid::Uuid;
use super::{error::UuidResolverError, Result, UUID_STORE_SIZE};
use crate::helpers::EnvSizer;
#[derive(Serialize, Deserialize)]
struct DumpEntry {
uuid: Uuid,
uid: String,
}
const UUIDS_DB_PATH: &str = "index_uuids";
#[async_trait::async_trait]
pub trait UuidStore: Sized {
// Create a new entry for `name`. Return an error if the entry already exists,
// otherwise return the uuid.
async fn get_uuid(&self, uid: String) -> Result<Option<Uuid>>;
async fn delete(&self, uid: String) -> Result<Option<Uuid>>;
async fn list(&self) -> Result<Vec<(String, Uuid)>>;
async fn insert(&self, name: String, uuid: Uuid) -> Result<()>;
async fn snapshot(&self, path: PathBuf) -> Result<HashSet<Uuid>>;
async fn get_size(&self) -> Result<u64>;
async fn dump(&self, path: PathBuf) -> Result<HashSet<Uuid>>;
}
#[derive(Clone)]
pub struct HeedUuidStore {
env: Env,
db: Database<Str, ByteSlice>,
}
impl HeedUuidStore {
pub fn new(path: impl AsRef<Path>) -> Result<Self> {
let path = path.as_ref().join(UUIDS_DB_PATH);
create_dir_all(&path)?;
let mut options = EnvOpenOptions::new();
options.map_size(UUID_STORE_SIZE); // 1 GiB
let env = options.open(path)?;
let db = env.create_database(None)?;
Ok(Self { env, db })
}
pub fn get_uuid(&self, name: String) -> Result<Option<Uuid>> {
let env = self.env.clone();
let db = self.db;
let txn = env.read_txn()?;
match db.get(&txn, &name)? {
Some(uuid) => {
let uuid = Uuid::from_slice(uuid)?;
Ok(Some(uuid))
}
None => Ok(None),
}
}
pub fn delete(&self, uid: String) -> Result<Option<Uuid>> {
let env = self.env.clone();
let db = self.db;
let mut txn = env.write_txn()?;
match db.get(&txn, &uid)? {
Some(uuid) => {
let uuid = Uuid::from_slice(uuid)?;
db.delete(&mut txn, &uid)?;
txn.commit()?;
Ok(Some(uuid))
}
None => Ok(None),
}
}
pub fn list(&self) -> Result<Vec<(String, Uuid)>> {
let env = self.env.clone();
let db = self.db;
let txn = env.read_txn()?;
let mut entries = Vec::new();
for entry in db.iter(&txn)? {
let (name, uuid) = entry?;
let uuid = Uuid::from_slice(uuid)?;
entries.push((name.to_owned(), uuid))
}
Ok(entries)
}
pub fn insert(&self, name: String, uuid: Uuid) -> Result<()> {
let env = self.env.clone();
let db = self.db;
let mut txn = env.write_txn()?;
if db.get(&txn, &name)?.is_some() {
return Err(UuidResolverError::NameAlreadyExist);
}
db.put(&mut txn, &name, uuid.as_bytes())?;
txn.commit()?;
Ok(())
}
pub fn snapshot(&self, mut path: PathBuf) -> Result<HashSet<Uuid>> {
let env = self.env.clone();
let db = self.db;
// Write transaction to acquire a lock on the database.
let txn = env.write_txn()?;
let mut entries = HashSet::new();
for entry in db.iter(&txn)? {
let (_, uuid) = entry?;
let uuid = Uuid::from_slice(uuid)?;
entries.insert(uuid);
}
// only perform snapshot if there are indexes
if !entries.is_empty() {
path.push(UUIDS_DB_PATH);
create_dir_all(&path).unwrap();
path.push("data.mdb");
env.copy_to_path(path, CompactionOption::Enabled)?;
}
Ok(entries)
}
pub fn get_size(&self) -> Result<u64> {
Ok(self.env.size())
}
pub fn dump(&self, path: PathBuf) -> Result<HashSet<Uuid>> {
let dump_path = path.join(UUIDS_DB_PATH);
create_dir_all(&dump_path)?;
let dump_file_path = dump_path.join("data.jsonl");
let mut dump_file = File::create(&dump_file_path)?;
let mut uuids = HashSet::new();
let txn = self.env.read_txn()?;
for entry in self.db.iter(&txn)? {
let (uid, uuid) = entry?;
let uid = uid.to_string();
let uuid = Uuid::from_slice(uuid)?;
let entry = DumpEntry { uuid, uid };
serde_json::to_writer(&mut dump_file, &entry)?;
dump_file.write_all(b"\n").unwrap();
uuids.insert(uuid);
}
Ok(uuids)
}
pub fn load_dump(src: impl AsRef<Path>, dst: impl AsRef<Path>) -> Result<()> {
let uuid_resolver_path = dst.as_ref().join(UUIDS_DB_PATH);
std::fs::create_dir_all(&uuid_resolver_path)?;
let src_indexes = src.as_ref().join(UUIDS_DB_PATH).join("data.jsonl");
let indexes = File::open(&src_indexes)?;
let mut indexes = BufReader::new(indexes);
let mut line = String::new();
let db = Self::new(dst)?;
let mut txn = db.env.write_txn()?;
loop {
match indexes.read_line(&mut line) {
Ok(0) => break,
Ok(_) => {
let DumpEntry { uuid, uid } = serde_json::from_str(&line)?;
println!("importing {} {}", uid, uuid);
db.db.put(&mut txn, &uid, uuid.as_bytes())?;
}
Err(e) => return Err(e.into()),
}
line.clear();
}
txn.commit()?;
db.env.prepare_for_closing().wait();
Ok(())
}
}
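
For reference, a small sketch of what one line of `index_uuids/data.jsonl` looks like under the `DumpEntry` shape above (editor's illustration; assumes `serde` with the `derive` feature, `serde_json`, and `uuid` with its `serde` feature; the `movies` uid is invented):

```rust
use serde::Serialize;
use uuid::Uuid;

#[derive(Serialize)]
struct DumpEntry {
    uuid: Uuid,
    uid: String,
}

fn main() {
    let entry = DumpEntry {
        uuid: Uuid::nil(),
        uid: "movies".to_string(),
    };
    // Prints: {"uuid":"00000000-0000-0000-0000-000000000000","uid":"movies"}
    println!("{}", serde_json::to_string(&entry).unwrap());
}
```

`load_dump` then reads these lines back one by one and re-inserts each uid/uuid pair into a fresh heed database.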
#[async_trait::async_trait]
impl UuidStore for HeedUuidStore {
async fn get_uuid(&self, name: String) -> Result<Option<Uuid>> {
let this = self.clone();
tokio::task::spawn_blocking(move || this.get_uuid(name)).await?
}
async fn delete(&self, uid: String) -> Result<Option<Uuid>> {
let this = self.clone();
tokio::task::spawn_blocking(move || this.delete(uid)).await?
}
async fn list(&self) -> Result<Vec<(String, Uuid)>> {
let this = self.clone();
tokio::task::spawn_blocking(move || this.list()).await?
}
async fn insert(&self, name: String, uuid: Uuid) -> Result<()> {
let this = self.clone();
tokio::task::spawn_blocking(move || this.insert(name, uuid)).await?
}
async fn snapshot(&self, path: PathBuf) -> Result<HashSet<Uuid>> {
let this = self.clone();
tokio::task::spawn_blocking(move || this.snapshot(path)).await?
}
async fn get_size(&self) -> Result<u64> {
self.get_size()
}
async fn dump(&self, path: PathBuf) -> Result<HashSet<Uuid>> {
let this = self.clone();
tokio::task::spawn_blocking(move || this.dump(path)).await?
}
}


@@ -1,105 +1,92 @@
//! # MeiliSearch
//! Hello there, future contributors. If you are here and see this code, it's probably because you want to add a super new fancy feature in MeiliSearch or fix a bug and first of all, thank you for that!
//!
//! To help you in this task, we'll try to do a little overview of the project.
//! ## Milli
//! [Milli](https://github.com/meilisearch/milli) is the core library of MeiliSearch. It's where we actually index documents and perform searches. Its purpose is to do these two tasks as fast as possible. You can give an update to milli, and it'll use as many cores as provided to perform it as fast as possible. Nothing more. You can perform searches at the same time (search only uses one core).
//! As you can see, we're missing quite a lot of features here; milli does not handle multiple indexes, it can't queue updates, it doesn't provide any web / API frontend, it doesn't implement dumps or snapshots, etc...
//!
//! ## `Index` module
//! The [index] module is what encapsulates one milli index. It abstracts over its transaction and isolates a task that can be run in a thread. This is the unit of interaction with milli.
//! If you add a feature to milli, you'll probably need to add it in this module too before exposing it to the rest of meilisearch.
//!
//! ## `IndexController` module
//! To handle multiple indexes, we created an [index_controller]. It's in charge of creating new indexes, keeping references to all its indexes, forwarding asynchronous updates to its indexes, and providing an API to search in its indexes synchronously.
//! To achieve this goal, we use an [actor model](https://en.wikipedia.org/wiki/Actor_model).
//!
//! ### The actor model
//! Every actor is composed of at least four files:
//! - `mod.rs` declares and imports all the files used by the actor. It also describes the interface (= all the methods) used to interact with the actor. If you are not modifying anything inside an actor, this is usually all you need to see.
//! - `handle_impl.rs` implements the interface described in the `mod.rs`; in reality, there is no code logic in this file. Every method only wraps its parameters in a structure that is sent to the actor. This is useful for testing and futureproofing.
//! - `message.rs` contains an enum that describes all the interactions you can have with the actor.
//! - `actor.rs` is used to create and execute the actor. It's where we write the loop that waits for new messages and actually performs the tasks. (A minimal sketch of this pattern is shown right after this module documentation.)
//!
//! MeiliSearch currently uses four actors:
//! - [`uuid_resolver`](index_controller/uuid_resolver/index.html) holds the association between the user-provided index names and the internal [`uuid`](https://en.wikipedia.org/wiki/Universally_unique_identifier) representation we use.
//! - [`index_actor`](index_controller::index_actor) is our representation of multiple indexes. Any request made to MeiliSearch that needs to talk to milli will pass through this actor.
//! - [`update_actor`](index_controller/update_actor/index.html) is in charge of index updates. Since updates can take a long time to receive and process, we need to:
//! 1. Store them as fast as possible so we can continue to receive other updates even if nothing has been processed
//! 2. Feed the `index_actor` with a new update every time it finishes its current job.
//! - [`dump_actor`](index_controller/dump_actor/index.html) handles the [dumps](https://docs.meilisearch.com/reference/api/dump.html). It needs to contact all the other actors and create a dump of everything that is currently happening.
//!
//! ## Data module
//! The [data] module provides a unified interface to communicate with the index controller and other services (snapshots, dumps, ...), and initializes the MeiliSearch instance.
//!
//! ## HTTP server
//! To handle the web and API part, we are using [actix-web](https://docs.rs/actix-web/); you can find all routes in the [routes] module.
//! Currently, the configuration of actix-web is done in the [lib.rs](crate).
//! Most of the routes use [extractors] to handle authentication.
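
For readers discovering this layout, here is a minimal, self-contained sketch of that actor pattern (editor's illustration: the `Ping*` names are invented for the example and assume a `tokio` dependency with the `macros` and `rt-multi-thread` features; this is not code from the repository):

```rust
use tokio::sync::{mpsc, oneshot};

// message.rs: one variant per interaction, each carrying a oneshot return channel
enum PingMsg {
    Ping { ret: oneshot::Sender<u64> },
}

// actor.rs: owns the state and loops over the inbox, performing the actual work
struct PingActor {
    inbox: mpsc::Receiver<PingMsg>,
    count: u64,
}

impl PingActor {
    async fn run(mut self) {
        while let Some(msg) = self.inbox.recv().await {
            match msg {
                PingMsg::Ping { ret } => {
                    self.count += 1;
                    let _ = ret.send(self.count);
                }
            }
        }
        // all senders have been dropped: the actor exits
    }
}

// handle_impl.rs: no logic, it only wraps parameters in a message and sends it
#[derive(Clone)]
struct PingHandle {
    sender: mpsc::Sender<PingMsg>,
}

impl PingHandle {
    fn new() -> Self {
        let (sender, inbox) = mpsc::channel(100);
        tokio::spawn(PingActor { inbox, count: 0 }.run());
        Self { sender }
    }

    async fn ping(&self) -> u64 {
        let (ret, receiver) = oneshot::channel();
        let _ = self.sender.send(PingMsg::Ping { ret }).await;
        receiver.await.expect("ping actor has been killed")
    }
}

#[tokio::main]
async fn main() {
    let handle = PingHandle::new();
    assert_eq!(handle.ping().await, 1);
    assert_eq!(handle.ping().await, 2);
}
```

The real actors follow the same shape: the handle owns an `mpsc::Sender`, every message carries a `oneshot::Sender` for the reply, and the loop exits once every handle has been dropped.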
#![allow(rustdoc::private_intra_doc_links)]
pub mod data;
#[macro_use]
pub mod error;
pub mod analytics;
mod task;
#[macro_use]
pub mod extractors;
pub mod helpers;
mod index;
mod index_controller;
pub mod option;
pub mod routes;
#[cfg(all(not(debug_assertions), feature = "analytics"))]
pub mod analytics;
use std::sync::Arc;
use std::time::Duration;
use crate::extractors::authentication::AuthConfig;
pub use self::data::Data;
use crate::error::MeilisearchHttpError;
use actix_web::error::JsonPayloadError;
use analytics::Analytics;
use error::PayloadError;
use http::header::CONTENT_TYPE;
pub use option::Opt;
use actix_web::web;
use actix_web::{web, HttpRequest};
use extractors::authentication::policies::*;
use extractors::payload::PayloadConfig;
use meilisearch_auth::AuthController;
use meilisearch_lib::MeiliSearch;
pub fn configure_data(config: &mut web::ServiceConfig, data: Data) {
let http_payload_size_limit = data.http_payload_size_limit();
pub fn setup_meilisearch(opt: &Opt) -> anyhow::Result<MeiliSearch> {
let mut meilisearch = MeiliSearch::builder();
meilisearch
.set_max_index_size(opt.max_index_size.get_bytes() as usize)
.set_max_task_store_size(opt.max_task_db_size.get_bytes() as usize)
.set_ignore_missing_snapshot(opt.ignore_missing_snapshot)
.set_ignore_snapshot_if_db_exists(opt.ignore_snapshot_if_db_exists)
.set_dump_dst(opt.dumps_dir.clone())
.set_snapshot_interval(Duration::from_secs(opt.snapshot_interval_sec))
.set_snapshot_dir(opt.snapshot_dir.clone());
if let Some(ref path) = opt.import_snapshot {
meilisearch.set_import_snapshot(path.clone());
}
if let Some(ref path) = opt.import_dump {
meilisearch.set_dump_src(path.clone());
}
if opt.schedule_snapshot {
meilisearch.set_schedule_snapshot();
}
meilisearch.build(opt.db_path.clone(), opt.indexer_options.clone())
}
pub fn configure_data(
config: &mut web::ServiceConfig,
data: MeiliSearch,
auth: AuthController,
opt: &Opt,
analytics: Arc<dyn Analytics>,
) {
let http_payload_size_limit = opt.http_payload_size_limit.get_bytes() as usize;
config
.data(data.clone())
.app_data(data)
.app_data(auth)
.app_data(web::Data::from(analytics))
.app_data(
web::JsonConfig::default()
.limit(http_payload_size_limit)
.content_type(|_mime| true) // Accept all mime types
.error_handler(|err, _req| error::payload_error_handler(err).into()),
.content_type(|mime| mime == mime::APPLICATION_JSON)
.error_handler(|err, req: &HttpRequest| match err {
JsonPayloadError::ContentType => match req.headers().get(CONTENT_TYPE) {
Some(content_type) => MeilisearchHttpError::InvalidContentType(
content_type.to_str().unwrap_or("unknown").to_string(),
vec![mime::APPLICATION_JSON.to_string()],
)
.into(),
None => MeilisearchHttpError::MissingContentType(vec![
mime::APPLICATION_JSON.to_string(),
])
.into(),
},
err => PayloadError::from(err).into(),
}),
)
.app_data(PayloadConfig::new(http_payload_size_limit))
.app_data(
web::QueryConfig::default()
.error_handler(|err, _req| error::payload_error_handler(err).into()),
web::QueryConfig::default().error_handler(|err, _req| PayloadError::from(err).into()),
);
}
pub fn configure_auth(config: &mut web::ServiceConfig, data: &Data) {
let keys = data.api_keys();
let auth_config = if let Some(ref master_key) = keys.master {
let private_key = keys.private.as_ref().unwrap();
let public_key = keys.public.as_ref().unwrap();
let mut policies = init_policies!(Public, Private, Admin);
create_users!(
policies,
master_key.as_bytes() => { Admin, Private, Public },
private_key.as_bytes() => { Private, Public },
public_key.as_bytes() => { Public }
);
AuthConfig::Auth(policies)
} else {
AuthConfig::NoAuth
};
config.app_data(auth_config);
}
#[cfg(feature = "mini-dashboard")]
pub fn dashboard(config: &mut web::ServiceConfig, enable_frontend: bool) {
use actix_web::HttpResponse;
@@ -111,7 +98,6 @@ pub fn dashboard(config: &mut web::ServiceConfig, enable_frontend: bool) {
if enable_frontend {
let generated = generated::generate();
let mut scope = web::scope("/");
// Generate routes for mini-dashboard assets
for (path, resource) in generated.into_iter() {
let Resource {
@@ -123,12 +109,11 @@ pub fn dashboard(config: &mut web::ServiceConfig, enable_frontend: bool) {
web::get().to(move || HttpResponse::Ok().content_type(mime_type).body(data)),
));
} else {
scope = scope.service(web::resource(path).route(
config.service(web::resource(path).route(
web::get().to(move || HttpResponse::Ok().content_type(mime_type).body(data)),
));
}
}
config.service(scope);
} else {
config.service(web::resource("/").route(web::get().to(routes::running)));
}
@@ -141,23 +126,24 @@ pub fn dashboard(config: &mut web::ServiceConfig, _enable_frontend: bool) {
#[macro_export]
macro_rules! create_app {
($data:expr, $enable_frontend:expr) => {{
($data:expr, $auth:expr, $enable_frontend:expr, $opt:expr, $analytics:expr) => {{
use actix_cors::Cors;
use actix_web::middleware::TrailingSlash;
use actix_web::App;
use actix_web::{middleware, web};
use meilisearch_error::ResponseError;
use meilisearch_http::error::MeilisearchHttpError;
use meilisearch_http::routes;
use meilisearch_http::{configure_auth, configure_data, dashboard};
use meilisearch_http::{configure_data, dashboard};
App::new()
.configure(|s| configure_data(s, $data.clone()))
.configure(|s| configure_auth(s, &$data))
.configure(|s| configure_data(s, $data.clone(), $auth.clone(), &$opt, $analytics))
.configure(routes::configure)
.configure(|s| dashboard(s, $enable_frontend))
.wrap(
Cors::default()
.send_wildcard()
.allowed_headers(vec!["content-type", "x-meili-api-key"])
.allow_any_header()
.allow_any_origin()
.allow_any_method()
.max_age(86_400), // 24h


@@ -1,21 +1,20 @@
use std::env;
use std::sync::Arc;
use actix_web::HttpServer;
use main_error::MainError;
use meilisearch_http::{create_app, Data, Opt};
use structopt::StructOpt;
#[cfg(all(not(debug_assertions), feature = "analytics"))]
use meilisearch_auth::AuthController;
use meilisearch_http::analytics;
use meilisearch_http::analytics::Analytics;
use meilisearch_http::{create_app, setup_meilisearch, Opt};
use meilisearch_lib::MeiliSearch;
use structopt::StructOpt;
#[cfg(target_os = "linux")]
#[global_allocator]
static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;
#[actix_web::main]
async fn main() -> Result<(), MainError> {
let opt = Opt::from_args();
static ALLOC: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;
/// Does all the setup before MeiliSearch is launched
fn setup(opt: &Opt) -> anyhow::Result<()> {
let mut log_builder = env_logger::Builder::new();
log_builder.parse_filters(&opt.log_level);
if opt.log_level == "info" {
@@ -25,40 +24,66 @@ async fn main() -> Result<(), MainError> {
log_builder.init();
Ok(())
}
#[actix_web::main]
async fn main() -> anyhow::Result<()> {
let opt = Opt::from_args();
setup(&opt)?;
match opt.env.as_ref() {
"production" => {
if opt.master_key.is_none() {
return Err(
anyhow::bail!(
"In production mode, the environment variable MEILI_MASTER_KEY is mandatory"
.into(),
);
)
}
}
"development" => (),
_ => unreachable!(),
}
let data = Data::new(opt.clone())?;
let meilisearch = setup_meilisearch(&opt)?;
let auth_controller = AuthController::new(&opt.db_path, &opt.master_key)?;
#[cfg(all(not(debug_assertions), feature = "analytics"))]
if !opt.no_analytics {
let analytics_data = data.clone();
let analytics_opt = opt.clone();
tokio::task::spawn(analytics::analytics_sender(analytics_data, analytics_opt));
}
let (analytics, user) = if opt.analytics() {
analytics::SegmentAnalytics::new(&opt, &meilisearch).await
} else {
analytics::MockAnalytics::new(&opt)
};
#[cfg(any(debug_assertions, not(feature = "analytics")))]
let (analytics, user) = analytics::MockAnalytics::new(&opt);
print_launch_resume(&opt, &data);
print_launch_resume(&opt, &user);
run_http(data, opt).await?;
run_http(meilisearch, auth_controller, opt, analytics).await?;
Ok(())
}
async fn run_http(data: Data, opt: Opt) -> Result<(), Box<dyn std::error::Error>> {
async fn run_http(
data: MeiliSearch,
auth_controller: AuthController,
opt: Opt,
analytics: Arc<dyn Analytics>,
) -> anyhow::Result<()> {
let _enable_dashboard = &opt.env == "development";
let http_server = HttpServer::new(move || create_app!(data, _enable_dashboard))
// Disabling signals allows the server to terminate immediately when a user enters CTRL-C
.disable_signals();
let opt_clone = opt.clone();
let http_server = HttpServer::new(move || {
create_app!(
data,
auth_controller,
_enable_dashboard,
opt_clone,
analytics.clone()
)
})
// Disabling signals allows the server to terminate immediately when a user enters CTRL-C
.disable_signals();
if let Some(config) = opt.get_ssl_config()? {
http_server
@@ -66,12 +91,12 @@ async fn run_http(data: Data, opt: Opt) -> Result<(), Box<dyn std::error::Error>
.run()
.await?;
} else {
http_server.bind(opt.http_addr)?.run().await?;
http_server.bind(&opt.http_addr)?.run().await?;
}
Ok(())
}
pub fn print_launch_resume(opt: &Opt, data: &Data) {
pub fn print_launch_resume(opt: &Opt, user: &str) {
let commit_sha = option_env!("VERGEN_GIT_SHA").unwrap_or("unknown");
let commit_date = option_env!("VERGEN_GIT_COMMIT_TIMESTAMP").unwrap_or("unknown");
@@ -100,23 +125,27 @@ pub fn print_launch_resume(opt: &Opt, data: &Data) {
#[cfg(all(not(debug_assertions), feature = "analytics"))]
{
if opt.no_analytics {
eprintln!("Anonymous telemetry:\t\"Disabled\"");
} else {
if opt.analytics() {
eprintln!(
"
Thank you for using MeiliSearch!
We collect anonymized analytics to improve our product and your experience. To learn more, including how to turn off analytics, visit our dedicated documentation page: https://docs.meilisearch.com/learn/what_is_meilisearch/telemetry.html
Anonymous telemetry: \"Enabled\""
Anonymous telemetry:\t\"Enabled\""
);
} else {
eprintln!("Anonymous telemetry:\t\"Disabled\"");
}
}
if !user.is_empty() {
eprintln!("Instance UID:\t\t\"{}\"", user);
}
eprintln!();
if data.api_keys().master.is_some() {
if opt.master_key.is_some() {
eprintln!("A Master Key has been set. Requests to MeiliSearch won't be authorized unless you provide an authentication key.");
} else {
eprintln!("No master key found; The server will accept unidentified requests. \
@@ -126,6 +155,6 @@ Anonymous telemetry: \"Enabled\""
eprintln!();
eprintln!("Documentation:\t\thttps://docs.meilisearch.com");
eprintln!("Source code:\t\thttps://github.com/meilisearch/meilisearch");
eprintln!("Contact:\t\thttps://docs.meilisearch.com/resources/contact.html or bonjour@meilisearch.com");
eprintln!("Contact:\t\thttps://docs.meilisearch.com/resources/contact.html");
eprintln!();
}


@@ -1,71 +1,16 @@
use byte_unit::ByteError;
use std::fmt;
use std::fs;
use std::io::{BufReader, Read};
use std::ops::Deref;
use std::path::PathBuf;
use std::str::FromStr;
use std::sync::Arc;
use std::{error, fs};
use byte_unit::Byte;
use milli::CompressionType;
use meilisearch_lib::options::IndexerOpts;
use rustls::internal::pemfile::{certs, pkcs8_private_keys, rsa_private_keys};
use rustls::{
AllowAnyAnonymousOrAuthenticatedClient, AllowAnyAuthenticatedClient, NoClientAuth,
RootCertStore,
};
use structopt::StructOpt;
use sysinfo::{RefreshKind, System, SystemExt};
#[derive(Debug, Clone, StructOpt)]
pub struct IndexerOpts {
/// The number of documents to skip before printing
/// a log regarding the indexing progress.
#[structopt(long, default_value = "100000")] // 100k
pub log_every_n: usize,
/// Grenad max number of chunks in bytes.
#[structopt(long)]
pub max_nb_chunks: Option<usize>,
/// The maximum amount of memory the indexer will use. It defaults to 2/3
/// of the available memory. It is recommended to use something like 80%-90%
/// of the available memory, no more.
///
/// In case the engine is unable to retrieve the available memory, it will
/// try to use the memory it needs without any real limit. This can lead to
/// Out-Of-Memory issues, so it is recommended to specify the amount of memory to use.
#[structopt(long, default_value)]
pub max_memory: MaxMemory,
/// The name of the compression algorithm to use when compressing intermediate
/// Grenad chunks while indexing documents.
///
/// Choosing a fast algorithm will make the indexing faster but may consume more memory.
#[structopt(long, default_value = "snappy", possible_values = &["snappy", "zlib", "lz4", "lz4hc", "zstd"])]
pub chunk_compression_type: CompressionType,
/// The level of compression of the chosen algorithm.
#[structopt(long, requires = "chunk-compression-type")]
pub chunk_compression_level: Option<u32>,
/// Number of parallel jobs for indexing, defaults to # of CPUs.
#[structopt(long)]
pub indexing_jobs: Option<usize>,
}
impl Default for IndexerOpts {
fn default() -> Self {
Self {
log_every_n: 100_000,
max_nb_chunks: None,
max_memory: MaxMemory::default(),
chunk_compression_type: CompressionType::None,
chunk_compression_level: None,
indexing_jobs: None,
}
}
}
const POSSIBLE_ENV: [&str; 2] = ["development", "production"];
@@ -93,15 +38,15 @@ pub struct Opt {
/// Do not send analytics to Meili.
#[cfg(all(not(debug_assertions), feature = "analytics"))]
#[structopt(long, env = "MEILI_NO_ANALYTICS")]
pub no_analytics: bool,
pub no_analytics: Option<Option<bool>>,
/// The maximum size, in bytes, of the main lmdb database directory
#[structopt(long, env = "MEILI_MAX_INDEX_SIZE", default_value = "100 GiB")]
pub max_index_size: Byte,
/// The maximum size, in bytes, of the update lmdb database directory
#[structopt(long, env = "MEILI_MAX_UDB_SIZE", default_value = "100 GiB")]
pub max_udb_size: Byte,
#[structopt(long, env = "MEILI_MAX_TASK_DB_SIZE", default_value = "100 GiB")]
pub max_task_db_size: Byte,
/// The maximum size, in bytes, of accepted JSON payloads
#[structopt(long, env = "MEILI_HTTP_PAYLOAD_SIZE_LIMIT", default_value = "100 MB")]
@@ -184,7 +129,17 @@ pub struct Opt {
}
impl Opt {
pub fn get_ssl_config(&self) -> Result<Option<rustls::ServerConfig>, Box<dyn error::Error>> {
/// Whether analytics should be enabled or not.
#[cfg(all(not(debug_assertions), feature = "analytics"))]
pub fn analytics(&self) -> bool {
match self.no_analytics {
None => true,
Some(None) => false,
Some(Some(disabled)) => !disabled,
}
}
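
To make the `Option<Option<bool>>` mapping above concrete, here is a standalone restatement (editor's sketch; the CLI spellings in the comments are my reading of structopt's optional-value handling, not taken from the repository):

```rust
// Mirrors Opt::analytics() above.
fn analytics_enabled(no_analytics: Option<Option<bool>>) -> bool {
    match no_analytics {
        None => true,                      // flag absent: analytics stay enabled
        Some(None) => false,               // bare `--no-analytics`: disabled
        Some(Some(disabled)) => !disabled, // explicit `--no-analytics <bool>`
    }
}

fn main() {
    assert!(analytics_enabled(None));
    assert!(!analytics_enabled(Some(None)));
    assert!(!analytics_enabled(Some(Some(true))));
    assert!(analytics_enabled(Some(Some(false))));
}
```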
pub fn get_ssl_config(&self) -> anyhow::Result<Option<rustls::ServerConfig>> {
if let (Some(cert_path), Some(key_path)) = (&self.ssl_cert_path, &self.ssl_key_path) {
let client_auth = match &self.ssl_auth_path {
Some(auth_path) => {
@@ -210,7 +165,7 @@ impl Opt {
let ocsp = load_ocsp(&self.ssl_ocsp_path)?;
config
.set_single_cert_with_ocsp_and_sct(certs, privkey, ocsp, vec![])
.map_err(|_| "bad certificates/private key")?;
.map_err(|_| anyhow::anyhow!("bad certificates/private key"))?;
if self.ssl_resumption {
config.set_persistence(rustls::ServerSessionMemoryCache::new(256));
@@ -227,82 +182,31 @@ impl Opt {
}
}
/// A type used to detect the max memory available and use 2/3 of it.
#[derive(Debug, Clone, Copy)]
pub struct MaxMemory(Option<Byte>);
impl FromStr for MaxMemory {
type Err = ByteError;
fn from_str(s: &str) -> Result<MaxMemory, ByteError> {
Byte::from_str(s).map(Some).map(MaxMemory)
}
}
impl Default for MaxMemory {
fn default() -> MaxMemory {
MaxMemory(
total_memory_bytes()
.map(|bytes| bytes * 2 / 3)
.map(Byte::from_bytes),
)
}
}
impl fmt::Display for MaxMemory {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self.0 {
Some(memory) => write!(f, "{}", memory.get_appropriate_unit(true)),
None => f.write_str("unknown"),
}
}
}
impl Deref for MaxMemory {
type Target = Option<Byte>;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl MaxMemory {
pub fn unlimited() -> Self {
Self(None)
}
}
/// Returns the total amount of bytes available or `None` if this system isn't supported.
fn total_memory_bytes() -> Option<u64> {
if System::IS_SUPPORTED {
let memory_kind = RefreshKind::new().with_memory();
let mut system = System::new_with_specifics(memory_kind);
system.refresh_memory();
Some(system.total_memory() * 1024) // KiB into bytes
} else {
None
}
}
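
A quick worked example of the two-thirds default computed by `MaxMemory::default()` above (editor's sketch with purely illustrative numbers):

```rust
// Same computation as `total_memory_bytes().map(|bytes| bytes * 2 / 3)` above.
fn default_max_memory_bytes(total_bytes: u64) -> u64 {
    total_bytes * 2 / 3
}

fn main() {
    let total = 16u64 * 1024 * 1024 * 1024; // a 16 GiB machine
    let capped = default_max_memory_bytes(total);
    assert_eq!(capped, 11_453_246_122); // roughly 10.7 GiB
}
```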
fn load_certs(filename: PathBuf) -> Result<Vec<rustls::Certificate>, Box<dyn error::Error>> {
let certfile = fs::File::open(filename).map_err(|_| "cannot open certificate file")?;
fn load_certs(filename: PathBuf) -> anyhow::Result<Vec<rustls::Certificate>> {
let certfile =
fs::File::open(filename).map_err(|_| anyhow::anyhow!("cannot open certificate file"))?;
let mut reader = BufReader::new(certfile);
Ok(certs(&mut reader).map_err(|_| "cannot read certificate file")?)
certs(&mut reader).map_err(|_| anyhow::anyhow!("cannot read certificate file"))
}
fn load_private_key(filename: PathBuf) -> Result<rustls::PrivateKey, Box<dyn error::Error>> {
fn load_private_key(filename: PathBuf) -> anyhow::Result<rustls::PrivateKey> {
let rsa_keys = {
let keyfile =
fs::File::open(filename.clone()).map_err(|_| "cannot open private key file")?;
let keyfile = fs::File::open(filename.clone())
.map_err(|_| anyhow::anyhow!("cannot open private key file"))?;
let mut reader = BufReader::new(keyfile);
rsa_private_keys(&mut reader).map_err(|_| "file contains invalid rsa private key")?
rsa_private_keys(&mut reader)
.map_err(|_| anyhow::anyhow!("file contains invalid rsa private key"))?
};
let pkcs8_keys = {
let keyfile = fs::File::open(filename).map_err(|_| "cannot open private key file")?;
let keyfile = fs::File::open(filename)
.map_err(|_| anyhow::anyhow!("cannot open private key file"))?;
let mut reader = BufReader::new(keyfile);
pkcs8_private_keys(&mut reader)
.map_err(|_| "file contains invalid pkcs8 private key (encrypted keys not supported)")?
pkcs8_private_keys(&mut reader).map_err(|_| {
anyhow::anyhow!(
"file contains invalid pkcs8 private key (encrypted keys not supported)"
)
})?
};
// prefer to load pkcs8 keys
@@ -314,14 +218,14 @@ fn load_private_key(filename: PathBuf) -> Result<rustls::PrivateKey, Box<dyn err
}
}
fn load_ocsp(filename: &Option<PathBuf>) -> Result<Vec<u8>, Box<dyn error::Error>> {
fn load_ocsp(filename: &Option<PathBuf>) -> anyhow::Result<Vec<u8>> {
let mut ret = Vec::new();
if let Some(ref name) = filename {
fs::File::open(name)
.map_err(|_| "cannot open ocsp file")?
.map_err(|_| anyhow::anyhow!("cannot open ocsp file"))?
.read_to_end(&mut ret)
.map_err(|_| "cannot read oscp file")?;
.map_err(|_| anyhow::anyhow!("cannot read oscp file"))?;
}
Ok(ret)


@@ -0,0 +1,134 @@
use std::str;
use actix_web::{web, HttpRequest, HttpResponse};
use chrono::SecondsFormat;
use meilisearch_auth::{generate_key, Action, AuthController, Key};
use serde::{Deserialize, Serialize};
use serde_json::Value;
use crate::extractors::authentication::{policies::*, GuardedData};
use meilisearch_error::ResponseError;
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(
web::resource("")
.route(web::post().to(create_api_key))
.route(web::get().to(list_api_keys)),
)
.service(
web::resource("/{api_key}")
.route(web::get().to(get_api_key))
.route(web::patch().to(patch_api_key))
.route(web::delete().to(delete_api_key)),
);
}
pub async fn create_api_key(
auth_controller: GuardedData<MasterPolicy, AuthController>,
body: web::Json<Value>,
_req: HttpRequest,
) -> Result<HttpResponse, ResponseError> {
let key = auth_controller.create_key(body.into_inner()).await?;
let res = KeyView::from_key(key, auth_controller.get_master_key());
Ok(HttpResponse::Created().json(res))
}
pub async fn list_api_keys(
auth_controller: GuardedData<MasterPolicy, AuthController>,
_req: HttpRequest,
) -> Result<HttpResponse, ResponseError> {
let keys = auth_controller.list_keys().await?;
let res: Vec<_> = keys
.into_iter()
.map(|k| KeyView::from_key(k, auth_controller.get_master_key()))
.collect();
Ok(HttpResponse::Ok().json(KeyListView::from(res)))
}
pub async fn get_api_key(
auth_controller: GuardedData<MasterPolicy, AuthController>,
path: web::Path<AuthParam>,
) -> Result<HttpResponse, ResponseError> {
// keep the first 8 characters, which are the ID of the API key.
let key = auth_controller.get_key(&path.api_key).await?;
let res = KeyView::from_key(key, auth_controller.get_master_key());
Ok(HttpResponse::Ok().json(res))
}
pub async fn patch_api_key(
auth_controller: GuardedData<MasterPolicy, AuthController>,
body: web::Json<Value>,
path: web::Path<AuthParam>,
) -> Result<HttpResponse, ResponseError> {
let key = auth_controller
// keep the first 8 characters, which are the ID of the API key.
.update_key(&path.api_key, body.into_inner())
.await?;
let res = KeyView::from_key(key, auth_controller.get_master_key());
Ok(HttpResponse::Ok().json(res))
}
pub async fn delete_api_key(
auth_controller: GuardedData<MasterPolicy, AuthController>,
path: web::Path<AuthParam>,
) -> Result<HttpResponse, ResponseError> {
// keep the first 8 characters, which are the ID of the API key.
auth_controller.delete_key(&path.api_key).await?;
Ok(HttpResponse::NoContent().finish())
}
#[derive(Deserialize)]
pub struct AuthParam {
api_key: String,
}
#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
struct KeyView {
description: Option<String>,
key: String,
actions: Vec<Action>,
indexes: Vec<String>,
expires_at: Option<String>,
created_at: String,
updated_at: String,
}
impl KeyView {
fn from_key(key: Key, master_key: Option<&String>) -> Self {
let key_id = str::from_utf8(&key.id).unwrap();
let generated_key = match master_key {
Some(master_key) => generate_key(master_key.as_bytes(), key_id),
None => generate_key(&[], key_id),
};
KeyView {
description: key.description,
key: generated_key,
actions: key.actions,
indexes: key.indexes,
expires_at: key
.expires_at
.map(|dt| dt.to_rfc3339_opts(SecondsFormat::Secs, true)),
created_at: key.created_at.to_rfc3339_opts(SecondsFormat::Secs, true),
updated_at: key.updated_at.to_rfc3339_opts(SecondsFormat::Secs, true),
}
}
}
#[derive(Debug, Serialize)]
struct KeyListView {
results: Vec<KeyView>,
}
impl From<Vec<KeyView>> for KeyListView {
fn from(results: Vec<KeyView>) -> Self {
Self { results }
}
}


@@ -1,18 +1,26 @@
use actix_web::{web, HttpResponse};
use actix_web::{web, HttpRequest, HttpResponse};
use log::debug;
use meilisearch_error::ResponseError;
use meilisearch_lib::MeiliSearch;
use serde::{Deserialize, Serialize};
use serde_json::json;
use crate::error::ResponseError;
use crate::analytics::Analytics;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::Data;
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(web::resource("").route(web::post().to(create_dump)))
.service(web::resource("/{dump_uid}/status").route(web::get().to(get_dump_status)));
}
pub async fn create_dump(data: GuardedData<Private, Data>) -> Result<HttpResponse, ResponseError> {
let res = data.create_dump().await?;
pub async fn create_dump(
meilisearch: GuardedData<ActionPolicy<{ actions::DUMPS_CREATE }>, MeiliSearch>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
analytics.publish("Dump Created".to_string(), json!({}), Some(&req));
let res = meilisearch.create_dump().await?;
debug!("returns: {:?}", res);
Ok(HttpResponse::Accepted().json(res))
@@ -30,10 +38,10 @@ struct DumpParam {
}
async fn get_dump_status(
data: GuardedData<Private, Data>,
meilisearch: GuardedData<ActionPolicy<{ actions::DUMPS_GET }>, MeiliSearch>,
path: web::Path<DumpParam>,
) -> Result<HttpResponse, ResponseError> {
let res = data.dump_status(path.dump_uid.clone()).await?;
let res = meilisearch.dump_info(path.dump_uid.clone()).await?;
debug!("returns: {:?}", res);
Ok(HttpResponse::Ok().json(res))


@@ -1,50 +1,64 @@
use actix_web::{web, HttpResponse};
use actix_web::error::PayloadError;
use actix_web::http::header::CONTENT_TYPE;
use actix_web::web::Bytes;
use actix_web::HttpMessage;
use actix_web::{web, HttpRequest, HttpResponse};
use bstr::ByteSlice;
use futures::{Stream, StreamExt};
use log::debug;
use milli::update::{IndexDocumentsMethod, UpdateFormat};
use meilisearch_error::ResponseError;
use meilisearch_lib::index_controller::{DocumentAdditionFormat, Update};
use meilisearch_lib::milli::update::IndexDocumentsMethod;
use meilisearch_lib::MeiliSearch;
use mime::Mime;
use once_cell::sync::Lazy;
use serde::Deserialize;
use serde_json::Value;
use tokio::sync::mpsc;
use crate::error::ResponseError;
use crate::analytics::Analytics;
use crate::error::MeilisearchHttpError;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::extractors::payload::Payload;
use crate::routes::IndexParam;
use crate::Data;
use crate::task::SummarizedTaskView;
const DEFAULT_RETRIEVE_DOCUMENTS_OFFSET: usize = 0;
const DEFAULT_RETRIEVE_DOCUMENTS_LIMIT: usize = 20;
/*
macro_rules! guard_content_type {
($fn_name:ident, $guard_value:literal) => {
fn $fn_name(head: &actix_web::dev::RequestHead) -> bool {
if let Some(content_type) = head.headers.get("Content-Type") {
content_type
.to_str()
.map(|v| v.contains($guard_value))
.unwrap_or(false)
} else {
false
}
static ACCEPTED_CONTENT_TYPE: Lazy<Vec<String>> = Lazy::new(|| {
vec![
"application/json".to_string(),
"application/x-ndjson".to_string(),
"text/csv".to_string(),
]
});
/// This is required because Payload is neither Sync nor Send
fn payload_to_stream(mut payload: Payload) -> impl Stream<Item = Result<Bytes, PayloadError>> {
let (snd, recv) = mpsc::channel(1);
tokio::task::spawn_local(async move {
while let Some(data) = payload.next().await {
let _ = snd.send(data).await;
}
};
});
tokio_stream::wrappers::ReceiverStream::new(recv)
}
guard_content_type!(guard_json, "application/json");
*/
fn guard_json(head: &actix_web::dev::RequestHead) -> bool {
if let Some(_content_type) = head.headers.get("Content-Type") {
// CURRENTLY AND FOR THIS RELEASE ONLY WE DECIDED TO INTERPRET ALL CONTENT-TYPES AS JSON
true
/*
content_type
.to_str()
.map(|v| v.contains("application/json"))
.unwrap_or(false)
*/
} else {
// if no content-type is specified we still accept the data as json!
true
/// Extracts the mime type from the content type and returns
/// a meilisearch error if anything bad happens.
fn extract_mime_type(req: &HttpRequest) -> Result<Option<Mime>, MeilisearchHttpError> {
match req.mime_type() {
Ok(Some(mime)) => Ok(Some(mime)),
Ok(None) => Ok(None),
Err(_) => match req.headers().get(CONTENT_TYPE) {
Some(content_type) => Err(MeilisearchHttpError::InvalidContentType(
content_type.as_bytes().as_bstr().to_string(),
ACCEPTED_CONTENT_TYPE.clone(),
)),
None => Err(MeilisearchHttpError::MissingContentType(
ACCEPTED_CONTENT_TYPE.clone(),
)),
},
}
}
@@ -58,8 +72,8 @@ pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(
web::resource("")
.route(web::get().to(get_all_documents))
.route(web::post().guard(guard_json).to(add_documents))
.route(web::put().guard(guard_json).to(update_documents))
.route(web::post().to(add_documents))
.route(web::put().to(update_documents))
.route(web::delete().to(clear_all_documents)),
)
// this route needs to be before the /documents/{document_id} to match properly
@@ -72,27 +86,30 @@ pub fn configure(cfg: &mut web::ServiceConfig) {
}
pub async fn get_document(
data: GuardedData<Public, Data>,
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_GET }>, MeiliSearch>,
path: web::Path<DocumentParam>,
) -> Result<HttpResponse, ResponseError> {
let index = path.index_uid.clone();
let id = path.document_id.clone();
let document = data
.retrieve_document(index, id, None as Option<Vec<String>>)
let document = meilisearch
.document(index, id, None as Option<Vec<String>>)
.await?;
debug!("returns: {:?}", document);
Ok(HttpResponse::Ok().json(document))
}
pub async fn delete_document(
data: GuardedData<Private, Data>,
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_DELETE }>, MeiliSearch>,
path: web::Path<DocumentParam>,
) -> Result<HttpResponse, ResponseError> {
let update_status = data
.delete_documents(path.index_uid.clone(), vec![path.document_id.clone()])
.await?;
debug!("returns: {:?}", update_status);
Ok(HttpResponse::Accepted().json(serde_json::json!({ "updateId": update_status.id() })))
let DocumentParam {
document_id,
index_uid,
} = path.into_inner();
let update = Update::DeleteDocuments(vec![document_id]);
let task: SummarizedTaskView = meilisearch.register_update(index_uid, update).await?.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}
#[derive(Deserialize, Debug)]
@@ -104,8 +121,8 @@ pub struct BrowseQuery {
}
pub async fn get_all_documents(
data: GuardedData<Public, Data>,
path: web::Path<IndexParam>,
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_GET }>, MeiliSearch>,
path: web::Path<String>,
params: web::Query<BrowseQuery>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", params);
@@ -120,9 +137,9 @@ pub async fn get_all_documents(
Some(names)
});
let documents = data
.retrieve_documents(
path.index_uid.clone(),
let documents = meilisearch
.documents(
path.into_inner(),
params.offset.unwrap_or(DEFAULT_RETRIEVE_DOCUMENTS_OFFSET),
params.limit.unwrap_or(DEFAULT_RETRIEVE_DOCUMENTS_LIMIT),
attributes_to_retrieve,
@@ -135,58 +152,121 @@ pub async fn get_all_documents(
#[derive(Deserialize, Debug)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
pub struct UpdateDocumentsQuery {
primary_key: Option<String>,
pub primary_key: Option<String>,
}
/// Route used when the payload type is "application/json"
/// Used to add or replace documents
pub async fn add_documents(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_ADD }>, MeiliSearch>,
path: web::Path<String>,
params: web::Query<UpdateDocumentsQuery>,
body: Payload,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", params);
let update_status = data
.add_documents(
path.into_inner().index_uid,
IndexDocumentsMethod::ReplaceDocuments,
UpdateFormat::Json,
body,
params.primary_key.clone(),
)
.await?;
let params = params.into_inner();
let index_uid = path.into_inner();
debug!("returns: {:?}", update_status);
Ok(HttpResponse::Accepted().json(serde_json::json!({ "updateId": update_status.id() })))
analytics.add_documents(
&params,
meilisearch.get_index(index_uid.clone()).await.is_err(),
&req,
);
let allow_index_creation = meilisearch.filters().allow_index_creation;
let task = document_addition(
extract_mime_type(&req)?,
meilisearch,
index_uid,
params.primary_key,
body,
IndexDocumentsMethod::ReplaceDocuments,
allow_index_creation,
)
.await?;
Ok(HttpResponse::Accepted().json(task))
}
/// Route used when the payload type is "application/json"
/// Used to add or update documents
pub async fn update_documents(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_ADD }>, MeiliSearch>,
path: web::Path<String>,
params: web::Query<UpdateDocumentsQuery>,
body: Payload,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", params);
let update = data
.add_documents(
path.into_inner().index_uid,
IndexDocumentsMethod::UpdateDocuments,
UpdateFormat::Json,
body,
params.primary_key.clone(),
)
.await?;
let index_uid = path.into_inner();
debug!("returns: {:?}", update);
Ok(HttpResponse::Accepted().json(serde_json::json!({ "updateId": update.id() })))
analytics.update_documents(
&params,
meilisearch.get_index(index_uid.clone()).await.is_err(),
&req,
);
let allow_index_creation = meilisearch.filters().allow_index_creation;
let task = document_addition(
extract_mime_type(&req)?,
meilisearch,
index_uid,
params.into_inner().primary_key,
body,
IndexDocumentsMethod::UpdateDocuments,
allow_index_creation,
)
.await?;
Ok(HttpResponse::Accepted().json(task))
}
async fn document_addition(
mime_type: Option<Mime>,
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_ADD }>, MeiliSearch>,
index_uid: String,
primary_key: Option<String>,
body: Payload,
method: IndexDocumentsMethod,
allow_index_creation: bool,
) -> Result<SummarizedTaskView, ResponseError> {
let format = match mime_type
.as_ref()
.map(|m| (m.type_().as_str(), m.subtype().as_str()))
{
Some(("application", "json")) => DocumentAdditionFormat::Json,
Some(("application", "x-ndjson")) => DocumentAdditionFormat::Ndjson,
Some(("text", "csv")) => DocumentAdditionFormat::Csv,
Some((type_, subtype)) => {
return Err(MeilisearchHttpError::InvalidContentType(
format!("{}/{}", type_, subtype),
ACCEPTED_CONTENT_TYPE.clone(),
)
.into())
}
None => {
return Err(
MeilisearchHttpError::MissingContentType(ACCEPTED_CONTENT_TYPE.clone()).into(),
)
}
};
let update = Update::DocumentAddition {
payload: Box::new(payload_to_stream(body)),
primary_key,
method,
format,
allow_index_creation,
};
let task = meilisearch.register_update(index_uid, update).await?.into();
debug!("returns: {:?}", task);
Ok(task)
}
pub async fn delete_documents(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_DELETE }>, MeiliSearch>,
path: web::Path<String>,
body: web::Json<Vec<Value>>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", body);
@@ -199,16 +279,26 @@ pub async fn delete_documents(
})
.collect();
let update_status = data.delete_documents(path.index_uid.clone(), ids).await?;
debug!("returns: {:?}", update_status);
Ok(HttpResponse::Accepted().json(serde_json::json!({ "updateId": update_status.id() })))
let update = Update::DeleteDocuments(ids);
let task: SummarizedTaskView = meilisearch
.register_update(path.into_inner(), update)
.await?
.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}
pub async fn clear_all_documents(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_DELETE }>, MeiliSearch>,
path: web::Path<String>,
) -> Result<HttpResponse, ResponseError> {
let update_status = data.clear_documents(path.index_uid.clone()).await?;
debug!("returns: {:?}", update_status);
Ok(HttpResponse::Accepted().json(serde_json::json!({ "updateId": update_status.id() })))
let update = Update::ClearDocuments;
let task: SummarizedTaskView = meilisearch
.register_update(path.into_inner(), update)
.await?
.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}


@@ -1,17 +1,20 @@
use actix_web::{web, HttpResponse};
use actix_web::{web, HttpRequest, HttpResponse};
use chrono::{DateTime, Utc};
use log::debug;
use meilisearch_error::ResponseError;
use meilisearch_lib::index_controller::Update;
use meilisearch_lib::MeiliSearch;
use serde::{Deserialize, Serialize};
use serde_json::json;
use crate::error::ResponseError;
use crate::analytics::Analytics;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::routes::IndexParam;
use crate::Data;
use crate::task::SummarizedTaskView;
pub mod documents;
pub mod search;
pub mod settings;
pub mod updates;
pub mod tasks;
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(
@@ -30,13 +33,23 @@ pub fn configure(cfg: &mut web::ServiceConfig) {
.service(web::resource("/stats").route(web::get().to(get_index_stats)))
.service(web::scope("/documents").configure(documents::configure))
.service(web::scope("/search").configure(search::configure))
.service(web::scope("/updates").configure(updates::configure))
.service(web::scope("/tasks").configure(tasks::configure))
.service(web::scope("/settings").configure(settings::configure)),
);
}
pub async fn list_indexes(data: GuardedData<Private, Data>) -> Result<HttpResponse, ResponseError> {
let indexes = data.list_indexes().await?;
pub async fn list_indexes(
data: GuardedData<ActionPolicy<{ actions::INDEXES_GET }>, MeiliSearch>,
) -> Result<HttpResponse, ResponseError> {
let filters = data.filters();
let mut indexes = data.list_indexes().await?;
if let Some(indexes_filter) = filters.indexes.as_ref() {
indexes = indexes
.into_iter()
.filter(|i| indexes_filter.contains(&i.uid))
.collect();
}
debug!("returns: {:?}", indexes);
Ok(HttpResponse::Ok().json(indexes))
}
@@ -49,16 +62,30 @@ pub struct IndexCreateRequest {
}
pub async fn create_index(
data: GuardedData<Private, Data>,
meilisearch: GuardedData<ActionPolicy<{ actions::INDEXES_CREATE }>, MeiliSearch>,
body: web::Json<IndexCreateRequest>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
let body = body.into_inner();
let meta = data.create_index(body.uid, body.primary_key).await?;
Ok(HttpResponse::Created().json(meta))
let IndexCreateRequest {
primary_key, uid, ..
} = body.into_inner();
analytics.publish(
"Index Created".to_string(),
json!({ "primary_key": primary_key }),
Some(&req),
);
let update = Update::CreateIndex { primary_key };
let task: SummarizedTaskView = meilisearch.register_update(uid, update).await?.into();
Ok(HttpResponse::Accepted().json(task))
}
#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
#[allow(dead_code)]
pub struct UpdateIndexRequest {
uid: Option<String>,
primary_key: Option<String>,
@@ -75,41 +102,58 @@ pub struct UpdateIndexResponse {
}
pub async fn get_index(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
meilisearch: GuardedData<ActionPolicy<{ actions::INDEXES_GET }>, MeiliSearch>,
path: web::Path<String>,
) -> Result<HttpResponse, ResponseError> {
let meta = data.index(path.index_uid.clone()).await?;
let meta = meilisearch.get_index(path.into_inner()).await?;
debug!("returns: {:?}", meta);
Ok(HttpResponse::Ok().json(meta))
}
pub async fn update_index(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
meilisearch: GuardedData<ActionPolicy<{ actions::INDEXES_UPDATE }>, MeiliSearch>,
path: web::Path<String>,
body: web::Json<UpdateIndexRequest>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", body);
let body = body.into_inner();
let meta = data
.update_index(path.into_inner().index_uid, body.primary_key, body.uid)
.await?;
debug!("returns: {:?}", meta);
Ok(HttpResponse::Ok().json(meta))
analytics.publish(
"Index Updated".to_string(),
json!({ "primary_key": body.primary_key}),
Some(&req),
);
let update = Update::UpdateIndex {
primary_key: body.primary_key,
};
let task: SummarizedTaskView = meilisearch
.register_update(path.into_inner(), update)
.await?
.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}
pub async fn delete_index(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
meilisearch: GuardedData<ActionPolicy<{ actions::INDEXES_DELETE }>, MeiliSearch>,
path: web::Path<String>,
) -> Result<HttpResponse, ResponseError> {
data.delete_index(path.index_uid.clone()).await?;
Ok(HttpResponse::NoContent().finish())
let uid = path.into_inner();
let update = Update::DeleteIndex;
let task: SummarizedTaskView = meilisearch.register_update(uid, update).await?.into();
Ok(HttpResponse::Accepted().json(task))
}
pub async fn get_index_stats(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
meilisearch: GuardedData<ActionPolicy<{ actions::STATS_GET }>, MeiliSearch>,
path: web::Path<String>,
) -> Result<HttpResponse, ResponseError> {
let response = data.get_index_stats(path.index_uid.clone()).await?;
let response = meilisearch.get_index_stats(path.into_inner()).await?;
debug!("returns: {:?}", response);
Ok(HttpResponse::Ok().json(response))


@@ -1,13 +1,13 @@
use actix_web::{web, HttpResponse};
use actix_web::{web, HttpRequest, HttpResponse};
use log::debug;
use meilisearch_error::ResponseError;
use meilisearch_lib::index::{default_crop_length, SearchQuery, DEFAULT_SEARCH_LIMIT};
use meilisearch_lib::MeiliSearch;
use serde::Deserialize;
use serde_json::Value;
use crate::error::ResponseError;
use crate::analytics::{Analytics, SearchAggregator};
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::index::{default_crop_length, SearchQuery, DEFAULT_SEARCH_LIMIT};
use crate::routes::IndexParam;
use crate::Data;
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(
@@ -61,9 +61,7 @@ impl From<SearchQueryGet> for SearchQuery {
None => None,
};
let sort = other
.sort
.map(|attrs| attrs.split(',').map(String::from).collect());
let sort = other.sort.map(|attr| fix_sort_query_parameters(&attr));
Self {
q: other.q,
@@ -81,14 +79,51 @@ impl From<SearchQueryGet> for SearchQuery {
}
}
// TODO: TAMO: split on :asc, and :desc, instead of doing some weird things
/// Transforms the sort query parameter into the format expected by the POST search route.
fn fix_sort_query_parameters(sort_query: &str) -> Vec<String> {
let mut sort_parameters = Vec::new();
let mut merge = false;
for current_sort in sort_query.trim_matches('"').split(',').map(|s| s.trim()) {
if current_sort.starts_with("_geoPoint(") {
sort_parameters.push(current_sort.to_string());
merge = true;
} else if merge && !sort_parameters.is_empty() {
sort_parameters
.last_mut()
.unwrap()
.push_str(&format!(",{}", current_sort));
if current_sort.ends_with("):desc") || current_sort.ends_with("):asc") {
merge = false;
}
} else {
sort_parameters.push(current_sort.to_string());
merge = false;
}
}
sort_parameters
}
pub async fn search_with_url_query(
data: GuardedData<Public, Data>,
path: web::Path<IndexParam>,
meilisearch: GuardedData<ActionPolicy<{ actions::SEARCH }>, MeiliSearch>,
path: web::Path<String>,
params: web::Query<SearchQueryGet>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", params);
let query = params.into_inner().into();
let search_result = data.search(path.into_inner().index_uid, query).await?;
let query: SearchQuery = params.into_inner().into();
let mut aggregate = SearchAggregator::from_query(&query, &req);
let search_result = meilisearch.search(path.into_inner(), query).await;
if let Ok(ref search_result) = search_result {
aggregate.succeed(search_result);
}
analytics.get_search(aggregate);
let search_result = search_result?;
// Tests that the nb_hits is always set to false
#[cfg(test)]
@@ -99,14 +134,24 @@ pub async fn search_with_url_query(
}
pub async fn search_with_post(
data: GuardedData<Public, Data>,
path: web::Path<IndexParam>,
meilisearch: GuardedData<ActionPolicy<{ actions::SEARCH }>, MeiliSearch>,
path: web::Path<String>,
params: web::Json<SearchQuery>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!("search called with params: {:?}", params);
let search_result = data
.search(path.into_inner().index_uid, params.into_inner())
.await?;
let query = params.into_inner();
debug!("search called with params: {:?}", query);
let mut aggregate = SearchAggregator::from_query(&query, &req);
let search_result = meilisearch.search(path.into_inner(), query).await;
if let Ok(ref search_result) = search_result {
aggregate.succeed(search_result);
}
analytics.post_search(aggregate);
let search_result = search_result?;
// Tests that the nb_hits is always set to false
#[cfg(test)]
@@ -115,3 +160,42 @@ pub async fn search_with_post(
debug!("returns: {:?}", search_result);
Ok(HttpResponse::Ok().json(search_result))
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn test_fix_sort_query_parameters() {
let sort = fix_sort_query_parameters("_geoPoint(12, 13):asc");
assert_eq!(sort, vec!["_geoPoint(12,13):asc".to_string()]);
let sort = fix_sort_query_parameters("doggo:asc,_geoPoint(12.45,13.56):desc");
assert_eq!(
sort,
vec![
"doggo:asc".to_string(),
"_geoPoint(12.45,13.56):desc".to_string(),
]
);
let sort = fix_sort_query_parameters(
"doggo:asc , _geoPoint(12.45, 13.56, 2590352):desc , catto:desc",
);
assert_eq!(
sort,
vec![
"doggo:asc".to_string(),
"_geoPoint(12.45,13.56,2590352):desc".to_string(),
"catto:desc".to_string(),
]
);
let sort = fix_sort_query_parameters("doggo:asc , _geoPoint(1, 2), catto:desc");
// This is ugly but eh, I don't want to write a full parser just for this unused route
assert_eq!(
sort,
vec![
"doggo:asc".to_string(),
"_geoPoint(1,2),catto:desc".to_string(),
]
);
}
}


@@ -1,65 +1,98 @@
use actix_web::{web, HttpResponse};
use log::debug;
use actix_web::{web, HttpRequest, HttpResponse};
use meilisearch_error::ResponseError;
use meilisearch_lib::index::{Settings, Unchecked};
use meilisearch_lib::index_controller::Update;
use meilisearch_lib::MeiliSearch;
use serde_json::json;
use crate::analytics::Analytics;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::index::Settings;
use crate::Data;
use crate::{error::ResponseError, index::Unchecked};
use crate::task::SummarizedTaskView;
#[macro_export]
macro_rules! make_setting_route {
($route:literal, $type:ty, $attr:ident, $camelcase_attr:literal) => {
($route:literal, $type:ty, $attr:ident, $camelcase_attr:literal, $analytics_var:ident, $analytics:expr) => {
pub mod $attr {
use actix_web::{web, HttpRequest, HttpResponse, Resource};
use log::debug;
use actix_web::{web, HttpResponse, Resource};
use milli::update::Setting;
use meilisearch_lib::milli::update::Setting;
use meilisearch_lib::{index::Settings, index_controller::Update, MeiliSearch};
use crate::data;
use crate::error::ResponseError;
use crate::index::Settings;
use crate::extractors::authentication::{GuardedData, policies::*};
use crate::analytics::Analytics;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::task::SummarizedTaskView;
use meilisearch_error::ResponseError;
pub async fn delete(
data: GuardedData<Private, data::Data>,
meilisearch: GuardedData<ActionPolicy<{ actions::SETTINGS_UPDATE }>, MeiliSearch>,
index_uid: web::Path<String>,
) -> Result<HttpResponse, ResponseError> {
use crate::index::Settings;
let settings = Settings {
$attr: Setting::Reset,
..Default::default()
};
let update_status = data.update_settings(index_uid.into_inner(), settings, false).await?;
debug!("returns: {:?}", update_status);
Ok(HttpResponse::Accepted().json(serde_json::json!({ "updateId": update_status.id() })))
let allow_index_creation = meilisearch.filters().allow_index_creation;
let update = Update::Settings {
settings,
is_deletion: true,
allow_index_creation,
};
let task: SummarizedTaskView = meilisearch
.register_update(index_uid.into_inner(), update)
.await?
.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}
pub async fn update(
data: GuardedData<Private, data::Data>,
meilisearch: GuardedData<ActionPolicy<{ actions::SETTINGS_UPDATE }>, MeiliSearch>,
index_uid: actix_web::web::Path<String>,
body: actix_web::web::Json<Option<$type>>,
req: HttpRequest,
$analytics_var: web::Data<dyn Analytics>,
) -> std::result::Result<HttpResponse, ResponseError> {
let body = body.into_inner();
$analytics(&body, &req);
let settings = Settings {
$attr: match body.into_inner() {
$attr: match body {
Some(inner_body) => Setting::Set(inner_body),
None => Setting::Reset
None => Setting::Reset,
},
..Default::default()
};
let update_status = data.update_settings(index_uid.into_inner(), settings, true).await?;
debug!("returns: {:?}", update_status);
Ok(HttpResponse::Accepted().json(serde_json::json!({ "updateId": update_status.id() })))
let allow_index_creation = meilisearch.filters().allow_index_creation;
let update = Update::Settings {
settings,
is_deletion: false,
allow_index_creation,
};
let task: SummarizedTaskView = meilisearch
.register_update(index_uid.into_inner(), update)
.await?
.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}
pub async fn get(
data: GuardedData<Private, data::Data>,
meilisearch: GuardedData<ActionPolicy<{ actions::SETTINGS_GET }>, MeiliSearch>,
index_uid: actix_web::web::Path<String>,
) -> std::result::Result<HttpResponse, ResponseError> {
let settings = data.settings(index_uid.into_inner()).await?;
let settings = meilisearch.settings(index_uid.into_inner()).await?;
debug!("returns: {:?}", settings);
let mut json = serde_json::json!(&settings);
let val = json[$camelcase_attr].take();
Ok(HttpResponse::Ok().json(val))
}
@@ -71,20 +104,53 @@ macro_rules! make_setting_route {
}
}
};
($route:literal, $type:ty, $attr:ident, $camelcase_attr:literal) => {
make_setting_route!($route, $type, $attr, $camelcase_attr, _analytics, |_, _| {});
};
}
make_setting_route!(
"/filterable-attributes",
std::collections::BTreeSet<String>,
filterable_attributes,
"filterableAttributes"
"filterableAttributes",
analytics,
|setting: &Option<std::collections::BTreeSet<String>>, req: &HttpRequest| {
use serde_json::json;
analytics.publish(
"FilterableAttributes Updated".to_string(),
json!({
"filterable_attributes": {
"total": setting.as_ref().map(|filter| filter.len()).unwrap_or(0),
"has_geo": setting.as_ref().map(|filter| filter.contains("_geo")).unwrap_or(false),
}
}),
Some(req),
);
}
);
make_setting_route!(
"/sortable-attributes",
std::collections::BTreeSet<String>,
sortable_attributes,
"sortableAttributes"
"sortableAttributes",
analytics,
|setting: &Option<std::collections::BTreeSet<String>>, req: &HttpRequest| {
use serde_json::json;
analytics.publish(
"SortableAttributes Updated".to_string(),
json!({
"sortable_attributes": {
"total": setting.as_ref().map(|sort| sort.len()).unwrap_or(0),
"has_geo": setting.as_ref().map(|sort| sort.contains("_geo")).unwrap_or(false),
},
}),
Some(req),
);
}
);
make_setting_route!(
@@ -98,7 +164,21 @@ make_setting_route!(
"/searchable-attributes",
Vec<String>,
searchable_attributes,
"searchableAttributes"
"searchableAttributes",
analytics,
|setting: &Option<Vec<String>>, req: &HttpRequest| {
use serde_json::json;
analytics.publish(
"SearchableAttributes Updated".to_string(),
json!({
"searchable_attributes": {
"total": setting.as_ref().map(|searchable| searchable.len()).unwrap_or(0),
},
}),
Some(req),
);
}
);
make_setting_route!(
@@ -122,7 +202,26 @@ make_setting_route!(
"distinctAttribute"
);
make_setting_route!("/ranking-rules", Vec<String>, ranking_rules, "rankingRules");
make_setting_route!(
"/ranking-rules",
Vec<String>,
ranking_rules,
"rankingRules",
analytics,
|setting: &Option<Vec<String>>, req: &HttpRequest| {
use serde_json::json;
analytics.publish(
"RankingRules Updated".to_string(),
json!({
"ranking_rules": {
"sort_position": setting.as_ref().map(|sort| sort.iter().position(|s| s == "sort")),
}
}),
Some(req),
);
}
);
macro_rules! generate_configure {
($($mod:ident),*) => {
@@ -149,21 +248,52 @@ generate_configure!(
);
pub async fn update_all(
data: GuardedData<Private, Data>,
meilisearch: GuardedData<ActionPolicy<{ actions::SETTINGS_UPDATE }>, MeiliSearch>,
index_uid: web::Path<String>,
body: web::Json<Settings<Unchecked>>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
let settings = body.into_inner().check();
let update_result = data
.update_settings(index_uid.into_inner(), settings, true)
.await?;
let json = serde_json::json!({ "updateId": update_result.id() });
debug!("returns: {:?}", json);
Ok(HttpResponse::Accepted().json(json))
let settings = body.into_inner();
analytics.publish(
"Settings Updated".to_string(),
json!({
"ranking_rules": {
"sort_position": settings.ranking_rules.as_ref().set().map(|sort| sort.iter().position(|s| s == "sort")),
},
"searchable_attributes": {
"total": settings.searchable_attributes.as_ref().set().map(|searchable| searchable.len()).unwrap_or(0),
},
"sortable_attributes": {
"total": settings.sortable_attributes.as_ref().set().map(|sort| sort.len()).unwrap_or(0),
"has_geo": settings.sortable_attributes.as_ref().set().map(|sort| sort.iter().any(|s| s == "_geo")).unwrap_or(false),
},
"filterable_attributes": {
"total": settings.filterable_attributes.as_ref().set().map(|filter| filter.len()).unwrap_or(0),
"has_geo": settings.filterable_attributes.as_ref().set().map(|filter| filter.iter().any(|s| s == "_geo")).unwrap_or(false),
},
}),
Some(&req),
);
let allow_index_creation = meilisearch.filters().allow_index_creation;
let update = Update::Settings {
settings,
is_deletion: false,
allow_index_creation,
};
let task: SummarizedTaskView = meilisearch
.register_update(index_uid.into_inner(), update)
.await?
.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}
pub async fn get_all(
data: GuardedData<Private, Data>,
data: GuardedData<ActionPolicy<{ actions::SETTINGS_GET }>, MeiliSearch>,
index_uid: web::Path<String>,
) -> Result<HttpResponse, ResponseError> {
let settings = data.settings(index_uid.into_inner()).await?;
@@ -172,14 +302,22 @@ pub async fn get_all(
}
pub async fn delete_all(
data: GuardedData<Private, Data>,
data: GuardedData<ActionPolicy<{ actions::SETTINGS_UPDATE }>, MeiliSearch>,
index_uid: web::Path<String>,
) -> Result<HttpResponse, ResponseError> {
let settings = Settings::cleared();
let update_result = data
.update_settings(index_uid.into_inner(), settings, false)
.await?;
let json = serde_json::json!({ "updateId": update_result.id() });
debug!("returns: {:?}", json);
Ok(HttpResponse::Accepted().json(json))
let settings = Settings::cleared().into_unchecked();
let allow_index_creation = data.filters().allow_index_creation;
let update = Update::Settings {
settings,
is_deletion: true,
allow_index_creation,
};
let task: SummarizedTaskView = data
.register_update(index_uid.into_inner(), update)
.await?
.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}


@@ -0,0 +1,76 @@
use actix_web::{web, HttpRequest, HttpResponse};
use chrono::{DateTime, Utc};
use log::debug;
use meilisearch_error::ResponseError;
use meilisearch_lib::MeiliSearch;
use serde::{Deserialize, Serialize};
use serde_json::json;
use crate::analytics::Analytics;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::task::{TaskListView, TaskView};
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(web::resource("").route(web::get().to(get_all_tasks_status)))
.service(web::resource("{task_id}").route(web::get().to(get_task_status)));
}
#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct UpdateIndexResponse {
name: String,
uid: String,
created_at: DateTime<Utc>,
updated_at: DateTime<Utc>,
primary_key: Option<String>,
}
#[derive(Deserialize)]
pub struct UpdateParam {
index_uid: String,
task_id: u64,
}
pub async fn get_task_status(
meilisearch: GuardedData<ActionPolicy<{ actions::TASKS_GET }>, MeiliSearch>,
index_uid: web::Path<UpdateParam>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
analytics.publish(
"Index Tasks Seen".to_string(),
json!({ "per_task_uid": true }),
Some(&req),
);
let UpdateParam { index_uid, task_id } = index_uid.into_inner();
let task: TaskView = meilisearch.get_index_task(index_uid, task_id).await?.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Ok().json(task))
}
pub async fn get_all_tasks_status(
meilisearch: GuardedData<ActionPolicy<{ actions::TASKS_GET }>, MeiliSearch>,
index_uid: web::Path<String>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
analytics.publish(
"Index Tasks Seen".to_string(),
json!({ "per_task_uid": false }),
Some(&req),
);
let tasks: TaskListView = meilisearch
.list_index_task(index_uid.into_inner(), None, None)
.await?
.into_iter()
.map(TaskView::from)
.collect::<Vec<_>>()
.into();
debug!("returns: {:?}", tasks);
Ok(HttpResponse::Ok().json(tasks))
}
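// Hedged note: judging from the authorization tests later in this diff, the "" and
// "{task_id}" resources configured above are served under
// /indexes/{index_uid}/tasks and /indexes/{index_uid}/tasks/{task_id}.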


@@ -1,64 +0,0 @@
use actix_web::{web, HttpResponse};
use chrono::{DateTime, Utc};
use log::debug;
use serde::{Deserialize, Serialize};
use crate::error::ResponseError;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::routes::{IndexParam, UpdateStatusResponse};
use crate::Data;
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(web::resource("").route(web::get().to(get_all_updates_status)))
.service(web::resource("{update_id}").route(web::get().to(get_update_status)));
}
#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
struct UpdateIndexRequest {
uid: Option<String>,
primary_key: Option<String>,
}
#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct UpdateIndexResponse {
name: String,
uid: String,
created_at: DateTime<Utc>,
updated_at: DateTime<Utc>,
primary_key: Option<String>,
}
#[derive(Deserialize)]
pub struct UpdateParam {
index_uid: String,
update_id: u64,
}
pub async fn get_update_status(
data: GuardedData<Private, Data>,
path: web::Path<UpdateParam>,
) -> Result<HttpResponse, ResponseError> {
let params = path.into_inner();
let meta = data
.get_update_status(params.index_uid, params.update_id)
.await?;
let meta = UpdateStatusResponse::from(meta);
debug!("returns: {:?}", meta);
Ok(HttpResponse::Ok().json(meta))
}
pub async fn get_all_updates_status(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
) -> Result<HttpResponse, ResponseError> {
let metas = data.get_updates_status(path.into_inner().index_uid).await?;
let metas = metas
.into_iter()
.map(UpdateStatusResponse::from)
.collect::<Vec<_>>();
debug!("returns: {:?}", metas);
Ok(HttpResponse::Ok().json(metas))
}


@@ -1,23 +1,24 @@
use std::time::Duration;
use actix_web::{web, HttpResponse};
use chrono::{DateTime, Utc};
use log::debug;
use serde::{Deserialize, Serialize};
use crate::error::ResponseError;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::index::{Settings, Unchecked};
use crate::index_controller::{UpdateMeta, UpdateResult, UpdateStatus};
use crate::Data;
use meilisearch_error::ResponseError;
use meilisearch_lib::index::{Settings, Unchecked};
use meilisearch_lib::MeiliSearch;
use crate::extractors::authentication::{policies::*, GuardedData};
mod api_key;
mod dump;
mod indexes;
pub mod indexes;
mod tasks;
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(web::resource("/health").route(web::get().to(get_health)))
cfg.service(web::scope("/tasks").configure(tasks::configure))
.service(web::resource("/health").route(web::get().to(get_health)))
.service(web::scope("/keys").configure(api_key::configure))
.service(web::scope("/dumps").configure(dump::configure))
.service(web::resource("/keys").route(web::get().to(list_keys)))
.service(web::resource("/stats").route(web::get().to(get_stats)))
.service(web::resource("/version").route(web::get().to(get_version)))
.service(web::scope("/indexes").configure(indexes::configure));
@@ -46,38 +47,6 @@ pub enum UpdateType {
},
}
impl From<&UpdateStatus> for UpdateType {
fn from(other: &UpdateStatus) -> Self {
use milli::update::IndexDocumentsMethod::*;
match other.meta() {
UpdateMeta::DocumentsAddition { method, .. } => {
let number = match other {
UpdateStatus::Processed(processed) => match processed.success {
UpdateResult::DocumentsAddition(ref addition) => {
Some(addition.nb_documents)
}
_ => None,
},
_ => None,
};
match method {
ReplaceDocuments => UpdateType::DocumentsAddition { number },
UpdateDocuments => UpdateType::DocumentsPartial { number },
_ => unreachable!(),
}
}
UpdateMeta::ClearDocuments => UpdateType::ClearAll,
UpdateMeta::DeleteDocuments { ids } => UpdateType::DocumentsDeletion {
number: Some(ids.len()),
},
UpdateMeta::Settings(settings) => UpdateType::Settings {
settings: settings.clone(),
},
}
}
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct ProcessedUpdateResult {
@@ -95,8 +64,7 @@ pub struct FailedUpdateResult {
pub update_id: u64,
#[serde(rename = "type")]
pub update_type: UpdateType,
#[serde(flatten)]
pub response: ResponseError,
pub error: ResponseError,
pub duration: f64, // in seconds
pub enqueued_at: DateTime<Utc>,
pub processed_at: DateTime<Utc>,
@@ -134,79 +102,6 @@ pub enum UpdateStatusResponse {
},
}
impl From<UpdateStatus> for UpdateStatusResponse {
fn from(other: UpdateStatus) -> Self {
let update_type = UpdateType::from(&other);
match other {
UpdateStatus::Processing(processing) => {
let content = EnqueuedUpdateResult {
update_id: processing.id(),
update_type,
enqueued_at: processing.from.enqueued_at,
started_processing_at: Some(processing.started_processing_at),
};
UpdateStatusResponse::Processing { content }
}
UpdateStatus::Enqueued(enqueued) => {
let content = EnqueuedUpdateResult {
update_id: enqueued.id(),
update_type,
enqueued_at: enqueued.enqueued_at,
started_processing_at: None,
};
UpdateStatusResponse::Enqueued { content }
}
UpdateStatus::Processed(processed) => {
let duration = processed
.processed_at
.signed_duration_since(processed.from.started_processing_at)
.num_milliseconds();
// necessary since chrono::Duration doesn't expose an f64 secs method.
let duration = Duration::from_millis(duration as u64).as_secs_f64();
let content = ProcessedUpdateResult {
update_id: processed.id(),
update_type,
duration,
enqueued_at: processed.from.from.enqueued_at,
processed_at: processed.processed_at,
};
UpdateStatusResponse::Processed { content }
}
UpdateStatus::Aborted(_) => unreachable!(),
UpdateStatus::Failed(failed) => {
let duration = failed
.failed_at
.signed_duration_since(failed.from.started_processing_at)
.num_milliseconds();
// necessary since chrono::Duration doesn't expose an f64 secs method.
let duration = Duration::from_millis(duration as u64).as_secs_f64();
let update_id = failed.id();
let response = failed.error;
let content = FailedUpdateResult {
update_id,
update_type,
response,
duration,
enqueued_at: failed.from.from.enqueued_at,
processed_at: failed.failed_at,
};
UpdateStatusResponse::Failed { content }
}
}
}
}
#[derive(Deserialize)]
pub struct IndexParam {
index_uid: String,
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
pub struct IndexUpdateResponse {
@@ -222,15 +117,19 @@ impl IndexUpdateResponse {
/// Always return a 200 with:
/// ```json
/// {
/// "status": "Meilisearch is running"
/// "status": "MeiliSearch is running"
/// }
/// ```
pub async fn running() -> HttpResponse {
HttpResponse::Ok().json(serde_json::json!({ "status": "MeiliSearch is running" }))
}
async fn get_stats(data: GuardedData<Private, Data>) -> Result<HttpResponse, ResponseError> {
let response = data.get_all_stats().await?;
async fn get_stats(
meilisearch: GuardedData<ActionPolicy<{ actions::STATS_GET }>, MeiliSearch>,
) -> Result<HttpResponse, ResponseError> {
let filters = meilisearch.filters();
let response = meilisearch.get_all_stats(&filters.indexes).await?;
debug!("returns: {:?}", response);
Ok(HttpResponse::Ok().json(response))
@@ -244,7 +143,9 @@ struct VersionResponse {
pkg_version: String,
}
async fn get_version(_data: GuardedData<Private, Data>) -> HttpResponse {
async fn get_version(
_meilisearch: GuardedData<ActionPolicy<{ actions::VERSION }>, MeiliSearch>,
) -> HttpResponse {
let commit_sha = option_env!("VERGEN_GIT_SHA").unwrap_or("unknown");
let commit_date = option_env!("VERGEN_GIT_COMMIT_TIMESTAMP").unwrap_or("unknown");
@@ -261,108 +162,6 @@ struct KeysResponse {
public: Option<String>,
}
pub async fn list_keys(data: GuardedData<Admin, Data>) -> HttpResponse {
let api_keys = data.api_keys.clone();
HttpResponse::Ok().json(&KeysResponse {
private: api_keys.private,
public: api_keys.public,
})
}
pub async fn get_health() -> Result<HttpResponse, ResponseError> {
Ok(HttpResponse::Ok().json(serde_json::json!({ "status": "available" })))
}
#[cfg(test)]
mod test {
use super::*;
use crate::data::Data;
use crate::extractors::authentication::GuardedData;
/// A trait implemented for a route that uses an authentication policy `Policy`.
///
/// This trait is used for regression testing of route authentication policies.
trait Is<Policy, T> {}
macro_rules! impl_is_policy {
($($param:ident)*) => {
impl<Policy, Func, $($param,)* Res> Is<Policy, (($($param,)*), Res)> for Func
where Func: Fn(GuardedData<Policy, Data>, $($param,)*) -> Res {}
};
}
impl_is_policy! {}
impl_is_policy! {A}
impl_is_policy! {A B}
impl_is_policy! {A B C}
impl_is_policy! {A B C D}
/// Emits a compile error if a route doesn't have the correct authentication policy.
///
/// This works by trying to cast the route function into an `Is<Policy, _>` type, where `Policy` is
/// the authentication policy defined for the route.
macro_rules! test_auth_routes {
($($policy:ident => { $($route:expr,)*})*) => {
#[test]
fn test_auth() {
$($(let _: &dyn Is<$policy, _> = &$route;)*)*
}
};
}
test_auth_routes! {
Public => {
indexes::search::search_with_url_query,
indexes::search::search_with_post,
indexes::documents::get_document,
indexes::documents::get_all_documents,
}
Private => {
get_stats,
get_version,
indexes::create_index,
indexes::list_indexes,
indexes::get_index_stats,
indexes::delete_index,
indexes::update_index,
indexes::get_index,
dump::create_dump,
indexes::settings::filterable_attributes::get,
indexes::settings::displayed_attributes::get,
indexes::settings::searchable_attributes::get,
indexes::settings::stop_words::get,
indexes::settings::synonyms::get,
indexes::settings::distinct_attribute::get,
indexes::settings::filterable_attributes::update,
indexes::settings::displayed_attributes::update,
indexes::settings::searchable_attributes::update,
indexes::settings::stop_words::update,
indexes::settings::synonyms::update,
indexes::settings::distinct_attribute::update,
indexes::settings::filterable_attributes::delete,
indexes::settings::displayed_attributes::delete,
indexes::settings::searchable_attributes::delete,
indexes::settings::stop_words::delete,
indexes::settings::synonyms::delete,
indexes::settings::distinct_attribute::delete,
indexes::settings::delete_all,
indexes::settings::get_all,
indexes::settings::update_all,
indexes::documents::clear_all_documents,
indexes::documents::delete_documents,
indexes::documents::update_documents,
indexes::documents::add_documents,
indexes::documents::delete_document,
indexes::updates::get_all_updates_status,
indexes::updates::get_update_status,
}
Admin => { list_keys, }
}
}


@@ -0,0 +1,73 @@
use actix_web::{web, HttpRequest, HttpResponse};
use meilisearch_error::ResponseError;
use meilisearch_lib::tasks::task::TaskId;
use meilisearch_lib::tasks::TaskFilter;
use meilisearch_lib::MeiliSearch;
use serde_json::json;
use crate::analytics::Analytics;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::task::{TaskListView, TaskView};
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(web::resource("").route(web::get().to(get_tasks)))
.service(web::resource("/{task_id}").route(web::get().to(get_task)));
}
async fn get_tasks(
meilisearch: GuardedData<ActionPolicy<{ actions::TASKS_GET }>, MeiliSearch>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
analytics.publish(
"Tasks Seen".to_string(),
json!({ "per_task_uid": false }),
Some(&req),
);
let filters = meilisearch.filters().indexes.as_ref().map(|indexes| {
let mut filters = TaskFilter::default();
for index in indexes {
filters.filter_index(index.to_string());
}
filters
});
let tasks: TaskListView = meilisearch
.list_tasks(filters, None, None)
.await?
.into_iter()
.map(TaskView::from)
.collect::<Vec<_>>()
.into();
Ok(HttpResponse::Ok().json(tasks))
}
async fn get_task(
meilisearch: GuardedData<ActionPolicy<{ actions::TASKS_GET }>, MeiliSearch>,
task_id: web::Path<TaskId>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
analytics.publish(
"Tasks Seen".to_string(),
json!({ "per_task_uid": true }),
Some(&req),
);
let filters = meilisearch.filters().indexes.as_ref().map(|indexes| {
let mut filters = TaskFilter::default();
for index in indexes {
filters.filter_index(index.to_string());
}
filters
});
let task: TaskView = meilisearch
.get_task(task_id.into_inner(), filters)
.await?
.into();
Ok(HttpResponse::Ok().json(task))
}
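// Hedged note: when the authenticated key is limited to specific indexes, the
// TaskFilter built in both handlers above restricts the listed tasks to those
// indexes; the `list_authorized_tasks_*` tests later in this diff rely on exactly
// that behaviour.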


@@ -0,0 +1,313 @@
use chrono::{DateTime, Duration, Utc};
use meilisearch_error::ResponseError;
use meilisearch_lib::index::{Settings, Unchecked};
use meilisearch_lib::milli::update::IndexDocumentsMethod;
use meilisearch_lib::tasks::task::{
DocumentDeletion, Task, TaskContent, TaskEvent, TaskId, TaskResult,
};
use serde::{Serialize, Serializer};
#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
enum TaskType {
IndexCreation,
IndexUpdate,
IndexDeletion,
DocumentAddition,
DocumentPartial,
DocumentDeletion,
SettingsUpdate,
ClearAll,
}
impl From<TaskContent> for TaskType {
fn from(other: TaskContent) -> Self {
match other {
TaskContent::DocumentAddition {
merge_strategy: IndexDocumentsMethod::ReplaceDocuments,
..
} => TaskType::DocumentAddition,
TaskContent::DocumentAddition {
merge_strategy: IndexDocumentsMethod::UpdateDocuments,
..
} => TaskType::DocumentPartial,
TaskContent::DocumentDeletion(DocumentDeletion::Clear) => TaskType::ClearAll,
TaskContent::DocumentDeletion(DocumentDeletion::Ids(_)) => TaskType::DocumentDeletion,
TaskContent::SettingsUpdate { .. } => TaskType::SettingsUpdate,
TaskContent::IndexDeletion => TaskType::IndexDeletion,
TaskContent::IndexCreation { .. } => TaskType::IndexCreation,
TaskContent::IndexUpdate { .. } => TaskType::IndexUpdate,
_ => unreachable!("unexpected task type"),
}
}
}
#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
enum TaskStatus {
Enqueued,
Processing,
Succeeded,
Failed,
}
#[derive(Debug, Serialize)]
#[serde(untagged)]
#[allow(clippy::large_enum_variant)]
enum TaskDetails {
#[serde(rename_all = "camelCase")]
DocumentAddition {
received_documents: usize,
indexed_documents: Option<u64>,
},
#[serde(rename_all = "camelCase")]
Settings {
#[serde(flatten)]
settings: Settings<Unchecked>,
},
#[serde(rename_all = "camelCase")]
IndexInfo { primary_key: Option<String> },
#[serde(rename_all = "camelCase")]
DocumentDeletion {
received_document_ids: usize,
deleted_documents: Option<u64>,
},
#[serde(rename_all = "camelCase")]
ClearAll { deleted_documents: Option<u64> },
}
fn serialize_duration<S: Serializer>(
duration: &Option<Duration>,
serializer: S,
) -> Result<S::Ok, S::Error> {
match duration {
Some(duration) => {
let duration_str = duration.to_string();
serializer.serialize_str(&duration_str)
}
None => serializer.serialize_none(),
}
}
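// Hedged note: the `to_string()` call above goes through chrono's Display impl for
// `Duration`, which is expected to yield an ISO 8601-style string (for instance
// something like "PT12.5S"), so a task's duration is serialized as a string rather
// than a number.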
#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct TaskView {
uid: TaskId,
index_uid: String,
status: TaskStatus,
#[serde(rename = "type")]
task_type: TaskType,
#[serde(skip_serializing_if = "Option::is_none")]
details: Option<TaskDetails>,
#[serde(skip_serializing_if = "Option::is_none")]
error: Option<ResponseError>,
#[serde(serialize_with = "serialize_duration")]
duration: Option<Duration>,
enqueued_at: DateTime<Utc>,
started_at: Option<DateTime<Utc>>,
finished_at: Option<DateTime<Utc>>,
}
impl From<Task> for TaskView {
fn from(task: Task) -> Self {
let Task {
id,
index_uid,
content,
events,
} = task;
let (task_type, mut details) = match content {
TaskContent::DocumentAddition {
merge_strategy,
documents_count,
..
} => {
let details = TaskDetails::DocumentAddition {
received_documents: documents_count,
indexed_documents: None,
};
let task_type = match merge_strategy {
IndexDocumentsMethod::UpdateDocuments => TaskType::DocumentPartial,
IndexDocumentsMethod::ReplaceDocuments => TaskType::DocumentAddition,
_ => unreachable!("Unexpected document merge strategy."),
};
(task_type, Some(details))
}
TaskContent::DocumentDeletion(DocumentDeletion::Ids(ids)) => (
TaskType::DocumentDeletion,
Some(TaskDetails::DocumentDeletion {
received_document_ids: ids.len(),
deleted_documents: None,
}),
),
TaskContent::DocumentDeletion(DocumentDeletion::Clear) => (
TaskType::ClearAll,
Some(TaskDetails::ClearAll {
deleted_documents: None,
}),
),
TaskContent::IndexDeletion => (
TaskType::IndexDeletion,
Some(TaskDetails::ClearAll {
deleted_documents: None,
}),
),
TaskContent::SettingsUpdate { settings, .. } => (
TaskType::SettingsUpdate,
Some(TaskDetails::Settings { settings }),
),
TaskContent::IndexCreation { primary_key } => (
TaskType::IndexCreation,
Some(TaskDetails::IndexInfo { primary_key }),
),
TaskContent::IndexUpdate { primary_key } => (
TaskType::IndexUpdate,
Some(TaskDetails::IndexInfo { primary_key }),
),
};
// A task always has at least one event: "Created"
let (status, error, finished_at) = match events.last().unwrap() {
TaskEvent::Created(_) => (TaskStatus::Enqueued, None, None),
TaskEvent::Batched { .. } => (TaskStatus::Enqueued, None, None),
TaskEvent::Processing(_) => (TaskStatus::Processing, None, None),
TaskEvent::Succeded { timestamp, result } => {
match (result, &mut details) {
(
TaskResult::DocumentAddition {
indexed_documents: num,
..
},
Some(TaskDetails::DocumentAddition {
ref mut indexed_documents,
..
}),
) => {
indexed_documents.replace(*num);
}
(
TaskResult::DocumentDeletion {
deleted_documents: docs,
..
},
Some(TaskDetails::DocumentDeletion {
ref mut deleted_documents,
..
}),
) => {
deleted_documents.replace(*docs);
}
(
TaskResult::ClearAll {
deleted_documents: docs,
},
Some(TaskDetails::ClearAll {
ref mut deleted_documents,
}),
) => {
deleted_documents.replace(*docs);
}
_ => (),
}
(TaskStatus::Succeeded, None, Some(*timestamp))
}
TaskEvent::Failed { timestamp, error } => {
match details {
Some(TaskDetails::DocumentDeletion {
ref mut deleted_documents,
..
}) => {
deleted_documents.replace(0);
}
Some(TaskDetails::ClearAll {
ref mut deleted_documents,
..
}) => {
deleted_documents.replace(0);
}
Some(TaskDetails::DocumentAddition {
ref mut indexed_documents,
..
}) => {
indexed_documents.replace(0);
}
_ => (),
}
(TaskStatus::Failed, Some(error.clone()), Some(*timestamp))
}
};
let enqueued_at = match events.first() {
Some(TaskEvent::Created(ts)) => *ts,
_ => unreachable!("A task must always have a creation event."),
};
let started_at = events.iter().find_map(|e| match e {
TaskEvent::Processing(ts) => Some(*ts),
_ => None,
});
let duration = finished_at.zip(started_at).map(|(tf, ts)| (tf - ts));
Self {
uid: id,
index_uid: index_uid.into_inner(),
status,
task_type,
details,
error,
duration,
enqueued_at,
started_at,
finished_at,
}
}
}
#[derive(Debug, Serialize)]
pub struct TaskListView {
results: Vec<TaskView>,
}
impl From<Vec<TaskView>> for TaskListView {
fn from(results: Vec<TaskView>) -> Self {
Self { results }
}
}
#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct SummarizedTaskView {
uid: TaskId,
index_uid: String,
status: TaskStatus,
#[serde(rename = "type")]
task_type: TaskType,
enqueued_at: DateTime<Utc>,
}
impl From<Task> for SummarizedTaskView {
fn from(mut other: Task) -> Self {
let created_event = other
.events
.drain(..1)
.next()
.expect("Task must have an enqueued event.");
let enqueued_at = match created_event {
TaskEvent::Created(ts) => ts,
_ => unreachable!("The first event of a task must always be 'Created'"),
};
Self {
uid: other.id,
index_uid: other.index_uid.to_string(),
status: TaskStatus::Enqueued,
task_type: other.content.into(),
enqueued_at,
}
}
}
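// Illustrative sketch (not part of the original change): with the camelCase renames
// above, a freshly enqueued document addition would serialize roughly as
// { "uid": 3, "indexUid": "products", "status": "enqueued",
//   "type": "documentAddition", "enqueuedAt": "2021-12-01T12:00:00Z" }
// (the index name and timestamp are invented for the example).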

File diff suppressed because it is too large.


@@ -0,0 +1,630 @@
use crate::common::Server;
use chrono::{Duration, Utc};
use maplit::hashmap;
use once_cell::sync::Lazy;
use serde_json::{json, Value};
use std::collections::{HashMap, HashSet};
static AUTHORIZATIONS: Lazy<HashMap<(&'static str, &'static str), &'static str>> =
Lazy::new(|| {
hashmap! {
("POST", "/indexes/products/search") => "search",
("GET", "/indexes/products/search") => "search",
("POST", "/indexes/products/documents") => "documents.add",
("GET", "/indexes/products/documents") => "documents.get",
("GET", "/indexes/products/documents/0") => "documents.get",
("DELETE", "/indexes/products/documents/0") => "documents.delete",
("GET", "/tasks") => "tasks.get",
("GET", "/indexes/products/tasks") => "tasks.get",
("GET", "/indexes/products/tasks/0") => "tasks.get",
("PUT", "/indexes/products/") => "indexes.update",
("GET", "/indexes/products/") => "indexes.get",
("DELETE", "/indexes/products/") => "indexes.delete",
("POST", "/indexes") => "indexes.create",
("GET", "/indexes") => "indexes.get",
("GET", "/indexes/products/settings") => "settings.get",
("GET", "/indexes/products/settings/displayed-attributes") => "settings.get",
("GET", "/indexes/products/settings/distinct-attribute") => "settings.get",
("GET", "/indexes/products/settings/filterable-attributes") => "settings.get",
("GET", "/indexes/products/settings/ranking-rules") => "settings.get",
("GET", "/indexes/products/settings/searchable-attributes") => "settings.get",
("GET", "/indexes/products/settings/sortable-attributes") => "settings.get",
("GET", "/indexes/products/settings/stop-words") => "settings.get",
("GET", "/indexes/products/settings/synonyms") => "settings.get",
("DELETE", "/indexes/products/settings") => "settings.update",
("POST", "/indexes/products/settings") => "settings.update",
("POST", "/indexes/products/settings/displayed-attributes") => "settings.update",
("POST", "/indexes/products/settings/distinct-attribute") => "settings.update",
("POST", "/indexes/products/settings/filterable-attributes") => "settings.update",
("POST", "/indexes/products/settings/ranking-rules") => "settings.update",
("POST", "/indexes/products/settings/searchable-attributes") => "settings.update",
("POST", "/indexes/products/settings/sortable-attributes") => "settings.update",
("POST", "/indexes/products/settings/stop-words") => "settings.update",
("POST", "/indexes/products/settings/synonyms") => "settings.update",
("GET", "/indexes/products/stats") => "stats.get",
("GET", "/stats") => "stats.get",
("POST", "/dumps") => "dumps.create",
("GET", "/dumps/0/status") => "dumps.get",
("GET", "/version") => "version",
}
});
static ALL_ACTIONS: Lazy<HashSet<&'static str>> =
Lazy::new(|| AUTHORIZATIONS.values().cloned().collect());
static INVALID_RESPONSE: Lazy<Value> = Lazy::new(|| {
json!({"message": "The provided API key is invalid.",
"code": "invalid_api_key",
"type": "auth",
"link": "https://docs.meilisearch.com/errors#invalid_api_key"
})
});
#[actix_rt::test]
async fn error_access_expired_key() {
use std::{thread, time};
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
let content = json!({
"indexes": ["products"],
"actions": ALL_ACTIONS.clone(),
"expiresAt": (Utc::now() + Duration::seconds(1)),
});
let (response, code) = server.add_api_key(content).await;
assert_eq!(code, 201);
assert!(response["key"].is_string());
let key = response["key"].as_str().unwrap();
server.use_api_key(&key);
// wait until the key is expired.
thread::sleep(time::Duration::new(1, 0));
for (method, route) in AUTHORIZATIONS.keys() {
let (response, code) = server.dummy_request(method, route).await;
assert_eq!(response, INVALID_RESPONSE.clone());
assert_eq!(code, 403);
}
}
#[actix_rt::test]
async fn error_access_unauthorized_index() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
let content = json!({
"indexes": ["sales"],
"actions": ALL_ACTIONS.clone(),
"expiresAt": Utc::now() + Duration::hours(1),
});
let (response, code) = server.add_api_key(content).await;
assert_eq!(code, 201);
assert!(response["key"].is_string());
let key = response["key"].as_str().unwrap();
server.use_api_key(&key);
for (method, route) in AUTHORIZATIONS
.keys()
// filter `products` index routes
.filter(|(_, route)| route.starts_with("/indexes/products"))
{
let (response, code) = server.dummy_request(method, route).await;
assert_eq!(response, INVALID_RESPONSE.clone());
assert_eq!(code, 403);
}
}
#[actix_rt::test]
async fn error_access_unauthorized_action() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
let content = json!({
"indexes": ["products"],
"actions": [],
"expiresAt": Utc::now() + Duration::hours(1),
});
let (response, code) = server.add_api_key(content).await;
assert_eq!(code, 201);
assert!(response["key"].is_string());
let key = response["key"].as_str().unwrap();
server.use_api_key(&key);
for ((method, route), action) in AUTHORIZATIONS.iter() {
server.use_api_key("MASTER_KEY");
// Patch the API key, granting every action except the one the route needs.
let content = json!({
"actions": ALL_ACTIONS.iter().cloned().filter(|a| a != action).collect::<Vec<_>>(),
});
let (_, code) = server.patch_api_key(&key, content).await;
assert_eq!(code, 200);
server.use_api_key(&key);
let (response, code) = server.dummy_request(method, route).await;
assert_eq!(response, INVALID_RESPONSE.clone());
assert_eq!(code, 403);
}
}
#[actix_rt::test]
async fn access_authorized_restricted_index() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
let content = json!({
"indexes": ["products"],
"actions": [],
"expiresAt": Utc::now() + Duration::hours(1),
});
let (response, code) = server.add_api_key(content).await;
assert_eq!(code, 201);
assert!(response["key"].is_string());
let key = response["key"].as_str().unwrap();
server.use_api_key(&key);
for ((method, route), action) in AUTHORIZATIONS.iter() {
// Patch the API key, granting only the action the route needs.
let content = json!({
"actions": [action],
});
server.use_api_key("MASTER_KEY");
let (_, code) = server.patch_api_key(&key, content).await;
assert_eq!(code, 200);
server.use_api_key(&key);
let (response, code) = server.dummy_request(method, route).await;
assert_ne!(response, INVALID_RESPONSE.clone());
assert_ne!(code, 403);
// Patch the API key using the all-actions wildcard (`*`).
let content = json!({
"actions": ["*"],
});
server.use_api_key("MASTER_KEY");
let (_, code) = server.patch_api_key(&key, content).await;
assert_eq!(code, 200);
server.use_api_key(&key);
let (response, code) = server.dummy_request(method, route).await;
assert_ne!(response, INVALID_RESPONSE.clone());
assert_ne!(code, 403);
}
}
#[actix_rt::test]
async fn access_authorized_no_index_restriction() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
let content = json!({
"indexes": ["*"],
"actions": [],
"expiresAt": Utc::now() + Duration::hours(1),
});
let (response, code) = server.add_api_key(content).await;
assert_eq!(code, 201);
assert!(response["key"].is_string());
let key = response["key"].as_str().unwrap();
server.use_api_key(&key);
for ((method, route), action) in AUTHORIZATIONS.iter() {
server.use_api_key("MASTER_KEY");
// Patch the API key, granting only the action the route needs.
let content = json!({
"actions": [action],
});
let (_, code) = server.patch_api_key(&key, content).await;
assert_eq!(code, 200);
server.use_api_key(&key);
let (response, code) = server.dummy_request(method, route).await;
assert_ne!(response, INVALID_RESPONSE.clone());
assert_ne!(code, 403);
// Patch the API key using the all-actions wildcard (`*`).
let content = json!({
"actions": ["*"],
});
server.use_api_key("MASTER_KEY");
let (_, code) = server.patch_api_key(&key, content).await;
assert_eq!(code, 200);
server.use_api_key(&key);
let (response, code) = server.dummy_request(method, route).await;
assert_ne!(response, INVALID_RESPONSE.clone());
assert_ne!(code, 403);
}
}
#[actix_rt::test]
async fn access_authorized_stats_restricted_index() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
// create index `test`
let index = server.index("test");
let (_, code) = index.create(Some("id")).await;
assert_eq!(code, 202);
// create index `products`
let index = server.index("products");
let (_, code) = index.create(Some("product_id")).await;
assert_eq!(code, 202);
index.wait_task(0).await;
// create key with access on `products` index only.
let content = json!({
"indexes": ["products"],
"actions": ["stats.get"],
"expiresAt": Utc::now() + Duration::hours(1),
});
let (response, code) = server.add_api_key(content).await;
assert_eq!(code, 201);
assert!(response["key"].is_string());
// use created key.
let key = response["key"].as_str().unwrap();
server.use_api_key(&key);
let (response, code) = server.stats().await;
assert_eq!(code, 200);
// key should have access on `products` index.
assert!(response["indexes"].get("products").is_some());
// key should not have access on `test` index.
assert!(response["indexes"].get("test").is_none());
}
#[actix_rt::test]
async fn access_authorized_stats_no_index_restriction() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
// create index `test`
let index = server.index("test");
let (_, code) = index.create(Some("id")).await;
assert_eq!(code, 202);
// create index `products`
let index = server.index("products");
let (_, code) = index.create(Some("product_id")).await;
assert_eq!(code, 202);
index.wait_task(0).await;
// create key with access on all indexes.
let content = json!({
"indexes": ["*"],
"actions": ["stats.get"],
"expiresAt": Utc::now() + Duration::hours(1),
});
let (response, code) = server.add_api_key(content).await;
assert_eq!(code, 201);
assert!(response["key"].is_string());
// use created key.
let key = response["key"].as_str().unwrap();
server.use_api_key(&key);
let (response, code) = server.stats().await;
assert_eq!(code, 200);
// key should have access on `products` index.
assert!(response["indexes"].get("products").is_some());
// key should have access on `test` index.
assert!(response["indexes"].get("test").is_some());
}
#[actix_rt::test]
async fn list_authorized_indexes_restricted_index() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
// create index `test`
let index = server.index("test");
let (_, code) = index.create(Some("id")).await;
assert_eq!(code, 202);
// create index `products`
let index = server.index("products");
let (_, code) = index.create(Some("product_id")).await;
assert_eq!(code, 202);
index.wait_task(0).await;
// create key with access on `products` index only.
let content = json!({
"indexes": ["products"],
"actions": ["indexes.get"],
"expiresAt": Utc::now() + Duration::hours(1),
});
let (response, code) = server.add_api_key(content).await;
assert_eq!(code, 201);
assert!(response["key"].is_string());
// use created key.
let key = response["key"].as_str().unwrap();
server.use_api_key(&key);
let (response, code) = server.list_indexes().await;
assert_eq!(code, 200);
let response = response.as_array().unwrap();
// key should have access on `products` index.
assert!(response.iter().any(|index| index["uid"] == "products"));
// key should not have access on `test` index.
assert!(!response.iter().any(|index| index["uid"] == "test"));
}
#[actix_rt::test]
async fn list_authorized_indexes_no_index_restriction() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
// create index `test`
let index = server.index("test");
let (_, code) = index.create(Some("id")).await;
assert_eq!(code, 202);
// create index `products`
let index = server.index("products");
let (_, code) = index.create(Some("product_id")).await;
assert_eq!(code, 202);
index.wait_task(0).await;
// create key with access on all indexes.
let content = json!({
"indexes": ["*"],
"actions": ["indexes.get"],
"expiresAt": Utc::now() + Duration::hours(1),
});
let (response, code) = server.add_api_key(content).await;
assert_eq!(code, 201);
assert!(response["key"].is_string());
// use created key.
let key = response["key"].as_str().unwrap();
server.use_api_key(&key);
let (response, code) = server.list_indexes().await;
assert_eq!(code, 200);
let response = response.as_array().unwrap();
// key should have access on `products` index.
assert!(response.iter().any(|index| index["uid"] == "products"));
// key should have access on `test` index.
assert!(response.iter().any(|index| index["uid"] == "test"));
}
#[actix_rt::test]
async fn list_authorized_tasks_restricted_index() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
// create index `test`
let index = server.index("test");
let (_, code) = index.create(Some("id")).await;
assert_eq!(code, 202);
// create index `products`
let index = server.index("products");
let (_, code) = index.create(Some("product_id")).await;
assert_eq!(code, 202);
index.wait_task(0).await;
// create key with access on `products` index only.
let content = json!({
"indexes": ["products"],
"actions": ["tasks.get"],
"expiresAt": Utc::now() + Duration::hours(1),
});
let (response, code) = server.add_api_key(content).await;
assert_eq!(code, 201);
assert!(response["key"].is_string());
// use created key.
let key = response["key"].as_str().unwrap();
server.use_api_key(&key);
let (response, code) = server.service.get("/tasks").await;
assert_eq!(code, 200);
println!("{}", response);
let response = response["results"].as_array().unwrap();
// key should have access on `products` index.
assert!(response.iter().any(|task| task["indexUid"] == "products"));
// key should not have access on `test` index.
assert!(!response.iter().any(|task| task["indexUid"] == "test"));
}
#[actix_rt::test]
async fn list_authorized_tasks_no_index_restriction() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
// create index `test`
let index = server.index("test");
let (_, code) = index.create(Some("id")).await;
assert_eq!(code, 202);
// create index `products`
let index = server.index("products");
let (_, code) = index.create(Some("product_id")).await;
assert_eq!(code, 202);
index.wait_task(0).await;
// create key with access on all indexes.
let content = json!({
"indexes": ["*"],
"actions": ["tasks.get"],
"expiresAt": Utc::now() + Duration::hours(1),
});
let (response, code) = server.add_api_key(content).await;
assert_eq!(code, 201);
assert!(response["key"].is_string());
// use created key.
let key = response["key"].as_str().unwrap();
server.use_api_key(&key);
let (response, code) = server.service.get("/tasks").await;
assert_eq!(code, 200);
let response = response["results"].as_array().unwrap();
// key should have access on `products` index.
assert!(response.iter().any(|task| task["indexUid"] == "products"));
// key should have access on `test` index.
assert!(response.iter().any(|task| task["indexUid"] == "test"));
}
#[actix_rt::test]
async fn error_creating_index_without_action() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
// create key with access on all indexes.
let content = json!({
"indexes": ["*"],
"actions": ALL_ACTIONS.iter().cloned().filter(|a| *a != "indexes.create").collect::<Vec<_>>(),
"expiresAt": "2050-11-13T00:00:00Z"
});
let (response, code) = server.add_api_key(content).await;
assert_eq!(code, 201);
assert!(response["key"].is_string());
// use created key.
let key = response["key"].as_str().unwrap();
server.use_api_key(&key);
let expected_error = json!({
"message": "Index `test` not found.",
"code": "index_not_found",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#index_not_found"
});
// try to create an index via the add documents route
let index = server.index("test");
let documents = json!([
{
"id": 1,
"content": "foo",
}
]);
let (response, code) = index.add_documents(documents, None).await;
assert_eq!(code, 202, "{:?}", response);
let task_id = response["uid"].as_u64().unwrap();
let response = index.wait_task(task_id).await;
assert_eq!(response["status"], "failed");
assert_eq!(response["error"], expected_error.clone());
// try to create an index via the add settings route
let settings = json!({ "distinctAttribute": "test"});
let (response, code) = index.update_settings(settings).await;
assert_eq!(code, 202);
let task_id = response["uid"].as_u64().unwrap();
let response = index.wait_task(task_id).await;
assert_eq!(response["status"], "failed");
assert_eq!(response["error"], expected_error.clone());
// try to create an index via the add specialized settings route
let (response, code) = index.update_distinct_attribute(json!("test")).await;
assert_eq!(code, 202);
let task_id = response["uid"].as_u64().unwrap();
let response = index.wait_task(task_id).await;
assert_eq!(response["status"], "failed");
assert_eq!(response["error"], expected_error.clone());
}
#[actix_rt::test]
async fn lazy_create_index() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
// create key with access on all indexes.
let content = json!({
"indexes": ["*"],
"actions": ["*"],
"expiresAt": "2050-11-13T00:00:00Z"
});
let (response, code) = server.add_api_key(content).await;
assert_eq!(code, 201);
assert!(response["key"].is_string());
// use created key.
let key = response["key"].as_str().unwrap();
server.use_api_key(&key);
// try to create an index via the add documents route
let index = server.index("test");
let documents = json!([
{
"id": 1,
"content": "foo",
}
]);
let (response, code) = index.add_documents(documents, None).await;
assert_eq!(code, 202, "{:?}", response);
let task_id = response["uid"].as_u64().unwrap();
index.wait_task(task_id).await;
let (response, code) = index.get_task(task_id).await;
assert_eq!(code, 200);
assert_eq!(response["status"], "succeeded");
// try to create an index via the add settings route
let index = server.index("test1");
let settings = json!({ "distinctAttribute": "test"});
let (response, code) = index.update_settings(settings).await;
assert_eq!(code, 202);
let task_id = response["uid"].as_u64().unwrap();
index.wait_task(task_id).await;
let (response, code) = index.get_task(task_id).await;
assert_eq!(code, 200);
assert_eq!(response["status"], "succeeded");
// try to create an index via the add specialized settings route
let index = server.index("test2");
let (response, code) = index.update_distinct_attribute(json!("test")).await;
assert_eq!(code, 202);
let task_id = response["uid"].as_u64().unwrap();
index.wait_task(task_id).await;
let (response, code) = index.get_task(task_id).await;
assert_eq!(code, 200);
assert_eq!(response["status"], "succeeded");
}


@@ -0,0 +1,54 @@
mod api_keys;
mod authorization;
mod payload;
use crate::common::Server;
use actix_web::http::StatusCode;
use serde_json::{json, Value};
impl Server {
pub fn use_api_key(&mut self, api_key: impl AsRef<str>) {
self.service.api_key = Some(api_key.as_ref().to_string());
}
pub async fn add_api_key(&self, content: Value) -> (Value, StatusCode) {
let url = "/keys";
self.service.post(url, content).await
}
pub async fn get_api_key(&self, key: impl AsRef<str>) -> (Value, StatusCode) {
let url = format!("/keys/{}", key.as_ref());
self.service.get(url).await
}
pub async fn patch_api_key(&self, key: impl AsRef<str>, content: Value) -> (Value, StatusCode) {
let url = format!("/keys/{}", key.as_ref());
self.service.patch(url, content).await
}
pub async fn list_api_keys(&self) -> (Value, StatusCode) {
let url = "/keys";
self.service.get(url).await
}
pub async fn delete_api_key(&self, key: impl AsRef<str>) -> (Value, StatusCode) {
let url = format!("/keys/{}", key.as_ref());
self.service.delete(url).await
}
pub async fn dummy_request(
&self,
method: impl AsRef<str>,
url: impl AsRef<str>,
) -> (Value, StatusCode) {
match method.as_ref() {
"POST" => self.service.post(url, json!({})).await,
"PUT" => self.service.put(url, json!({})).await,
"PATCH" => self.service.patch(url, json!({})).await,
"GET" => self.service.get(url).await,
"DELETE" => self.service.delete(url).await,
_ => unreachable!(),
}
}
}


@@ -0,0 +1,340 @@
use crate::common::Server;
use actix_web::test;
use meilisearch_http::{analytics, create_app};
use serde_json::{json, Value};
#[actix_rt::test]
async fn error_api_key_bad_content_types() {
let content = json!({
"indexes": ["products"],
"actions": [
"documents.add"
],
"expiresAt": "2050-11-13T00:00:00Z"
});
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
let app = test::init_service(create_app!(
&server.service.meilisearch,
&server.service.auth,
true,
&server.service.options,
analytics::MockAnalytics::new(&server.service.options).0
))
.await;
// post
let req = test::TestRequest::post()
.uri("/keys")
.set_payload(content.to_string())
.insert_header(("content-type", "text/plain"))
.insert_header(("Authorization", "Bearer MASTER_KEY"))
.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
let body = test::read_body(res).await;
let response: Value = serde_json::from_slice(&body).unwrap_or_default();
assert_eq!(status_code, 415);
assert_eq!(
response["message"],
json!(
r#"The Content-Type `text/plain` is invalid. Accepted values for the Content-Type header are: `application/json`"#
)
);
assert_eq!(response["code"], "invalid_content_type");
assert_eq!(response["type"], "invalid_request");
assert_eq!(
response["link"],
"https://docs.meilisearch.com/errors#invalid_content_type"
);
// patch
let req = test::TestRequest::patch()
.uri("/keys/d0552b41536279a0ad88bd595327b96f01176a60c2243e906c52ac02375f9bc4")
.set_payload(content.to_string())
.insert_header(("content-type", "text/plain"))
.insert_header(("Authorization", "Bearer MASTER_KEY"))
.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
let body = test::read_body(res).await;
let response: Value = serde_json::from_slice(&body).unwrap_or_default();
assert_eq!(status_code, 415);
assert_eq!(
response["message"],
json!(
r#"The Content-Type `text/plain` is invalid. Accepted values for the Content-Type header are: `application/json`"#
)
);
assert_eq!(response["code"], "invalid_content_type");
assert_eq!(response["type"], "invalid_request");
assert_eq!(
response["link"],
"https://docs.meilisearch.com/errors#invalid_content_type"
);
}
#[actix_rt::test]
async fn error_api_key_empty_content_types() {
let content = json!({
"indexes": ["products"],
"actions": [
"documents.add"
],
"expiresAt": "2050-11-13T00:00:00Z"
});
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
let app = test::init_service(create_app!(
&server.service.meilisearch,
&server.service.auth,
true,
&server.service.options,
analytics::MockAnalytics::new(&server.service.options).0
))
.await;
// post
let req = test::TestRequest::post()
.uri("/keys")
.set_payload(content.to_string())
.insert_header(("content-type", ""))
.insert_header(("Authorization", "Bearer MASTER_KEY"))
.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
let body = test::read_body(res).await;
let response: Value = serde_json::from_slice(&body).unwrap_or_default();
assert_eq!(status_code, 415);
assert_eq!(
response["message"],
json!(
r#"The Content-Type `` is invalid. Accepted values for the Content-Type header are: `application/json`"#
)
);
assert_eq!(response["code"], "invalid_content_type");
assert_eq!(response["type"], "invalid_request");
assert_eq!(
response["link"],
"https://docs.meilisearch.com/errors#invalid_content_type"
);
// patch
let req = test::TestRequest::patch()
.uri("/keys/d0552b41536279a0ad88bd595327b96f01176a60c2243e906c52ac02375f9bc4")
.set_payload(content.to_string())
.insert_header(("content-type", ""))
.insert_header(("Authorization", "Bearer MASTER_KEY"))
.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
let body = test::read_body(res).await;
let response: Value = serde_json::from_slice(&body).unwrap_or_default();
assert_eq!(status_code, 415);
assert_eq!(
response["message"],
json!(
r#"The Content-Type `` is invalid. Accepted values for the Content-Type header are: `application/json`"#
)
);
assert_eq!(response["code"], "invalid_content_type");
assert_eq!(response["type"], "invalid_request");
assert_eq!(
response["link"],
"https://docs.meilisearch.com/errors#invalid_content_type"
);
}
#[actix_rt::test]
async fn error_api_key_missing_content_types() {
let content = json!({
"indexes": ["products"],
"actions": [
"documents.add"
],
"expiresAt": "2050-11-13T00:00:00Z"
});
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
let app = test::init_service(create_app!(
&server.service.meilisearch,
&server.service.auth,
true,
&server.service.options,
analytics::MockAnalytics::new(&server.service.options).0
))
.await;
// post
let req = test::TestRequest::post()
.uri("/keys")
.set_payload(content.to_string())
.insert_header(("Authorization", "Bearer MASTER_KEY"))
.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
let body = test::read_body(res).await;
let response: Value = serde_json::from_slice(&body).unwrap_or_default();
assert_eq!(status_code, 415);
assert_eq!(
response["message"],
json!(
r#"A Content-Type header is missing. Accepted values for the Content-Type header are: `application/json`"#
)
);
assert_eq!(response["code"], "missing_content_type");
assert_eq!(response["type"], "invalid_request");
assert_eq!(
response["link"],
"https://docs.meilisearch.com/errors#missing_content_type"
);
// patch
let req = test::TestRequest::patch()
.uri("/keys/d0552b41536279a0ad88bd595327b96f01176a60c2243e906c52ac02375f9bc4")
.set_payload(content.to_string())
.insert_header(("Authorization", "Bearer MASTER_KEY"))
.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
let body = test::read_body(res).await;
let response: Value = serde_json::from_slice(&body).unwrap_or_default();
assert_eq!(status_code, 415);
assert_eq!(
response["message"],
json!(
r#"A Content-Type header is missing. Accepted values for the Content-Type header are: `application/json`"#
)
);
assert_eq!(response["code"], "missing_content_type");
assert_eq!(response["type"], "invalid_request");
assert_eq!(
response["link"],
"https://docs.meilisearch.com/errors#missing_content_type"
);
}
#[actix_rt::test]
async fn error_api_key_empty_payload() {
let content = "";
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
let app = test::init_service(create_app!(
&server.service.meilisearch,
&server.service.auth,
true,
&server.service.options,
analytics::MockAnalytics::new(&server.service.options).0
))
.await;
// post
let req = test::TestRequest::post()
.uri("/keys")
.set_payload(content)
.insert_header(("Authorization", "Bearer MASTER_KEY"))
.insert_header(("content-type", "application/json"))
.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
let body = test::read_body(res).await;
let response: Value = serde_json::from_slice(&body).unwrap_or_default();
assert_eq!(status_code, 400);
assert_eq!(response["code"], json!("missing_payload"));
assert_eq!(response["type"], json!("invalid_request"));
assert_eq!(
response["link"],
json!("https://docs.meilisearch.com/errors#missing_payload")
);
assert_eq!(response["message"], json!(r#"A json payload is missing."#));
// patch
let req = test::TestRequest::patch()
.uri("/keys/d0552b41536279a0ad88bd595327b96f01176a60c2243e906c52ac02375f9bc4")
.set_payload(content)
.insert_header(("Authorization", "Bearer MASTER_KEY"))
.insert_header(("content-type", "application/json"))
.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
let body = test::read_body(res).await;
let response: Value = serde_json::from_slice(&body).unwrap_or_default();
assert_eq!(status_code, 400);
assert_eq!(response["code"], json!("missing_payload"));
assert_eq!(response["type"], json!("invalid_request"));
assert_eq!(
response["link"],
json!("https://docs.meilisearch.com/errors#missing_payload")
);
assert_eq!(response["message"], json!(r#"A json payload is missing."#));
}
#[actix_rt::test]
async fn error_api_key_malformed_payload() {
let content = r#"{"malormed": "payload""#;
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
let app = test::init_service(create_app!(
&server.service.meilisearch,
&server.service.auth,
true,
&server.service.options,
analytics::MockAnalytics::new(&server.service.options).0
))
.await;
// post
let req = test::TestRequest::post()
.uri("/keys")
.set_payload(content)
.insert_header(("Authorization", "Bearer MASTER_KEY"))
.insert_header(("content-type", "application/json"))
.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
let body = test::read_body(res).await;
let response: Value = serde_json::from_slice(&body).unwrap_or_default();
assert_eq!(status_code, 400);
assert_eq!(response["code"], json!("malformed_payload"));
assert_eq!(response["type"], json!("invalid_request"));
assert_eq!(
response["link"],
json!("https://docs.meilisearch.com/errors#malformed_payload")
);
assert_eq!(
response["message"],
json!(
r#"The json payload provided is malformed. `EOF while parsing an object at line 1 column 22`."#
)
);
// patch
let req = test::TestRequest::patch()
.uri("/keys/d0552b41536279a0ad88bd595327b96f01176a60c2243e906c52ac02375f9bc4")
.set_payload(content)
.insert_header(("Authorization", "Bearer MASTER_KEY"))
.insert_header(("content-type", "application/json"))
.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
let body = test::read_body(res).await;
let response: Value = serde_json::from_slice(&body).unwrap_or_default();
assert_eq!(status_code, 400);
assert_eq!(response["code"], json!("malformed_payload"));
assert_eq!(response["type"], json!("invalid_request"));
assert_eq!(
response["link"],
json!("https://docs.meilisearch.com/errors#malformed_payload")
);
assert_eq!(
response["message"],
json!(
r#"The json payload provided is malformed. `EOF while parsing an object at line 1 column 22`."#
)
);
}


@@ -7,6 +7,7 @@ use actix_web::http::StatusCode;
use paste::paste;
use serde_json::{json, Value};
use tokio::time::sleep;
use urlencoding::encode;
use super::service::Service;
@@ -14,12 +15,12 @@ macro_rules! make_settings_test_routes {
($($name:ident),+) => {
$(paste! {
pub async fn [<update_$name>](&self, value: Value) -> (Value, StatusCode) {
let url = format!("/indexes/{}/settings/{}", self.uid, stringify!($name).replace("_", "-"));
let url = format!("/indexes/{}/settings/{}", encode(self.uid.as_ref()).to_string(), stringify!($name).replace("_", "-"));
self.service.post(url, value).await
}
pub async fn [<get_$name>](&self) -> (Value, StatusCode) {
let url = format!("/indexes/{}/settings/{}", self.uid, stringify!($name).replace("_", "-"));
let url = format!("/indexes/{}/settings/{}", encode(self.uid.as_ref()).to_string(), stringify!($name).replace("_", "-"));
self.service.get(url).await
}
})*
@@ -34,19 +35,19 @@ pub struct Index<'a> {
#[allow(dead_code)]
impl Index<'_> {
pub async fn get(&self) -> (Value, StatusCode) {
let url = format!("/indexes/{}", self.uid);
let url = format!("/indexes/{}", encode(self.uid.as_ref()));
self.service.get(url).await
}
pub async fn load_test_set(&self) -> u64 {
let url = format!("/indexes/{}/documents", self.uid);
let url = format!("/indexes/{}/documents", encode(self.uid.as_ref()));
let (response, code) = self
.service
.post_str(url, include_str!("../assets/test_set.json"))
.await;
assert_eq!(code, 202);
let update_id = response["updateId"].as_i64().unwrap();
self.wait_update_id(update_id as u64).await;
let update_id = response["uid"].as_i64().unwrap();
self.wait_task(update_id as u64).await;
update_id as u64
}
@@ -62,13 +63,13 @@ impl Index<'_> {
let body = json!({
"primaryKey": primary_key,
});
let url = format!("/indexes/{}", self.uid);
let url = format!("/indexes/{}", encode(self.uid.as_ref()));
self.service.put(url, body).await
}
pub async fn delete(&self) -> (Value, StatusCode) {
let url = format!("/indexes/{}", self.uid);
let url = format!("/indexes/{}", encode(self.uid.as_ref()));
self.service.delete(url).await
}
@@ -78,8 +79,12 @@ impl Index<'_> {
primary_key: Option<&str>,
) -> (Value, StatusCode) {
let url = match primary_key {
Some(key) => format!("/indexes/{}/documents?primaryKey={}", self.uid, key),
None => format!("/indexes/{}/documents", self.uid),
Some(key) => format!(
"/indexes/{}/documents?primaryKey={}",
encode(self.uid.as_ref()),
key
),
None => format!("/indexes/{}/documents", encode(self.uid.as_ref())),
};
self.service.post(url, documents).await
}
@@ -90,20 +95,24 @@ impl Index<'_> {
primary_key: Option<&str>,
) -> (Value, StatusCode) {
let url = match primary_key {
Some(key) => format!("/indexes/{}/documents?primaryKey={}", self.uid, key),
None => format!("/indexes/{}/documents", self.uid),
Some(key) => format!(
"/indexes/{}/documents?primaryKey={}",
encode(self.uid.as_ref()),
key
),
None => format!("/indexes/{}/documents", encode(self.uid.as_ref())),
};
self.service.put(url, documents).await
}
pub async fn wait_update_id(&self, update_id: u64) -> Value {
pub async fn wait_task(&self, update_id: u64) -> Value {
// try 10 times to get status, or panic to not wait forever
let url = format!("/indexes/{}/updates/{}", self.uid, update_id);
let url = format!("/tasks/{}", update_id);
for _ in 0..10 {
let (response, status_code) = self.service.get(&url).await;
assert_eq!(status_code, 200, "response: {}", response);
if response["status"] == "processed" || response["status"] == "failed" {
if response["status"] == "succeeded" || response["status"] == "failed" {
return response;
}
@@ -112,13 +121,13 @@ impl Index<'_> {
panic!("Timeout waiting for update id");
}
pub async fn get_update(&self, update_id: u64) -> (Value, StatusCode) {
let url = format!("/indexes/{}/updates/{}", self.uid, update_id);
pub async fn get_task(&self, update_id: u64) -> (Value, StatusCode) {
let url = format!("/indexes/{}/tasks/{}", self.uid, update_id);
self.service.get(url).await
}
pub async fn list_updates(&self) -> (Value, StatusCode) {
let url = format!("/indexes/{}/updates", self.uid);
pub async fn list_tasks(&self) -> (Value, StatusCode) {
let url = format!("/indexes/{}/tasks", self.uid);
self.service.get(url).await
}
@@ -127,12 +136,12 @@ impl Index<'_> {
id: u64,
_options: Option<GetDocumentOptions>,
) -> (Value, StatusCode) {
let url = format!("/indexes/{}/documents/{}", self.uid, id);
let url = format!("/indexes/{}/documents/{}", encode(self.uid.as_ref()), id);
self.service.get(url).await
}
pub async fn get_all_documents(&self, options: GetAllDocumentsOptions) -> (Value, StatusCode) {
let mut url = format!("/indexes/{}/documents?", self.uid);
let mut url = format!("/indexes/{}/documents?", encode(self.uid.as_ref()));
if let Some(limit) = options.limit {
url.push_str(&format!("limit={}&", limit));
}
@@ -152,39 +161,42 @@ impl Index<'_> {
}
pub async fn delete_document(&self, id: u64) -> (Value, StatusCode) {
let url = format!("/indexes/{}/documents/{}", self.uid, id);
let url = format!("/indexes/{}/documents/{}", encode(self.uid.as_ref()), id);
self.service.delete(url).await
}
pub async fn clear_all_documents(&self) -> (Value, StatusCode) {
let url = format!("/indexes/{}/documents", self.uid);
let url = format!("/indexes/{}/documents", encode(self.uid.as_ref()));
self.service.delete(url).await
}
pub async fn delete_batch(&self, ids: Vec<u64>) -> (Value, StatusCode) {
let url = format!("/indexes/{}/documents/delete-batch", self.uid);
let url = format!(
"/indexes/{}/documents/delete-batch",
encode(self.uid.as_ref())
);
self.service
.post(url, serde_json::to_value(&ids).unwrap())
.await
}
pub async fn settings(&self) -> (Value, StatusCode) {
let url = format!("/indexes/{}/settings", self.uid);
let url = format!("/indexes/{}/settings", encode(self.uid.as_ref()));
self.service.get(url).await
}
pub async fn update_settings(&self, settings: Value) -> (Value, StatusCode) {
let url = format!("/indexes/{}/settings", self.uid);
let url = format!("/indexes/{}/settings", encode(self.uid.as_ref()));
self.service.post(url, settings).await
}
pub async fn delete_settings(&self) -> (Value, StatusCode) {
let url = format!("/indexes/{}/settings", self.uid);
let url = format!("/indexes/{}/settings", encode(self.uid.as_ref()));
self.service.delete(url).await
}
pub async fn stats(&self) -> (Value, StatusCode) {
let url = format!("/indexes/{}/stats", self.uid);
let url = format!("/indexes/{}/stats", encode(self.uid.as_ref()));
self.service.get(url).await
}
@@ -209,13 +221,13 @@ impl Index<'_> {
}
pub async fn search_post(&self, query: Value) -> (Value, StatusCode) {
let url = format!("/indexes/{}/search", self.uid);
let url = format!("/indexes/{}/search", encode(self.uid.as_ref()));
self.service.post(url, query).await
}
pub async fn search_get(&self, query: Value) -> (Value, StatusCode) {
let params = serde_url_params::to_string(&query).unwrap();
let url = format!("/indexes/{}/search?{}", self.uid, params);
let url = format!("/indexes/{}/search?{}", encode(self.uid.as_ref()), params);
self.service.get(url).await
}
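
The change running through this helper is that every index uid is now percent-encoded before being spliced into a route. A minimal sketch of the idea, assuming the `urlencoding` crate used in the imports above (the `documents_url` helper name is hypothetical):

use urlencoding::encode;

// Percent-encode the uid so spaces and reserved characters stay inside one
// path segment instead of breaking the route.
fn documents_url(uid: &str) -> String {
    format!("/indexes/{}/documents", encode(uid))
}

fn main() {
    // "test test#!" (the uid used in the invalid-uid test further down)
    // becomes "test%20test%23%21".
    assert_eq!(
        documents_url("test test#!"),
        "/indexes/test%20test%23%21/documents"
    );
}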


@@ -1,13 +1,16 @@
#![allow(dead_code)]
use std::path::Path;
use actix_web::http::StatusCode;
use byte_unit::{Byte, ByteUnit};
use meilisearch_auth::AuthController;
use meilisearch_http::setup_meilisearch;
use meilisearch_lib::options::{IndexerOpts, MaxMemory};
use once_cell::sync::Lazy;
use serde_json::Value;
use tempdir::TempDir;
use urlencoding::encode;
use tempfile::TempDir;
use meilisearch_http::data::Data;
use meilisearch_http::option::{IndexerOpts, MaxMemory, Opt};
use meilisearch_http::option::Opt;
use super::index::Index;
use super::service::Service;
@@ -15,17 +18,31 @@ use super::service::Service;
pub struct Server {
pub service: Service,
// hold ownership to the tempdir while we use the server instance.
_dir: Option<tempdir::TempDir>,
_dir: Option<TempDir>,
}
pub static TEST_TEMP_DIR: Lazy<TempDir> = Lazy::new(|| TempDir::new().unwrap());
impl Server {
pub async fn new() -> Self {
let dir = TempDir::new("meilisearch").unwrap();
let dir = TempDir::new().unwrap();
let opt = default_settings(dir.path());
if cfg!(windows) {
std::env::set_var("TMP", TEST_TEMP_DIR.path());
} else {
std::env::set_var("TMPDIR", TEST_TEMP_DIR.path());
}
let data = Data::new(opt).unwrap();
let service = Service(data);
let options = default_settings(dir.path());
let meilisearch = setup_meilisearch(&options).unwrap();
let auth = AuthController::new(&options.db_path, &options.master_key).unwrap();
let service = Service {
meilisearch,
auth,
options,
api_key: None,
};
Server {
service,
@@ -33,9 +50,42 @@ impl Server {
}
}
pub async fn new_with_options(opt: Opt) -> Self {
let data = Data::new(opt).unwrap();
let service = Service(data);
pub async fn new_auth() -> Self {
let dir = TempDir::new().unwrap();
if cfg!(windows) {
std::env::set_var("TMP", TEST_TEMP_DIR.path());
} else {
std::env::set_var("TMPDIR", TEST_TEMP_DIR.path());
}
let mut options = default_settings(dir.path());
options.master_key = Some("MASTER_KEY".to_string());
let meilisearch = setup_meilisearch(&options).unwrap();
let auth = AuthController::new(&options.db_path, &options.master_key).unwrap();
let service = Service {
meilisearch,
auth,
options,
api_key: None,
};
Server {
service,
_dir: Some(dir),
}
}
pub async fn new_with_options(options: Opt) -> Self {
let meilisearch = setup_meilisearch(&options).unwrap();
let auth = AuthController::new(&options.db_path, &options.master_key).unwrap();
let service = Service {
meilisearch,
auth,
options,
api_key: None,
};
Server {
service,
@@ -46,7 +96,7 @@ impl Server {
/// Returns a view to an index. There is no guarantee that the index exists.
pub fn index(&self, uid: impl AsRef<str>) -> Index<'_> {
Index {
uid: encode(uid.as_ref()),
uid: uid.as_ref().to_string(),
service: &self.service,
}
}
@@ -62,6 +112,14 @@ impl Server {
pub async fn stats(&self) -> (Value, StatusCode) {
self.service.get("/stats").await
}
pub async fn tasks(&self) -> (Value, StatusCode) {
self.service.get("/tasks").await
}
pub async fn get_dump_status(&self, uid: &str) -> (Value, StatusCode) {
self.service.get(format!("/dumps/{}/status", uid)).await
}
}
pub fn default_settings(dir: impl AsRef<Path>) -> Opt {
@@ -72,9 +130,9 @@ pub fn default_settings(dir: impl AsRef<Path>) -> Opt {
master_key: None,
env: "development".to_owned(),
#[cfg(all(not(debug_assertions), feature = "analytics"))]
no_analytics: true,
no_analytics: Some(Some(true)),
max_index_size: Byte::from_unit(4.0, ByteUnit::GiB).unwrap(),
max_udb_size: Byte::from_unit(4.0, ByteUnit::GiB).unwrap(),
max_task_db_size: Byte::from_unit(4.0, ByteUnit::GiB).unwrap(),
http_payload_size_limit: Byte::from_unit(10.0, ByteUnit::MiB).unwrap(),
ssl_cert_path: None,
ssl_key_path: None,

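The switch above from the `tempdir` crate to `tempfile` also changes the constructor: `tempfile::TempDir::new()` takes no prefix argument, and a named prefix goes through `Builder` instead. A minimal sketch, assuming the `tempfile` crate:

use tempfile::{Builder, TempDir};

fn main() -> std::io::Result<()> {
    // No prefix argument, unlike the old tempdir::TempDir::new("meilisearch").
    let plain = TempDir::new()?;
    // Equivalent of the old prefixed directory, if one is still wanted.
    let named = Builder::new().prefix("meilisearch").tempdir()?;
    println!("{:?} {:?}", plain.path(), named.path());
    // Both directories are deleted when the guards are dropped.
    Ok(())
}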

@@ -1,19 +1,33 @@
use actix_web::{http::StatusCode, test};
use meilisearch_auth::AuthController;
use meilisearch_lib::MeiliSearch;
use serde_json::Value;
use meilisearch_http::create_app;
use meilisearch_http::data::Data;
use meilisearch_http::{analytics, create_app, Opt};
pub struct Service(pub Data);
pub struct Service {
pub meilisearch: MeiliSearch,
pub auth: AuthController,
pub options: Opt,
pub api_key: Option<String>,
}
impl Service {
pub async fn post(&self, url: impl AsRef<str>, body: Value) -> (Value, StatusCode) {
let app = test::init_service(create_app!(&self.0, true)).await;
let app = test::init_service(create_app!(
&self.meilisearch,
&self.auth,
true,
&self.options,
analytics::MockAnalytics::new(&self.options).0
))
.await;
let req = test::TestRequest::post()
.uri(url.as_ref())
.set_json(&body)
.to_request();
let mut req = test::TestRequest::post().uri(url.as_ref()).set_json(&body);
if let Some(api_key) = &self.api_key {
req = req.insert_header(("Authorization", ["Bearer ", api_key].concat()));
}
let req = req.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
@@ -28,13 +42,23 @@ impl Service {
url: impl AsRef<str>,
body: impl AsRef<str>,
) -> (Value, StatusCode) {
let app = test::init_service(create_app!(&self.0, true)).await;
let app = test::init_service(create_app!(
&self.meilisearch,
&self.auth,
true,
&self.options,
analytics::MockAnalytics::new(&self.options).0
))
.await;
let req = test::TestRequest::post()
let mut req = test::TestRequest::post()
.uri(url.as_ref())
.set_payload(body.as_ref().to_string())
.insert_header(("content-type", "application/json"))
.to_request();
.insert_header(("content-type", "application/json"));
if let Some(api_key) = &self.api_key {
req = req.insert_header(("Authorization", ["Bearer ", api_key].concat()));
}
let req = req.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
@@ -44,9 +68,20 @@ impl Service {
}
pub async fn get(&self, url: impl AsRef<str>) -> (Value, StatusCode) {
let app = test::init_service(create_app!(&self.0, true)).await;
let app = test::init_service(create_app!(
&self.meilisearch,
&self.auth,
true,
&self.options,
analytics::MockAnalytics::new(&self.options).0
))
.await;
let req = test::TestRequest::get().uri(url.as_ref()).to_request();
let mut req = test::TestRequest::get().uri(url.as_ref());
if let Some(api_key) = &self.api_key {
req = req.insert_header(("Authorization", ["Bearer ", api_key].concat()));
}
let req = req.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
@@ -56,12 +91,43 @@ impl Service {
}
pub async fn put(&self, url: impl AsRef<str>, body: Value) -> (Value, StatusCode) {
let app = test::init_service(create_app!(&self.0, true)).await;
let app = test::init_service(create_app!(
&self.meilisearch,
&self.auth,
true,
&self.options,
analytics::MockAnalytics::new(&self.options).0
))
.await;
let req = test::TestRequest::put()
.uri(url.as_ref())
.set_json(&body)
.to_request();
let mut req = test::TestRequest::put().uri(url.as_ref()).set_json(&body);
if let Some(api_key) = &self.api_key {
req = req.insert_header(("Authorization", ["Bearer ", api_key].concat()));
}
let req = req.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
let body = test::read_body(res).await;
let response = serde_json::from_slice(&body).unwrap_or_default();
(response, status_code)
}
pub async fn patch(&self, url: impl AsRef<str>, body: Value) -> (Value, StatusCode) {
let app = test::init_service(create_app!(
&self.meilisearch,
&self.auth,
true,
&self.options,
analytics::MockAnalytics::new(&self.options).0
))
.await;
let mut req = test::TestRequest::patch().uri(url.as_ref()).set_json(&body);
if let Some(api_key) = &self.api_key {
req = req.insert_header(("Authorization", ["Bearer ", api_key].concat()));
}
let req = req.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
@@ -71,9 +137,20 @@ impl Service {
}
pub async fn delete(&self, url: impl AsRef<str>) -> (Value, StatusCode) {
let app = test::init_service(create_app!(&self.0, true)).await;
let app = test::init_service(create_app!(
&self.meilisearch,
&self.auth,
true,
&self.options,
analytics::MockAnalytics::new(&self.options).0
))
.await;
let req = test::TestRequest::delete().uri(url.as_ref()).to_request();
let mut req = test::TestRequest::delete().uri(url.as_ref());
if let Some(api_key) = &self.api_key {
req = req.insert_header(("Authorization", ["Bearer ", api_key].concat()));
}
let req = req.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();

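Each verb method above repeats the same three steps: build the `TestRequest`, attach `Authorization: Bearer <key>` only when `api_key` is set, then finish with `to_request()`. A minimal sketch of that conditional step in isolation (the `with_auth` helper is hypothetical, not part of the diff):

use actix_web::test::TestRequest;

// Attach the bearer token only when the test service has an API key,
// mirroring the pattern repeated in post/get/put/patch/delete above.
fn with_auth(req: TestRequest, api_key: Option<&str>) -> TestRequest {
    match api_key {
        Some(key) => req.insert_header(("Authorization", format!("Bearer {}", key))),
        None => req,
    }
}

fn main() {
    let req = with_auth(TestRequest::get().uri("/keys"), Some("MASTER_KEY"));
    let _req = req.to_request();
}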

@@ -0,0 +1,150 @@
#![allow(dead_code)]
mod common;
use crate::common::Server;
use actix_web::test;
use meilisearch_http::{analytics, create_app};
use serde_json::{json, Value};
#[actix_rt::test]
async fn error_json_bad_content_type() {
let routes = [
// all the POST routes except the dumps that can be created without any body or content-type
// and the search that is not a strict json
"/indexes",
"/indexes/doggo/documents/delete-batch",
"/indexes/doggo/search",
"/indexes/doggo/settings",
"/indexes/doggo/settings/displayed-attributes",
"/indexes/doggo/settings/distinct-attribute",
"/indexes/doggo/settings/filterable-attributes",
"/indexes/doggo/settings/ranking-rules",
"/indexes/doggo/settings/searchable-attributes",
"/indexes/doggo/settings/sortable-attributes",
"/indexes/doggo/settings/stop-words",
"/indexes/doggo/settings/synonyms",
];
let bad_content_types = [
"application/csv",
"application/x-ndjson",
"application/x-www-form-urlencoded",
"text/plain",
"json",
"application",
"json/application",
];
let document = "{}";
let server = Server::new().await;
let app = test::init_service(create_app!(
&server.service.meilisearch,
&server.service.auth,
true,
&server.service.options,
analytics::MockAnalytics::new(&server.service.options).0
))
.await;
for route in routes {
// Good content-type, we probably have an error since we didn't send anything in the json
// so we only ensure we didn't get a bad media type error.
let req = test::TestRequest::post()
.uri(route)
.set_payload(document)
.insert_header(("content-type", "application/json"))
.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
assert_ne!(status_code, 415,
"calling the route `{}` with a content-type of json isn't supposed to throw a bad media type error", route);
// No content-type.
let req = test::TestRequest::post()
.uri(route)
.set_payload(document)
.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
let body = test::read_body(res).await;
let response: Value = serde_json::from_slice(&body).unwrap_or_default();
assert_eq!(status_code, 415, "calling the route `{}` without content-type is supposed to throw a bad media type error", route);
assert_eq!(
response,
json!({
"message": r#"A Content-Type header is missing. Accepted values for the Content-Type header are: `application/json`"#,
"code": "missing_content_type",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#missing_content_type",
}),
"when calling the route `{}` with no content-type",
route,
);
for bad_content_type in bad_content_types {
// Always bad content-type
let req = test::TestRequest::post()
.uri(route)
.set_payload(document.to_string())
.insert_header(("content-type", bad_content_type))
.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
let body = test::read_body(res).await;
let response: Value = serde_json::from_slice(&body).unwrap_or_default();
assert_eq!(status_code, 415);
let expected_error_message = format!(
r#"The Content-Type `{}` is invalid. Accepted values for the Content-Type header are: `application/json`"#,
bad_content_type
);
assert_eq!(
response,
json!({
"message": expected_error_message,
"code": "invalid_content_type",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#invalid_content_type",
}),
"when calling the route `{}` with a content-type of `{}`",
route,
bad_content_type,
);
}
}
}
#[actix_rt::test]
async fn extract_actual_content_type() {
let route = "/indexes/doggo/documents";
let documents = "[{}]";
let server = Server::new().await;
let app = test::init_service(create_app!(
&server.service.meilisearch,
&server.service.auth,
true,
&server.service.options,
analytics::MockAnalytics::new(&server.service.options).0
))
.await;
// Good content-type, we probably have an error since we didn't send anything in the json
// so we only ensure we didn't get a bad media type error.
let req = test::TestRequest::post()
.uri(route)
.set_payload(documents)
.insert_header(("content-type", "application/json; charset=utf-8"))
.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
assert_ne!(status_code, 415,
"calling the route `{}` with a content-type of json isn't supposed to throw a bad media type error", route);
let req = test::TestRequest::put()
.uri(route)
.set_payload(documents)
.insert_header(("content-type", "application/json; charset=latin-1"))
.to_request();
let res = test::call_service(&app, req).await;
let status_code = res.status();
assert_ne!(status_code, 415,
"calling the route `{}` with a content-type of json isn't supposed to throw a bad media type error", route);
}
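
The test above only checks that `application/json; charset=utf-8` is not rejected; it does not show how the server compares media types. One plausible way to do that comparison, sketched with the `mime` crate (an assumption, not the actual Meilisearch implementation), is to parse the header and look only at the type and subtype, ignoring parameters such as `charset`:

use mime::Mime;

// Accept any `application/json`, whatever parameters follow the subtype.
fn is_json(content_type: &str) -> bool {
    content_type
        .parse::<Mime>()
        .map(|m| m.type_() == mime::APPLICATION && m.subtype() == mime::JSON)
        .unwrap_or(false)
}

fn main() {
    assert!(is_json("application/json"));
    assert!(is_json("application/json; charset=utf-8"));
    assert!(!is_json("text/plain"));
    assert!(!is_json("json")); // no type/subtype pair, fails to parse
}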


@@ -0,0 +1,24 @@
use crate::common::Server;
#[actix_rt::test]
async fn dashboard_assets_load() {
let server = Server::new().await;
mod generated {
include!(concat!(env!("OUT_DIR"), "/generated.rs"));
}
let generated = generated::generate();
for (path, _) in generated.into_iter() {
let path = if path == "index.html" {
// "index.html" redirects to "/"
"/".to_owned()
} else {
"/".to_owned() + path
};
let (_, status_code) = server.service.get(&path).await;
assert_eq!(status_code, 200);
}
}

File diff suppressed because it is too large


@@ -5,8 +5,13 @@ use crate::common::{GetAllDocumentsOptions, Server};
#[actix_rt::test]
async fn delete_one_document_unexisting_index() {
let server = Server::new().await;
let (_response, code) = server.index("test").delete_document(0).await;
assert_eq!(code, 404);
let index = server.index("test");
let (_response, code) = index.delete_document(0).await;
assert_eq!(code, 202);
let response = index.wait_task(0).await;
assert_eq!(response["status"], "failed");
}
#[actix_rt::test]
@@ -16,8 +21,8 @@ async fn delete_one_unexisting_document() {
index.create(None).await;
let (response, code) = index.delete_document(0).await;
assert_eq!(code, 202, "{}", response);
let update = index.wait_update_id(0).await;
assert_eq!(update["status"], "processed");
let update = index.wait_task(0).await;
assert_eq!(update["status"], "succeeded");
}
#[actix_rt::test]
@@ -27,10 +32,10 @@ async fn delete_one_document() {
index
.add_documents(json!([{ "id": 0, "content": "foobar" }]), None)
.await;
index.wait_update_id(0).await;
index.wait_task(0).await;
let (_response, code) = server.index("test").delete_document(0).await;
assert_eq!(code, 202);
index.wait_update_id(1).await;
index.wait_task(1).await;
let (_response, code) = index.get_document(0, None).await;
assert_eq!(code, 404);
@@ -39,8 +44,13 @@ async fn delete_one_document() {
#[actix_rt::test]
async fn clear_all_documents_unexisting_index() {
let server = Server::new().await;
let (_response, code) = server.index("test").clear_all_documents().await;
assert_eq!(code, 404);
let index = server.index("test");
let (_response, code) = index.clear_all_documents().await;
assert_eq!(code, 202);
let response = index.wait_task(0).await;
assert_eq!(response["status"], "failed");
}
#[actix_rt::test]
@@ -53,11 +63,11 @@ async fn clear_all_documents() {
None,
)
.await;
index.wait_update_id(0).await;
index.wait_task(0).await;
let (_response, code) = index.clear_all_documents().await;
assert_eq!(code, 202);
let _update = index.wait_update_id(1).await;
let _update = index.wait_task(1).await;
let (response, code) = index
.get_all_documents(GetAllDocumentsOptions::default())
.await;
@@ -74,7 +84,7 @@ async fn clear_all_documents_empty_index() {
let (_response, code) = index.clear_all_documents().await;
assert_eq!(code, 202);
let _update = index.wait_update_id(0).await;
let _update = index.wait_task(0).await;
let (response, code) = index
.get_all_documents(GetAllDocumentsOptions::default())
.await;
@@ -83,10 +93,22 @@ async fn clear_all_documents_empty_index() {
}
#[actix_rt::test]
async fn delete_batch_unexisting_index() {
async fn error_delete_batch_unexisting_index() {
let server = Server::new().await;
let (response, code) = server.index("test").delete_batch(vec![]).await;
assert_eq!(code, 404, "{}", response);
let index = server.index("test");
let (_, code) = index.delete_batch(vec![]).await;
let expected_response = json!({
"message": "Index `test` not found.",
"code": "index_not_found",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#index_not_found"
});
assert_eq!(code, 202);
let response = index.wait_task(0).await;
assert_eq!(response["status"], "failed");
assert_eq!(response["error"], expected_response);
}
#[actix_rt::test]
@@ -94,11 +116,11 @@ async fn delete_batch() {
let server = Server::new().await;
let index = server.index("test");
index.add_documents(json!([{ "id": 1, "content": "foobar" }, { "id": 0, "content": "foobar" }, { "id": 3, "content": "foobar" }]), Some("id")).await;
index.wait_update_id(0).await;
index.wait_task(0).await;
let (_response, code) = index.delete_batch(vec![1, 0]).await;
assert_eq!(code, 202);
let _update = index.wait_update_id(1).await;
let _update = index.wait_task(1).await;
let (response, code) = index
.get_all_documents(GetAllDocumentsOptions::default())
.await;
@@ -112,11 +134,11 @@ async fn delete_no_document_batch() {
let server = Server::new().await;
let index = server.index("test");
index.add_documents(json!([{ "id": 1, "content": "foobar" }, { "id": 0, "content": "foobar" }, { "id": 3, "content": "foobar" }]), Some("id")).await;
index.wait_update_id(0).await;
index.wait_task(0).await;
let (_response, code) = index.delete_batch(vec![]).await;
assert_eq!(code, 202, "{}", _response);
let _update = index.wait_update_id(1).await;
let _update = index.wait_task(1).await;
let (response, code) = index
.get_all_documents(GetAllDocumentsOptions::default())
.await;


@@ -13,11 +13,21 @@ async fn get_unexisting_index_single_document() {
}
#[actix_rt::test]
async fn get_unexisting_document() {
async fn error_get_unexisting_document() {
let server = Server::new().await;
let index = server.index("test");
index.create(None).await;
let (_response, code) = index.get_document(1, None).await;
index.wait_task(0).await;
let (response, code) = index.get_document(1, None).await;
let expected_response = json!({
"message": "Document `1` not found.",
"code": "document_not_found",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#document_not_found"
});
assert_eq!(response, expected_response);
assert_eq!(code, 404);
}
@@ -34,7 +44,7 @@ async fn get_document() {
]);
let (_, code) = index.add_documents(documents, None).await;
assert_eq!(code, 202);
index.wait_update_id(0).await;
index.wait_task(0).await;
let (response, code) = index.get_document(0, None).await;
assert_eq!(code, 200);
assert_eq!(
@@ -47,21 +57,32 @@ async fn get_document() {
}
#[actix_rt::test]
async fn get_unexisting_index_all_documents() {
async fn error_get_unexisting_index_all_documents() {
let server = Server::new().await;
let (_response, code) = server
let (response, code) = server
.index("test")
.get_all_documents(GetAllDocumentsOptions::default())
.await;
let expected_response = json!({
"message": "Index `test` not found.",
"code": "index_not_found",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#index_not_found"
});
assert_eq!(response, expected_response);
assert_eq!(code, 404);
}
#[actix_rt::test]
async fn get_no_documents() {
async fn get_no_document() {
let server = Server::new().await;
let index = server.index("test");
let (_, code) = index.create(None).await;
assert_eq!(code, 201);
assert_eq!(code, 202);
index.wait_task(0).await;
let (response, code) = index
.get_all_documents(GetAllDocumentsOptions::default())


@@ -0,0 +1,22 @@
#![allow(dead_code)]
mod common;
use crate::common::Server;
use serde_json::json;
#[actix_rt::test]
async fn get_unexisting_dump_status() {
let server = Server::new().await;
let (response, code) = server.get_dump_status("foobar").await;
assert_eq!(code, 404);
let expected_response = json!({
"message": "Dump `foobar` not found.",
"code": "dump_not_found",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#dump_not_found"
});
assert_eq!(response, expected_response);
}


@@ -1,5 +1,5 @@
use crate::common::Server;
use serde_json::Value;
use serde_json::{json, Value};
#[actix_rt::test]
async fn create_index_no_primary_key() {
@@ -7,14 +7,15 @@ async fn create_index_no_primary_key() {
let index = server.index("test");
let (response, code) = index.create(None).await;
assert_eq!(code, 201);
assert_eq!(response["uid"], "test");
assert_eq!(response["name"], "test");
assert!(response.get("createdAt").is_some());
assert!(response.get("updatedAt").is_some());
assert_eq!(response["createdAt"], response["updatedAt"]);
assert_eq!(response["primaryKey"], Value::Null);
assert_eq!(response.as_object().unwrap().len(), 5);
assert_eq!(code, 202);
assert_eq!(response["status"], "enqueued");
let response = index.wait_task(0).await;
assert_eq!(response["status"], "succeeded");
assert_eq!(response["type"], "indexCreation");
assert_eq!(response["details"]["primaryKey"], Value::Null);
}
#[actix_rt::test]
@@ -23,36 +24,31 @@ async fn create_index_with_primary_key() {
let index = server.index("test");
let (response, code) = index.create(Some("primary")).await;
assert_eq!(code, 201);
assert_eq!(response["uid"], "test");
assert_eq!(response["name"], "test");
assert!(response.get("createdAt").is_some());
assert!(response.get("updatedAt").is_some());
//assert_eq!(response["createdAt"], response["updatedAt"]);
assert_eq!(response["primaryKey"], "primary");
assert_eq!(response.as_object().unwrap().len(), 5);
}
assert_eq!(code, 202);
// TODO: partial test since we are testing error, and error is not yet fully implemented in
// transplant
#[actix_rt::test]
async fn create_existing_index() {
let server = Server::new().await;
let index = server.index("test");
let (_, code) = index.create(Some("primary")).await;
assert_eq!(response["status"], "enqueued");
assert_eq!(code, 201);
let response = index.wait_task(0).await;
let (_response, code) = index.create(Some("primary")).await;
assert_eq!(code, 400);
assert_eq!(response["status"], "succeeded");
assert_eq!(response["type"], "indexCreation");
assert_eq!(response["details"]["primaryKey"], "primary");
}
#[actix_rt::test]
async fn create_with_invalid_index_uid() {
async fn create_index_with_invalid_primary_key() {
let document = json!([ { "id": 2, "title": "Pride and Prejudice" } ]);
let server = Server::new().await;
let index = server.index("test test#!");
let (_, code) = index.create(None).await;
assert_eq!(code, 400);
let index = server.index("movies");
let (_response, code) = index.add_documents(document, Some("title")).await;
assert_eq!(code, 202);
index.wait_task(0).await;
let (response, code) = index.get().await;
assert_eq!(code, 200);
assert_eq!(response["primaryKey"], Value::Null);
}
#[actix_rt::test]
@@ -67,8 +63,51 @@ async fn test_create_multiple_indexes() {
index2.create(None).await;
index3.create(None).await;
index1.wait_task(0).await;
index1.wait_task(1).await;
index1.wait_task(2).await;
assert_eq!(index1.get().await.1, 200);
assert_eq!(index2.get().await.1, 200);
assert_eq!(index3.get().await.1, 200);
assert_eq!(index4.get().await.1, 404);
}
#[actix_rt::test]
async fn error_create_existing_index() {
let server = Server::new().await;
let index = server.index("test");
let (_, code) = index.create(Some("primary")).await;
assert_eq!(code, 202);
index.create(Some("primary")).await;
let response = index.wait_task(1).await;
let expected_response = json!({
"message": "Index `test` already exists.",
"code": "index_already_exists",
"type": "invalid_request",
"link":"https://docs.meilisearch.com/errors#index_already_exists"
});
assert_eq!(response["error"], expected_response);
}
#[actix_rt::test]
async fn error_create_with_invalid_index_uid() {
let server = Server::new().await;
let index = server.index("test test#!");
let (response, code) = index.create(None).await;
let expected_response = json!({
"message": "`test test#!` is not a valid index uid. Index uid can be an integer or a string containing only alphanumeric characters, hyphens (-) and underscores (_).",
"code": "invalid_index_uid",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#invalid_index_uid"
});
assert_eq!(response, expected_response);
assert_eq!(code, 400);
}


@@ -8,33 +8,59 @@ async fn create_and_delete_index() {
let index = server.index("test");
let (_response, code) = index.create(None).await;
assert_eq!(code, 201);
assert_eq!(code, 202);
index.wait_task(0).await;
assert_eq!(index.get().await.1, 200);
let (_response, code) = index.delete().await;
assert_eq!(code, 204);
assert_eq!(code, 202);
index.wait_task(1).await;
assert_eq!(index.get().await.1, 404);
}
#[actix_rt::test]
async fn delete_unexisting_index() {
async fn error_delete_unexisting_index() {
let server = Server::new().await;
let index = server.index("test");
let (_response, code) = index.delete().await;
let (_, code) = index.delete().await;
assert_eq!(code, 404);
assert_eq!(code, 202);
let expected_response = json!({
"message": "Index `test` not found.",
"code": "index_not_found",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#index_not_found"
});
let response = index.wait_task(0).await;
assert_eq!(response["status"], "failed");
assert_eq!(response["error"], expected_response);
}
#[cfg(not(windows))]
#[actix_rt::test]
async fn loop_delete_add_documents() {
let server = Server::new().await;
let index = server.index("test");
let documents = json!([{"id": 1, "field1": "hello"}]);
let mut tasks = Vec::new();
for _ in 0..50 {
let (response, code) = index.add_documents(documents.clone(), None).await;
tasks.push(response["uid"].as_u64().unwrap());
assert_eq!(code, 202, "{}", response);
let (response, code) = index.delete().await;
assert_eq!(code, 204, "{}", response);
tasks.push(response["uid"].as_u64().unwrap());
assert_eq!(code, 202, "{}", response);
}
for task in tasks {
let response = index.wait_task(task).await;
assert_eq!(response["status"], "succeeded", "{}", response);
}
}

Some files were not shown because too many files have changed in this diff