Commit Graph

3174 Commits

Author SHA1 Message Date
53fc2edab3 Merge #1897
1897: Add ARM image for Docker to CI r=irevoire a=curquiza

Fixes #1315 

- [x] Publish MeiliSearch's docker image for `arm64`
- [x] Add `workflow_dispatch` event in case we need to re-trigger it after a failure without creating a new release
- [x] Use our own server to run the github runner since this CI is really slow (1h instead of 4h)
- [x] Open an issue for a refactor by merging both files in one file (https://github.com/meilisearch/MeiliSearch/issues/1901)

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-11-15 16:16:34 +00:00
b7c5b78a61 Remove workflow_dispatch 2021-11-15 14:13:40 +01:00
f081dc2001 Remove checkout 2021-11-12 21:31:18 +01:00
2cf7daa227 Use self-hosted runner 2021-11-12 18:49:02 +01:00
9d75fbc619 Fix docker meta job 2021-11-11 19:42:30 +01:00
3b1b9a277b Remove context 2021-11-11 16:38:45 +01:00
40e87b9544 Add checkout 2021-11-11 16:37:01 +01:00
ded7922be5 Add context and platform 2021-11-11 16:30:16 +01:00
11ef64ee43 Fix credentials 2021-11-11 16:02:32 +01:00
5e6d7b7649 Add worflow dispatch event 2021-11-11 16:00:10 +01:00
5fd9616b5f Add ARM image for Docker to CI 2021-11-11 15:58:44 +01:00
15cb4dafa9 Merge #1882
1882: Remove release drafter r=curquiza a=curquiza

Remove release drafter since it's not used at the moment due to the specific release process of MeiliSearch.

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-11-09 14:36:32 +00:00
8ca76d9fdf Merge #1884
1884: Remove Hacktoberfest section from CONTRIBUTING.md r=curquiza a=meili-bot

_This PR is auto-generated._

Remove Hacktoberfest section from CONTRIBUTING.md


Co-authored-by: meili-bot <74670311+meili-bot@users.noreply.github.com>
2021-11-09 13:19:30 +00:00
f62e52ec68 Update CONTRIBUTING.md 2021-11-09 14:15:50 +01:00
bf01c674ea Remove release drafter 2021-11-08 14:49:50 +01:00
ec0716ddd1 Merge #1874
1874: Fix small typo in download-latest.sh r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2021-11-04 09:50:55 +00:00
f7f2421e71 Update download-latest.sh 2021-11-03 17:09:50 +01:00
c32f13a909 Merge #1800
1800: Analytics r=irevoire a=irevoire

Closes #1784
Implements [this spec](https://github.com/meilisearch/specifications/blob/update-analytics-specs/text/0034-telemetry-policies.md) 

# Anonymous Analytics Policy

## 1. Functional Specification

### I. Summary

This specification describes an exhaustive list of anonymous metrics collected by the MeiliSearch binary. It also describes the tools we use for this collection and how we identify a Meilisearch instance.

### II. Motivation

At MeiliSearch, our vision is to provide an easy-to-use search solution that meets the essential needs of our users. At all times, we strive to understand our users better and meet their expectations in the best possible way.

Although we can gather needs and understand our users through several channels such as Github, Slack, surveys, interviews or roadmap votes, we realize that this is not enough to have a complete view of MeiliSearch usage and features adoption. By cross-referencing our product discovery phases with aggregated quantitative data, we want to make the product much better than what it is today. Our decision-making will be taken a step further to make a product that users love.

### III. Explanation

#### General Data Protection Regulation (GDPR)

The metrics collected are non-sensitive, non-personal and do not identify an individual or a group of individuals using MeiliSearch. The data collected is secured and anonymized. We do not collect any data from the values stored in the documents.

We, the MeiliSearch team, provide an email address so that users can request the removal of their data: privacy@meilisearch.com.<br>
Thanks to the unique identifier generated for their MeiliSearch installation (`Instance uuid` when launching MeiliSearch), we can remove the corresponding data from all the tools we describe below. Any questions regarding the management of the data collected can be sent to the email address as well.

#### Tools

##### Segment

The collected data is sent to [Segment](https://segment.com/). Segment is a platform for data collection and provides data management tools.

##### Amplitude

[Amplitude](https://amplitude.com/) is a tool for graphing and highlighting collected data. Segment feeds Amplitude so that we can build visualizations according to our needs.

-----------
# The `identify` call we send every hour:

## System Configuration `system`

This property allows us to gather essential information to better understand on which type of machine MeiliSearch is used. This allows us to better advise users on the machines to choose according to their data volume and their use-cases.

 - [x] `system` => Never changes but still sent every hours
     - [x] distribution | On which distribution MeiliSearch is launched, eg: Arch Linux
     - [x] kernel_version | On which kernel version MeiliSearch is launched, eg: 5.14.10-arch1-1
     - [x] cores | How many cores does the machine have, eg: 24
     - [x] ram_size | Total capacity of the machine's ram. Expressed in `Kb`, eg: 33604210
     - [x] disk_size | Total capacity of the biggest disk. Expressed in `Kb`, eg: 336042103
     - [x] server_provider | Users can tell us on which provider MeiliSearch is hosted by filling the `MEILI_SERVER_PROVIDER` env var. This is also filled by our providers deploy scripts. e.g. GCP [cloud-config.yaml](56a7c2630c/scripts/providers/gcp/cloud-config.yaml (L33)), eg: gcp

## MeiliSearch Configuration

- [x] `context.app.version`: MeiliSearch version, eg: 0.23.0
- [x] `env`: `production` / `development`, eg: `production`
- [x] `has_snapshot`: Does the MeiliSearch instance has snapshot activated, eg: `true`

## MeiliSearch Statistics `stats`

 - [x] `stats`
     - [x] `database_size`: Size of indexed data. Expressed in `Kb`, eg: 180230
     - [x] `indexes_number`: Number of indexes, eg: 2
     - [x] `documents_number`: Number of indexed documents, eg: 165847
     - [x] `start_since_days`: How many days ago was the instance launched?, eg: 328

---------

- [x] Launched | This is the first event sent to mark that MeiliSearch is launched a first time

---------

- [x] `Documents Searched POST`: The Documents Searched event is sent once an hour. The event's properties are averaged over all search operations during that time so as not to track everything and generate unnecessary noise.
  - [x] `user-agent`: Represents all the user-agents encountered on this endpoint during one hour, eg: `["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]`
  - [x] `requests`
      - [x] `99th_response_time`: The maximum latency, in ms, for the fastest 99% of requests, eg: `57ms`
      - [x] `total_suceeded`: The total number of succeeded search requests, eg: `3456`
      - [x] `total_failed`: The total number of failed search requests, eg: `24`
      - [x] `total_received`: The total number of received search requests, eg: `3480`
  - [x] `sort`
      - [x] `with_geoPoint`: Does the built-in sort rule _geoPoint rule has been used?, eg: `true` /`false`
      - [x] `avg_criteria_number`: The average number of sort criteria among all the requests containing the sort parameter. "sort": [] equals to 0 while not sending sort does not influence the average, eg: `2`
  - [x] `filter`
      - [x] `with_geoRadius`: Does the built-in filter rule _geoRadius has been used?, eg: `true` /`false`
      - [x] `avg_criteria_number`: The average number of filter criteria among all the requests containing the filter parameter. "filter": [] equals to 0 while not sending filter does not influence the average, eg: `4`
      - [x] `most_used_syntax`: The most used filter syntax among all the requests containing the requests containing the filter parameter. `string` / `array` / `mixed`, `mixed`
  - [x] `q`
      - [x] `avg_terms_number`: The average number of terms for the `q` parameter among all requests, eg: `5`
  - [x] `pagination`:
      - [x] `max_limit`: The maximum limit encountered among all requests, eg: `20`
      - [x] `max_offset`: The maxium offset encountered among all requests, eg: `1000` 

---

- [x] `Documents Searched GET`: The Documents Searched event is sent once an hour. The event's properties are averaged over all search operations during that time so as not to track everything and generate unnecessary noise.
  - [x] `user-agent`: Represents all the user-agents encountered on this endpoint during one hour, eg: `["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]`
  - [x] `requests`
      - [x] `99th_response_time`: The maximum latency, in ms, for the fastest 99% of requests, eg: `57ms`
      - [x] `total_suceeded`: The total number of succeeded search requests, eg: `3456`
      - [x] `total_failed`: The total number of failed search requests, eg: `24`
      - [x] `total_received`: The total number of received search requests, eg: `3480`
  - [x] `sort`
      - [x] `with_geoPoint`: Does the built-in sort rule _geoPoint rule has been used?, eg: `true` /`false`
      - [x] `avg_criteria_number`: The average number of sort criteria among all the requests containing the sort parameter. "sort": [] equals to 0 while not sending sort does not influence the average, eg: `2`
  - [x] `filter`
      - [x] `with_geoRadius`: Does the built-in filter rule _geoRadius has been used?, eg: `true` /`false`
      - [x] `avg_criteria_number`: The average number of filter criteria among all the requests containing the filter parameter. "filter": [] equals to 0 while not sending filter does not influence the average, eg: `4`
      - [x] `most_used_syntax`: The most used filter syntax among all the requests containing the requests containing the filter parameter. `string` / `array` / `mixed`, `mixed`
  - [x] `q`
      - [x] `avg_terms_number`: The average number of terms for the `q` parameter among all requests, eg: `5`
  - [x] `pagination`:
      - [x] `max_limit`: The maximum limit encountered among all requests, eg: `20`
      - [x] `max_offset`: The maxium offset encountered among all requests, eg: `1000` 

---

- [x] `Index Created`
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `primary_key`: The name of the field used as primary key if set, otherwise `null`, eg: `id`

---

- [x] `Index Updated`
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `primary_key`: The name of the field used as primary key if set, otherwise `null`, eg: `id`

---

- [x] `Documents Added`: The Documents Added event is sent once an hour. The event's properties are averaged over all POST /documents additions operations during that time to not track everything and generate unnecessary noise.
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `payload_type`: Represents all the `payload_type` encountered on this endpoint during one hour, eg: [`text/csv`]
  - [x] `primary_key`: The name of the field used as primary key if set, otherwise `null`, eg: `id`
  - [x] `index_creation`: Does an index creation happened, eg: `false`

---

- [x] `Documents Updated`: The Documents Added event is sent once an hour. The event's properties are averaged over all PUT /documents additions operations during that time to not track everything and generate unnecessary noise.
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `payload_type`: Represents all the `payload_type` encountered on this endpoint during one hour, eg: [`application/json`]
  - [x] `primary_key`: The name of the field used as primary key if set, otherwise `null`, eg: `id`
  - [x] `index_creation`: Does an index creation happened, eg: `false`

---

- [x] Settings Updated
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `ranking_rules`
      - [x] `sort_position`: Position of the `sort` ranking rule if any, otherwise `null`, eg: `5`
  - [x] `sortable_attributes`
      - [x] `total`: Number of sortable attributes, eg: `3`
      - [x] `has_geo`: Indicate if `_geo` is set as a sortable attribute, eg: `false`
  - [x] `filterable_attributes`
      - [x] `total`: Number of filterable attributes, eg: `3`
      - [x] `has_geo`: Indicate if `_geo` is set as a filterable attribute, eg: `false`

---

- [x] `RankingRules Updated`
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `sort_position`: Position of the `sort` ranking rule if any, otherwise `null`, eg: `5`

---

- [x] `SortableAttributes Updated`
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `total`: Number of sortable attributes, eg: `3`
  - [x] `has_geo`: Indicate if `_geo` is set as a sortable attribute, eg: `false`

---

- [x] `FilterableAttributes Updated`
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `total`: Number of filterable attributes, eg: `3`
  - [x] `has_geo`: Indicate if `_geo` is set as a filterable attribute, eg: `false`

---

- [x] Dump Created
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]

---

Ensure the user-id file is well saved and loaded with:
- [x] the dumps
- [x]  the snapshots



- [x] Ensure the CLI uuid only show if analytics are activate at launch **or already exists** (=even if meilisearch was launched without analytics)

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Irevoire <tamo@meilisearch.com>
v0.24.0rc0
2021-10-29 16:11:03 +00:00
519093ea65 fix bad rebase 2021-10-29 17:32:49 +02:00
bd49d1c4b5 fix one small bug 2021-10-29 17:25:56 +02:00
2665c0099d clippy + fmt 2021-10-29 17:25:56 +02:00
d65f055030 pass anaytics into Arc instead of static ref 2021-10-29 17:25:55 +02:00
66d87761b7 align the parameters in the launche resume 2021-10-29 17:25:55 +02:00
ba69ad672a fix the timing issue 2021-10-29 17:25:55 +02:00
7934e3956b replace all mutexes by channel 2021-10-29 17:25:55 +02:00
68fe93b7db add ranking_rules marker before sort_position 2021-10-29 17:25:55 +02:00
efd0ea9e1e makes clippy happier 2021-10-29 17:25:55 +02:00
6ef73eb226 fix all the single settings route and add the searchable attributes Updated event 2021-10-29 17:25:55 +02:00
fc2f23d36c move the start_since_days to teh root of the identify 2021-10-29 17:25:54 +02:00
7c39fab453 move the user-agent out of the context in every request 2021-10-29 17:25:54 +02:00
c5164c01c0 set the total of sortable attributes and filterable-attributes to 0 when not set 2021-10-29 17:25:54 +02:00
351ad32d77 fix the index_creation boolean 2021-10-29 17:25:54 +02:00
3ad8311bdd split the analytics in a module 2021-10-29 17:25:54 +02:00
ea5ae2bae5 sort the imports 2021-10-29 17:25:54 +02:00
72e3adc55e display an instance-id instead of a user-id 2021-10-29 17:25:54 +02:00
b250392e8d remove the first - in the path to the db instance in the instance-id 2021-10-29 17:25:53 +02:00
d8b0d68840 use a regex to count the number of filters instead of split + flatten 2021-10-29 17:25:53 +02:00
c4737749ab bump segment to be able to display a user 2021-10-29 17:25:53 +02:00
a1ab02f9fb remove some commented code 2021-10-29 17:25:53 +02:00
bba64b32ca async_traits is not needed anymore 2021-10-29 17:25:53 +02:00
9abd2aa9d7 make the analytics interval a const 2021-10-29 17:25:53 +02:00
de35a9a605 use an official release of segment 2021-10-29 17:25:53 +02:00
ed750e8792 fix start_since_day 2021-10-29 17:25:53 +02:00
37ca50832c fix the sort position 2021-10-29 17:25:52 +02:00
31c7a0105b fix a bug on the batch documents function 2021-10-29 17:25:52 +02:00
ddab9eafa1 fix a typo 2021-10-29 17:25:52 +02:00
76a4f86e0c rename user-id to instance-uid 2021-10-29 17:25:52 +02:00
6b34318274 makes clippy happy 2021-10-29 17:25:52 +02:00
5508c6c154 a bit of styling 2021-10-29 17:25:52 +02:00
9a62ac0c94 send the analytics only once every hours 2021-10-29 17:25:52 +02:00