From 6207dc9037601d5188c86876c2ff79d6ddbbe540 Mon Sep 17 00:00:00 2001 From: Achilleas Pipinellis Date: Sun, 25 Sep 2016 12:16:14 +0200 Subject: Move monitoring/ to new location --- doc/monitoring/health_check.md | 67 +------ doc/monitoring/img/health_check_token.png | Bin 6630 -> 0 bytes doc/monitoring/performance/gitlab_configuration.md | 41 +---- .../performance/grafana_configuration.md | 112 +----------- .../performance/influxdb_configuration.md | 194 +-------------------- doc/monitoring/performance/influxdb_schema.md | 98 +---------- doc/monitoring/performance/introduction.md | 66 +------ 7 files changed, 6 insertions(+), 572 deletions(-) delete mode 100644 doc/monitoring/img/health_check_token.png (limited to 'doc/monitoring') diff --git a/doc/monitoring/health_check.md b/doc/monitoring/health_check.md index eac57bc3de4..23ae1b17258 100644 --- a/doc/monitoring/health_check.md +++ b/doc/monitoring/health_check.md @@ -1,66 +1 @@ -# Health Check - -> [Introduced][ce-3888] in GitLab 8.8. - -GitLab provides a health check endpoint for uptime monitoring on the `health_check` web -endpoint. The health check reports on the overall system status based on the status of -the database connection, the state of the database migrations, and the ability to write -and access the cache. This endpoint can be provided to uptime monitoring services like -[Pingdom][pingdom], [Nagios][nagios-health], and [NewRelic][newrelic-health]. - -## Access Token - -An access token needs to be provided while accessing the health check endpoint. The current -accepted token can be found on the `admin/health_check` page of your GitLab instance. - -![access token](img/health_check_token.png) - -The access token can be passed as a URL parameter: - -``` -https://gitlab.example.com/health_check.json?token=ACCESS_TOKEN -``` - -or as an HTTP header: - -```bash -curl --header "TOKEN: ACCESS_TOKEN" https://gitlab.example.com/health_check.json -``` - -## Using the Endpoint - -Once you have the access token, health information can be retrieved as plain text, JSON, -or XML using the `health_check` endpoint: - -- `https://gitlab.example.com/health_check?token=ACCESS_TOKEN` -- `https://gitlab.example.com/health_check.json?token=ACCESS_TOKEN` -- `https://gitlab.example.com/health_check.xml?token=ACCESS_TOKEN` - -You can also ask for the status of specific services: - -- `https://gitlab.example.com/health_check/cache.json?token=ACCESS_TOKEN` -- `https://gitlab.example.com/health_check/database.json?token=ACCESS_TOKEN` -- `https://gitlab.example.com/health_check/migrations.json?token=ACCESS_TOKEN` - -For example, the JSON output of the following health check: - -```bash -curl --header "TOKEN: ACCESS_TOKEN" https://gitlab.example.com/health_check.json -``` - -would be like: - -``` -{"healthy":true,"message":"success"} -``` - -## Status - -On failure, the endpoint will return a `500` HTTP status code. On success, the endpoint -will return a valid successful HTTP status code, and a `success` message. Ideally your -uptime monitoring should look for the success message. - -[ce-3888]: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/3888 -[pingdom]: https://www.pingdom.com -[nagios-health]: https://nagios-plugins.org/doc/man/check_http.html -[newrelic-health]: https://docs.newrelic.com/docs/alerts/alert-policies/downtime-alerts/availability-monitoring +This document was moved to [administration/monitoring/health_check](../administration/monitoring/health_check.md). diff --git a/doc/monitoring/img/health_check_token.png b/doc/monitoring/img/health_check_token.png deleted file mode 100644 index 2d7c82a65a8..00000000000 Binary files a/doc/monitoring/img/health_check_token.png and /dev/null differ diff --git a/doc/monitoring/performance/gitlab_configuration.md b/doc/monitoring/performance/gitlab_configuration.md index 771584268d9..a669bb28904 100644 --- a/doc/monitoring/performance/gitlab_configuration.md +++ b/doc/monitoring/performance/gitlab_configuration.md @@ -1,40 +1 @@ -# GitLab Configuration - -GitLab Performance Monitoring is disabled by default. To enable it and change any of its -settings, navigate to the Admin area in **Settings > Metrics** -(`/admin/application_settings`). - -The minimum required settings you need to set are the InfluxDB host and port. -Make sure _Enable InfluxDB Metrics_ is checked and hit **Save** to save the -changes. - ---- - -![GitLab Performance Monitoring Admin Settings](img/metrics_gitlab_configuration_settings.png) - ---- - -Finally, a restart of all GitLab processes is required for the changes to take -effect: - -```bash -# For Omnibus installations -sudo gitlab-ctl restart - -# For installations from source -sudo service gitlab restart -``` - -## Pending Migrations - -When any migrations are pending, the metrics are disabled until the migrations -have been performed. - ---- - -Read more on: - -- [Introduction to GitLab Performance Monitoring](introduction.md) -- [InfluxDB Configuration](influxdb_configuration.md) -- [InfluxDB Schema](influxdb_schema.md) -- [Grafana Install/Configuration](grafana_configuration.md) +This document was moved to [administration/monitoring/performance/gitlab_configuration](../administration/monitoring/performance/gitlab_configuration.md). diff --git a/doc/monitoring/performance/grafana_configuration.md b/doc/monitoring/performance/grafana_configuration.md index 7947b0fedc4..93320b40174 100644 --- a/doc/monitoring/performance/grafana_configuration.md +++ b/doc/monitoring/performance/grafana_configuration.md @@ -1,111 +1 @@ -# Grafana Configuration - -[Grafana](http://grafana.org/) is a tool that allows you to visualize time -series metrics through graphs and dashboards. It supports several backend -data stores, including InfluxDB. GitLab writes performance data to InfluxDB -and Grafana will allow you to query InfluxDB to display useful graphs. - -For the easiest installation and configuration, install Grafana on the same -server as InfluxDB. For larger installations, you may want to split out these -services. - -## Installation - -Grafana supplies package repositories (Yum/Apt) for easy installation. -See [Grafana installation documentation](http://docs.grafana.org/installation/) -for detailed steps. - -> **Note**: Before starting Grafana for the first time, set the admin user -and password in `/etc/grafana/grafana.ini`. Otherwise, the default password -will be `admin`. - -## Configuration - -Login as the admin user. Expand the menu by clicking the Grafana logo in the -top left corner. Choose 'Data Sources' from the menu. Then, click 'Add new' -in the top bar. - -![Grafana empty data source page](img/grafana_data_source_empty.png) - -Fill in the configuration details for the InfluxDB data source. Save and -Test Connection to ensure the configuration is correct. - -- **Name**: InfluxDB -- **Default**: Checked -- **Type**: InfluxDB 0.9.x (Even if you're using InfluxDB 0.10.x) -- **Url**: https://localhost:8086 (Or the remote URL if you've installed InfluxDB -on a separate server) -- **Access**: proxy -- **Database**: gitlab -- **User**: admin (Or the username configured when setting up InfluxDB) -- **Password**: The password configured when you set up InfluxDB - -![Grafana data source configurations](img/grafana_data_source_configuration.png) - -## Apply retention policies and create continuous queries - -If you intend to import the GitLab provided Grafana dashboards, you will need to -set up the right retention policies and continuous queries. The easiest way of -doing this is by using the [influxdb-management](https://gitlab.com/gitlab-org/influxdb-management) -repository. - -To use this repository you must first clone it: - -``` -git clone https://gitlab.com/gitlab-org/influxdb-management.git -cd influxdb-management -``` - -Next you must install the required dependencies: - -``` -gem install bundler -bundle install -``` - -Now you must configure the repository by first copying `.env.example` to `.env` -and then editing the `.env` file to contain the correct InfluxDB settings. Once -configured you can simply run `bundle exec rake` and the InfluxDB database will -be configured for you. - -For more information see the [influxdb-management README](https://gitlab.com/gitlab-org/influxdb-management/blob/master/README.md). - -## Import Dashboards - -You can now import a set of default dashboards that will give you a good -start on displaying useful information. GitLab has published a set of default -[Grafana dashboards][grafana-dashboards] to get you started. Clone the -repository or download a zip/tarball, then follow these steps to import each -JSON file. - -Open the dashboard dropdown menu and click 'Import' - -![Grafana dashboard dropdown](img/grafana_dashboard_dropdown.png) - -Click 'Choose file' and browse to the location where you downloaded or cloned -the dashboard repository. Pick one of the JSON files to import. - -![Grafana dashboard import](img/grafana_dashboard_import.png) - -Once the dashboard is imported, be sure to click save icon in the top bar. If -you do not save the dashboard after importing it will be removed when you -navigate away. - -![Grafana save icon](img/grafana_save_icon.png) - -Repeat this process for each dashboard you wish to import. - -Alternatively you can automatically import all the dashboards into your Grafana -instance. See the README of the [Grafana dashboards][grafana-dashboards] -repository for more information on this process. - -[grafana-dashboards]: https://gitlab.com/gitlab-org/grafana-dashboards - ---- - -Read more on: - -- [Introduction to GitLab Performance Monitoring](introduction.md) -- [GitLab Configuration](gitlab_configuration.md) -- [InfluxDB Installation/Configuration](influxdb_configuration.md) -- [InfluxDB Schema](influxdb_schema.md) +This document was moved to [administration/monitoring/performance/grafana_configuration](../administration/monitoring/performance/grafana_configuration.md). diff --git a/doc/monitoring/performance/influxdb_configuration.md b/doc/monitoring/performance/influxdb_configuration.md index c30cd2950d8..02647de1eb0 100644 --- a/doc/monitoring/performance/influxdb_configuration.md +++ b/doc/monitoring/performance/influxdb_configuration.md @@ -1,193 +1 @@ -# InfluxDB Configuration - -The default settings provided by [InfluxDB] are not sufficient for a high traffic -GitLab environment. The settings discussed in this document are based on the -settings GitLab uses for GitLab.com, depending on your own needs you may need to -further adjust them. - -If you are intending to run InfluxDB on the same server as GitLab, make sure -you have plenty of RAM since InfluxDB can use quite a bit depending on traffic. - -Unless you are going with a budget setup, it's advised to run it separately. - -## Requirements - -- InfluxDB 0.9.5 or newer -- A fairly modern version of Linux -- At least 4GB of RAM -- At least 10GB of storage for InfluxDB data - -Note that the RAM and storage requirements can differ greatly depending on the -amount of data received/stored. To limit the amount of stored data users can -look into [InfluxDB Retention Policies][influxdb-retention]. - -## Installation - -Installing InfluxDB is out of the scope of this document. Please refer to the -[InfluxDB documentation]. - -## InfluxDB Server Settings - -Since InfluxDB has many settings that users may wish to customize themselves -(e.g. what port to run InfluxDB on), we'll only cover the essentials. - -The configuration file in question is usually located at -`/etc/influxdb/influxdb.conf`. Whenever you make a change in this file, -InfluxDB needs to be restarted. - -### Storage Engine - -InfluxDB comes with different storage engines and as of InfluxDB 0.9.5 a new -storage engine is available, called [TSM Tree]. All users **must** use the new -`tsm1` storage engine as this [will be the default engine][tsm1-commit] in -upcoming InfluxDB releases. - -Make sure you have the following in your configuration file: - -``` -[data] - dir = "/var/lib/influxdb/data" - engine = "tsm1" -``` - -### Admin Panel - -Production environments should have the InfluxDB admin panel **disabled**. This -feature can be disabled by adding the following to your InfluxDB configuration -file: - -``` -[admin] - enabled = false -``` - -### HTTP - -HTTP is required when using the [InfluxDB CLI] or other tools such as Grafana, -thus it should be enabled. When enabling make sure to _also_ enable -authentication: - -``` -[http] - enabled = true - auth-enabled = true -``` - -_**Note:** Before you enable authentication, you might want to [create an -admin user](#create-a-new-admin-user)._ - -### UDP - -GitLab writes data to InfluxDB via UDP and thus this must be enabled. Enabling -UDP can be done using the following settings: - -``` -[[udp]] - enabled = true - bind-address = ":8089" - database = "gitlab" - batch-size = 1000 - batch-pending = 5 - batch-timeout = "1s" - read-buffer = 209715200 -``` - -This does the following: - -1. Enable UDP and bind it to port 8089 for all addresses. -2. Store any data received in the "gitlab" database. -3. Define a batch of points to be 1000 points in size and allow a maximum of - 5 batches _or_ flush them automatically after 1 second. -4. Define a UDP read buffer size of 200 MB. - -One of the most important settings here is the UDP read buffer size as if this -value is set too low, packets will be dropped. You must also make sure the OS -buffer size is set to the same value, the default value is almost never enough. - -To set the OS buffer size to 200 MB, on Linux you can run the following command: - -```bash -sysctl -w net.core.rmem_max=209715200 -``` - -To make this permanent, add the following to `/etc/sysctl.conf` and restart the -server: - -```bash -net.core.rmem_max=209715200 -``` - -It is **very important** to make sure the buffer sizes are large enough to -handle all data sent to InfluxDB as otherwise you _will_ lose data. The above -buffer sizes are based on the traffic for GitLab.com. Depending on the amount of -traffic, users may be able to use a smaller buffer size, but we highly recommend -using _at least_ 100 MB. - -When enabling UDP, users should take care to not expose the port to the public, -as doing so will allow anybody to write data into your InfluxDB database (as -[InfluxDB's UDP protocol][udp] doesn't support authentication). We recommend either -whitelisting the allowed IP addresses/ranges, or setting up a VLAN and only -allowing traffic from members of said VLAN. - -## Create a new admin user - -If you want to [enable authentication](#http), you might want to [create an -admin user][influx-admin]: - -``` -influx -execute "CREATE USER jeff WITH PASSWORD '1234' WITH ALL PRIVILEGES" -``` - -## Create the `gitlab` database - -Once you get InfluxDB up and running, you need to create a database for GitLab. -Make sure you have changed the [storage engine](#storage-engine) to `tsm1` -before creating a database. - -_**Note:** If you [created an admin user](#create-a-new-admin-user) and enabled -[HTTP authentication](#http), remember to append the username (`-username `) -and password (`-password `) you set earlier to the commands below._ - -Run the following command to create a database named `gitlab`: - -```bash -influx -execute 'CREATE DATABASE gitlab' -``` - -The name **must** be `gitlab`, do not use any other name. - -Next, make sure that the database was successfully created: - -```bash -influx -execute 'SHOW DATABASES' -``` - -The output should be similar to: - -``` -name: databases ---------------- -name -_internal -gitlab -``` - -That's it! Now your GitLab instance should send data to InfluxDB. - ---- - -Read more on: - -- [Introduction to GitLab Performance Monitoring](introduction.md) -- [GitLab Configuration](gitlab_configuration.md) -- [InfluxDB Schema](influxdb_schema.md) -- [Grafana Install/Configuration](grafana_configuration.md) - -[influxdb-retention]: https://docs.influxdata.com/influxdb/v0.9/query_language/database_management/#retention-policy-management -[influxdb documentation]: https://docs.influxdata.com/influxdb/v0.9/ -[influxdb cli]: https://docs.influxdata.com/influxdb/v0.9/tools/shell/ -[udp]: https://docs.influxdata.com/influxdb/v0.9/write_protocols/udp/ -[influxdb]: https://influxdata.com/time-series-platform/influxdb/ -[tsm tree]: https://influxdata.com/blog/new-storage-engine-time-structured-merge-tree/ -[tsm1-commit]: https://github.com/influxdata/influxdb/commit/15d723dc77651bac83e09e2b1c94be480966cb0d -[influx-admin]: https://docs.influxdata.com/influxdb/v0.9/administration/authentication_and_authorization/#create-a-new-admin-user +This document was moved to [administration/monitoring/performance/influxdb_configuration](../administration/monitoring/performance/influxdb_configuration.md). diff --git a/doc/monitoring/performance/influxdb_schema.md b/doc/monitoring/performance/influxdb_schema.md index eff0e29f58d..a989e323e04 100644 --- a/doc/monitoring/performance/influxdb_schema.md +++ b/doc/monitoring/performance/influxdb_schema.md @@ -1,97 +1 @@ -# InfluxDB Schema - -The following measurements are currently stored in InfluxDB: - -- `PROCESS_file_descriptors` -- `PROCESS_gc_statistics` -- `PROCESS_memory_usage` -- `PROCESS_method_calls` -- `PROCESS_object_counts` -- `PROCESS_transactions` -- `PROCESS_views` -- `events` - -Here, `PROCESS` is replaced with either `rails` or `sidekiq` depending on the -process type. In all series, any form of duration is stored in milliseconds. - -## PROCESS_file_descriptors - -This measurement contains the number of open file descriptors over time. The -value field `value` contains the number of descriptors. - -## PROCESS_gc_statistics - -This measurement contains Ruby garbage collection statistics such as the amount -of minor/major GC runs (relative to the last sampling interval), the time spent -in garbage collection cycles, and all fields/values returned by `GC.stat`. - -## PROCESS_memory_usage - -This measurement contains the process' memory usage (in bytes) over time. The -value field `value` contains the number of bytes. - -## PROCESS_method_calls - -This measurement contains the methods called during a transaction along with -their duration, and a name of the transaction action that invoked the method (if -available). The method call duration is stored in the value field `duration`, -while the method name is stored in the tag `method`. The tag `action` contains -the full name of the transaction action. Both the `method` and `action` fields -are in the following format: - -``` -ClassName#method_name -``` - -For example, a method called by the `show` method in the `UsersController` class -would have `action` set to `UsersController#show`. - -## PROCESS_object_counts - -This measurement is used to store retained Ruby objects (per class) and the -amount of retained objects. The number of objects is stored in the `count` value -field while the class name is stored in the `type` tag. - -## PROCESS_transactions - -This measurement is used to store basic transaction details such as the time it -took to complete a transaction, how much time was spent in SQL queries, etc. The -following value fields are available: - -| Value | Description | -| ----- | ----------- | -| `duration` | The total duration of the transaction | -| `allocated_memory` | The amount of bytes allocated while the transaction was running. This value is only reliable when using single-threaded application servers | -| `method_duration` | The total time spent in method calls | -| `sql_duration` | The total time spent in SQL queries | -| `view_duration` | The total time spent in views | - -## PROCESS_views - -This measurement is used to store view rendering timings for a transaction. The -following value fields are available: - -| Value | Description | -| ----- | ----------- | -| `duration` | The rendering time of the view | -| `view` | The path of the view, relative to the application's root directory | - -The `action` tag contains the action name of the transaction that rendered the -view. - -## events - -This measurement is used to store generic events such as the number of Git -pushes, Emails sent, etc. Each point in this measurement has a single value -field called `count`. The value of this field is simply set to `1`. Each point -also has at least one tag: `event`. This tag's value is set to the event name. -Depending on the event type additional tags may be available as well. - ---- - -Read more on: - -- [Introduction to GitLab Performance Monitoring](introduction.md) -- [GitLab Configuration](gitlab_configuration.md) -- [InfluxDB Configuration](influxdb_configuration.md) -- [Grafana Install/Configuration](grafana_configuration.md) +This document was moved to [administration/monitoring/performance/influxdb_schema](../administration/monitoring/performance/influxdb_schema.md). diff --git a/doc/monitoring/performance/introduction.md b/doc/monitoring/performance/introduction.md index 79904916b7e..ab3f3ac1664 100644 --- a/doc/monitoring/performance/introduction.md +++ b/doc/monitoring/performance/introduction.md @@ -1,65 +1 @@ -# GitLab Performance Monitoring - -GitLab comes with its own application performance measuring system as of GitLab -8.4, simply called "GitLab Performance Monitoring". GitLab Performance Monitoring is available in both the -Community and Enterprise editions. - -Apart from this introduction, you are advised to read through the following -documents in order to understand and properly configure GitLab Performance Monitoring: - -- [GitLab Configuration](gitlab_configuration.md) -- [InfluxDB Install/Configuration](influxdb_configuration.md) -- [InfluxDB Schema](influxdb_schema.md) -- [Grafana Install/Configuration](grafana_configuration.md) - -## Introduction to GitLab Performance Monitoring - -GitLab Performance Monitoring makes it possible to measure a wide variety of statistics -including (but not limited to): - -- The time it took to complete a transaction (a web request or Sidekiq job). -- The time spent in running SQL queries and rendering HAML views. -- The time spent executing (instrumented) Ruby methods. -- Ruby object allocations, and retained objects in particular. -- System statistics such as the process' memory usage and open file descriptors. -- Ruby garbage collection statistics. - -Metrics data is written to [InfluxDB][influxdb] over [UDP][influxdb-udp]. Stored -data can be visualized using [Grafana][grafana] or any other application that -supports reading data from InfluxDB. Alternatively data can be queried using the -InfluxDB CLI. - -## Metric Types - -Two types of metrics are collected: - -1. Transaction specific metrics. -1. Sampled metrics, collected at a certain interval in a separate thread. - -### Transaction Metrics - -Transaction metrics are metrics that can be associated with a single -transaction. This includes statistics such as the transaction duration, timings -of any executed SQL queries, time spent rendering HAML views, etc. These metrics -are collected for every Rack request and Sidekiq job processed. - -### Sampled Metrics - -Sampled metrics are metrics that can't be associated with a single transaction. -Examples include garbage collection statistics and retained Ruby objects. These -metrics are collected at a regular interval. This interval is made up out of two -parts: - -1. A user defined interval. -1. A randomly generated offset added on top of the interval, the same offset - can't be used twice in a row. - -The actual interval can be anywhere between a half of the defined interval and a -half above the interval. For example, for a user defined interval of 15 seconds -the actual interval can be anywhere between 7.5 and 22.5. The interval is -re-generated for every sampling run instead of being generated once and re-used -for the duration of the process' lifetime. - -[influxdb]: https://influxdata.com/time-series-platform/influxdb/ -[influxdb-udp]: https://docs.influxdata.com/influxdb/v0.9/write_protocols/udp/ -[grafana]: http://grafana.org/ +This document was moved to [administration/monitoring/performance/introduction](../administration/monitoring/performance/introduction.md). -- cgit v1.2.3