Diffstat (limited to 'doc/administration/redis')
-rw-r--r-- | doc/administration/redis/index.md | 42
-rw-r--r-- | doc/administration/redis/replication_and_failover.md | 740
-rw-r--r-- | doc/administration/redis/replication_and_failover_external.md | 376
-rw-r--r-- | doc/administration/redis/standalone.md | 63
-rw-r--r-- | doc/administration/redis/troubleshooting.md | 158
5 files changed, 1379 insertions, 0 deletions
diff --git a/doc/administration/redis/index.md b/doc/administration/redis/index.md new file mode 100644 index 00000000000..0bd56666ab8 --- /dev/null +++ b/doc/administration/redis/index.md @@ -0,0 +1,42 @@ +--- +type: index +stage: Enablement +group: Distribution +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers +--- + +# Configuring Redis for scaling + +Based on your infrastructure setup and how you have installed GitLab, there are +multiple ways to configure Redis. + +You can choose to install and manage Redis and Sentinel yourself, use a hosted +cloud solution, or you can use the ones that come bundled with the Omnibus GitLab +packages so you only need to focus on configuration. Pick the one that suits your needs. + +## Redis replication and failover using Omnibus GitLab + +This setup is for when you have installed GitLab using the +[Omnibus GitLab **Enterprise Edition** (EE) package](https://about.gitlab.com/install/?version=ee). + +Both Redis and Sentinel are bundled in the package, so you can use it to set up the whole +Redis infrastructure (primary, replica and Sentinel). + +[> Read how to set up Redis replication and failover using Omnibus GitLab](replication_and_failover.md) + +## Redis replication and failover using the non-bundled Redis + +This setup is for when you have installed GitLab using the +[Omnibus GitLab packages](https://about.gitlab.com/install/) (CE or EE), +or installed it [from source](../../install/installation.md), but you want to use +your own external Redis and Sentinel servers. 
+ +[> Read how to set up Redis replication and failover using the non-bundled Redis](replication_and_failover_external.md) + +## Standalone Redis using Omnibus GitLab + +This setup is for when you have installed the +[Omnibus GitLab **Community Edition** (CE) package](https://about.gitlab.com/install/?version=ce) +to use the bundled Redis, so you can use the package with only the Redis service enabled. + +[> Read how to set up a standalone Redis instance using Omnibus GitLab](standalone.md) diff --git a/doc/administration/redis/replication_and_failover.md b/doc/administration/redis/replication_and_failover.md new file mode 100644 index 00000000000..d95320b6669 --- /dev/null +++ b/doc/administration/redis/replication_and_failover.md @@ -0,0 +1,740 @@ +--- +type: howto +stage: Enablement +group: Distribution +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers +--- + +# Redis replication and failover with Omnibus GitLab **(PREMIUM ONLY)** + +NOTE: **Note:** +This is the documentation for the Omnibus GitLab packages. For using your own +non-bundled Redis, follow the [relevant documentation](replication_and_failover_external.md). + +NOTE: **Note:** +In Redis lingo, primary is called master. In this document, primary is used +instead of master, except in the settings where `master` is required. + +Using [Redis](https://redis.io/) in a scalable environment is possible using a **Primary** x **Replica** +topology with a [Redis Sentinel](https://redis.io/topics/sentinel) service to watch and automatically +start the failover procedure. + +Redis requires authentication if used with Sentinel. See the +[Redis Security](https://redis.io/topics/security) documentation for more +information. We recommend using a combination of a Redis password and tight +firewall rules to secure your Redis service. 
+You are highly encouraged to read the [Redis Sentinel](https://redis.io/topics/sentinel) documentation +before configuring Redis with GitLab to fully understand the topology and +architecture. + +Before diving into the details of setting up Redis and Redis Sentinel for a +replicated topology, make sure you read this document once as a whole to better +understand how the components are tied together. + +You need at least `3` independent machines: physical, or VMs running on +distinct physical machines. It is essential that all primary and replica Redis +instances run on different machines. If you fail to provision the machines in +that specific way, any issue with the shared environment can bring your entire +setup down. + +It is OK to run a Sentinel alongside a primary or replica Redis instance. +There should be no more than one Sentinel on the same machine though. + +You also need to take into consideration the underlying network topology, +making sure you have redundant connectivity between Redis / Sentinel and +GitLab instances, otherwise the networks will become a single point of +failure. + +Running Redis in a scaled environment requires a few things: + +- Multiple Redis instances +- Redis running in a **Primary** x **Replica** topology +- Multiple Sentinel instances +- Application support and visibility to all Sentinel and Redis instances + +Redis Sentinel handles the most important task in an HA environment: +helping keep servers online with minimal to no downtime. 
Redis Sentinel: + +- Monitors **Primary** and **Replica** instances to see if they are available +- Promotes a **Replica** to **Primary** when the **Primary** fails +- Demotes a **Primary** to **Replica** when the failed **Primary** comes back online + (to prevent data-partitioning) +- Can be queried by the application to always connect to the current **Primary** + server + +When a **Primary** fails to respond, it's the application's responsibility +(in our case GitLab) to handle timeout and reconnect (querying a **Sentinel** +for a new **Primary**). + +To get a better understanding of how to correctly set up Sentinel, please read +the [Redis Sentinel documentation](https://redis.io/topics/sentinel) first, as +failing to configure it correctly can lead to data loss or can bring your +whole cluster down, invalidating the failover effort. + +## Recommended setup + +For a minimal setup, you will install the Omnibus GitLab package on `3` +**independent** machines, each with **Redis** and **Sentinel**: + +- Redis Primary + Sentinel +- Redis Replica + Sentinel +- Redis Replica + Sentinel + +If you are not sure or don't understand why and where the number of nodes comes +from, read [Redis setup overview](#redis-setup-overview) and +[Sentinel setup overview](#sentinel-setup-overview). + +For a recommended setup that can resist more failures, you will install +the Omnibus GitLab package on `5` **independent** machines, each with +**Redis** and **Sentinel**: + +- Redis Primary + Sentinel +- Redis Replica + Sentinel +- Redis Replica + Sentinel +- Redis Replica + Sentinel +- Redis Replica + Sentinel + +### Redis setup overview + +You must have at least `3` Redis servers: `1` primary, `2` Replicas, and they +need to each be on independent machines (see explanation above). + +You can have additional Redis nodes, which will help survive a situation +where more nodes go down. Whenever there are only `2` nodes online, a failover +will not be initiated. 
+ +As an example, if you have `6` Redis nodes, a maximum of `3` can be +simultaneously down. + +Please note that there are different requirements for Sentinel nodes. +If you host them on the same machines as Redis, you may need to take +those restrictions into consideration when calculating the number of +nodes to be provisioned. See the [Sentinel setup overview](#sentinel-setup-overview) +documentation for more information. + +All Redis nodes should be configured the same way and with similar server specs, as +in a failover situation, any **Replica** can be promoted as the new **Primary** by +the Sentinel servers. + +The replication requires authentication, so you need to define a password to +protect all Redis nodes and the Sentinels. They will all share the same +password, and all instances must be able to talk to +each other over the network. + +### Sentinel setup overview + +Sentinels watch both other Sentinels and Redis nodes. Whenever a Sentinel +detects that a Redis node is not responding, it will announce that to the +other Sentinels. They have to reach the **quorum**, that is, the minimum number +of Sentinels that agree a node is down, in order to be able to start a failover. + +Whenever the **quorum** is met, the **majority** of all known Sentinel nodes +need to be available and reachable, so that they can elect the Sentinel **leader** +who will take all the decisions to restore the service availability by: + +- Promoting a new **Primary** +- Reconfiguring the other **Replicas** and making them point to the new **Primary** +- Announcing the new **Primary** to every other Sentinel peer +- Reconfiguring the old **Primary** and demoting it to **Replica** when it comes back online + +You must have at least `3` Redis Sentinel servers, and they need to +each be on an independent machine (machines that are believed to fail independently), +ideally in different geographical areas. 
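The quorum and majority rules above can be sketched in plain Ruby. This is only an illustration under simplifying assumptions: the helper names `majority`, `tolerated_failures`, and `failover_possible?` are hypothetical and are not part of Redis, Sentinel, or Omnibus GitLab, and the check treats every reachable Sentinel as agreeing that the primary is down.

```ruby
# Hypothetical helpers illustrating the Sentinel rules described above.
# They are not part of Redis, Sentinel, or Omnibus GitLab.

# Smallest number of Sentinels that forms a majority of all known Sentinels.
def majority(total_sentinels)
  total_sentinels / 2 + 1
end

# Largest number of Sentinels that can be down while a majority is still reachable.
def tolerated_failures(total_sentinels)
  total_sentinels - majority(total_sentinels)
end

# Simplified check: a failover can start only when at least `quorum` Sentinels
# are reachable (assumed here to all agree the primary is down) AND a majority
# of all known Sentinels is reachable to elect a leader.
def failover_possible?(total_sentinels:, reachable_sentinels:, quorum:)
  reachable_sentinels >= quorum && reachable_sentinels >= majority(total_sentinels)
end

[3, 5, 6, 7].each do |n|
  puts "#{n} Sentinels: majority #{majority(n)}, can lose #{tolerated_failures(n)}"
end
```

Under these assumptions, an even number of Sentinels tolerates no more failures than the odd number just below it, which is one way to read the recommendation to deploy an odd number.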
+ +You can configure them on the same machines where you've configured the other +Redis servers, but understand that if a whole node goes down, you lose both +a Sentinel and a Redis instance. + +The number of sentinels should ideally always be an **odd** number, for the +consensus algorithm to be effective in the case of a failure. + +In a `3`-node topology, you can only afford `1` Sentinel node going down. +Whenever the **majority** of the Sentinels go down, the network partition +protection prevents destructive actions and a failover **will not be started**. + +Here are some examples: + +- With `5` or `6` sentinels, a maximum of `2` can go down for a failover to begin. +- With `7` sentinels, a maximum of `3` nodes can go down. + +The **Leader** election can sometimes fail the voting round when **consensus** +is not achieved (see the odd number of nodes requirement above). In that case, +a new attempt will be made after the amount of time defined in +`sentinel['failover_timeout']` (in milliseconds). + +NOTE: **Note:** +We will see where `sentinel['failover_timeout']` is defined later. + +The `failover_timeout` variable has a lot of different use cases. According to +the official documentation: + +- The time needed to re-start a failover after a previous failover was + already tried against the same primary by a given Sentinel, is two + times the failover timeout. + +- The time needed for a replica replicating to a wrong primary according + to a Sentinel current configuration, to be forced to replicate + with the right primary, is exactly the failover timeout (counting since + the moment a Sentinel detected the misconfiguration). + +- The time needed to cancel a failover that is already in progress but + did not produce any configuration change (REPLICAOF NO ONE yet not + acknowledged by the promoted replica). + +- The maximum time a failover in progress waits for all the replicas to be + reconfigured as replicas of the new primary. 
However, even after this time + the replicas will be reconfigured by the Sentinels anyway, but not with + the exact parallel-syncs progression as specified. + +## Configuring Redis + +This is the section where we install and set up the new Redis instances. + +It is assumed that you have installed GitLab and all its components from scratch. +If you already have Redis installed and running, read how to +[switch from a single-machine installation](#switching-from-an-existing-single-machine-installation). + +NOTE: **Note:** +Redis nodes (both primary and replica) will need the same password defined in +`redis['password']`. At any time during a failover the Sentinels can +reconfigure a node and change its status from primary to replica and vice versa. + +### Requirements + +The requirements for a Redis setup are the following: + +1. Provision the minimum required number of instances as specified in the + [recommended setup](#recommended-setup) section. +1. We **do not** recommend installing Redis or Redis Sentinel on the same machines your + GitLab application is running on as this weakens your HA configuration. You can, however, opt to install Redis + and Sentinel on the same machine. +1. All Redis nodes must be able to talk to each other and accept incoming + connections over Redis (`6379`) and Sentinel (`26379`) ports (unless you + change the default ones). +1. The server that hosts the GitLab application must be able to access the + Redis nodes. +1. Protect the nodes from access from external networks ([Internet](https://gitlab.com/gitlab-org/gitlab-foss/uploads/c4cc8cd353604bd80315f9384035ff9e/The_Internet_IT_Crowd.png)), using + a firewall. + +### Switching from an existing single-machine installation + +If you already have a single-machine GitLab install running, you will need to +replicate from this machine first, before de-activating the Redis instance +inside it. 
+ +Your single-machine install will be the initial **Primary**, and the `3` others +should be configured as **Replica** pointing to this machine. + +After replication catches up, you will need to stop services in the +single-machine install, to rotate the **Primary** to one of the new nodes. + +Make the required changes in configuration and restart the new nodes again. + +To disable Redis in the single install, edit `/etc/gitlab/gitlab.rb`: + +```ruby +redis['enable'] = false +``` + +If you fail to replicate first, you may lose data (unprocessed background jobs). + +### Step 1. Configuring the primary Redis instance + +1. SSH into the **Primary** Redis server. +1. [Download/install](https://about.gitlab.com/install/) the Omnibus GitLab + package you want using **steps 1 and 2** from the GitLab downloads page. + - Make sure you select the correct Omnibus package, with the same version + and type (Community, Enterprise editions) of your current install. + - Do not complete any other steps on the download page. + +1. Edit `/etc/gitlab/gitlab.rb` and add the contents: + + ```ruby + # Specify server role as 'redis_master_role' + roles ['redis_master_role'] + + # IP address pointing to a local IP that the other machines can reach. + # You can also set bind to '0.0.0.0' which listens on all interfaces. + # If you really need to bind to an externally accessible IP, make + # sure you add extra firewall rules to prevent unauthorized access. + redis['bind'] = '10.0.0.1' + + # Define a port so Redis can listen for TCP requests which will allow other + # machines to connect to it. + redis['port'] = 6379 + + # Set up password authentication for Redis (use the same password in all nodes). + redis['password'] = 'redis-password-goes-here' + ``` + +1. Only the primary GitLab application server should handle migrations. 
To + prevent database migrations from running on upgrade, add the following + configuration to your `/etc/gitlab/gitlab.rb` file: + + ```ruby + gitlab_rails['auto_migrate'] = false + ``` + +1. [Reconfigure Omnibus GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. + +NOTE: **Note:** +You can specify multiple roles like sentinel and Redis as: +`roles ['redis_sentinel_role', 'redis_master_role']`. +Read more about [roles](https://docs.gitlab.com/omnibus/roles/). + +### Step 2. Configuring the replica Redis instances + +1. SSH into the **replica** Redis server. +1. [Download/install](https://about.gitlab.com/install/) the Omnibus GitLab + package you want using **steps 1 and 2** from the GitLab downloads page. + - Make sure you select the correct Omnibus package, with the same version + and type (Community, Enterprise editions) of your current install. + - Do not complete any other steps on the download page. + +1. Edit `/etc/gitlab/gitlab.rb` and add the contents: + + ```ruby + # Specify server role as 'redis_replica_role' + roles ['redis_replica_role'] + + # IP address pointing to a local IP that the other machines can reach. + # You can also set bind to '0.0.0.0' which listens on all interfaces. + # If you really need to bind to an externally accessible IP, make + # sure you add extra firewall rules to prevent unauthorized access. + redis['bind'] = '10.0.0.2' + + # Define a port so Redis can listen for TCP requests which will allow other + # machines to connect to it. + redis['port'] = 6379 + + # The same password for Redis authentication you set up for the primary node. + redis['password'] = 'redis-password-goes-here' + + # The IP of the primary Redis node. + redis['master_ip'] = '10.0.0.1' + + # Port of primary Redis server, uncomment to change to non default. Defaults + # to `6379`. + #redis['master_port'] = 6379 + ``` + +1. 
To prevent reconfigure from running automatically on upgrade, run: + + ```shell + sudo touch /etc/gitlab/skip-auto-reconfigure + ``` + +1. [Reconfigure Omnibus GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. +1. Go through the steps again for all the other replica nodes. + +NOTE: **Note:** +You can specify multiple roles like sentinel and Redis as: +`roles ['redis_sentinel_role', 'redis_master_role']`. +Read more about [roles](https://docs.gitlab.com/omnibus/roles/). + +These values don't have to be changed again in `/etc/gitlab/gitlab.rb` after +a failover, as the nodes will be managed by the Sentinels, and even after a +`gitlab-ctl reconfigure`, they will get their configuration restored by +the same Sentinels. + +### Step 3. Configuring the Redis Sentinel instances + +NOTE: **Note:** If you are using an external Redis Sentinel instance, be sure +to exclude the `requirepass` parameter from the Sentinel +configuration. This parameter will cause clients to report `NOAUTH +Authentication required.`. [Redis Sentinel 3.2.x does not support +password authentication](https://github.com/antirez/redis/issues/3279). + +Now that the Redis servers are all set up, let's configure the Sentinel +servers. + +If you are not sure if your Redis servers are working and replicating +correctly, please read the [Troubleshooting Replication](troubleshooting.md#troubleshooting-redis-replication) +and fix it before proceeding with Sentinel setup. + +You must have at least `3` Redis Sentinel servers, and they need to +be each in an independent machine. You can configure them in the same +machines where you've configured the other Redis servers. + +With GitLab Enterprise Edition, you can use the Omnibus package to set up +multiple machines with the Sentinel daemon. + +--- + +1. SSH into the server that will host Redis Sentinel. +1. 
**You can omit this step if the Sentinels will be hosted in the same node as + the other Redis instances.** + + [Download/install](https://about.gitlab.com/install/) the + Omnibus GitLab Enterprise Edition package using **steps 1 and 2** from the + GitLab downloads page. + - Make sure you select the correct Omnibus package, with the same version + the GitLab application is running. + - Do not complete any other steps on the download page. + +1. Edit `/etc/gitlab/gitlab.rb` and add the contents (if you are installing the + Sentinels in the same node as the other Redis instances, some values might + be duplicated below): + + ```ruby + roles ['redis_sentinel_role'] + + # Must be the same in every sentinel node + redis['master_name'] = 'gitlab-redis' + + # The same password for Redis authentication you set up for the primary node. + redis['master_password'] = 'redis-password-goes-here' + + # The IP of the primary Redis node. + redis['master_ip'] = '10.0.0.1' + + # Define a port so Redis can listen for TCP requests which will allow other + # machines to connect to it. + redis['port'] = 6379 + + # Port of primary Redis server, uncomment to change to non default. Defaults + # to `6379`. + #redis['master_port'] = 6379 + + ## Configure Sentinel + sentinel['bind'] = '10.0.0.1' + + # Port that Sentinel listens on, uncomment to change to non default. Defaults + # to `26379`. + # sentinel['port'] = 26379 + + ## Quorum must reflect the number of voting sentinels it takes to start a failover. + ## Value must NOT be greater than the number of sentinels. + ## + ## The quorum can be used to tune Sentinel in two ways: + ## 1. If the quorum is set to a value smaller than the majority of Sentinels + ## we deploy, we are basically making Sentinel more sensitive to primary failures, + ## triggering a failover as soon as even just a minority of Sentinels is no longer + ## able to talk with the primary. + ## 1. 
If a quorum is set to a value greater than the majority of Sentinels, we are + ## making Sentinel able to failover only when there are a very large number (larger + ## than majority) of well connected Sentinels which agree about the primary being down. + sentinel['quorum'] = 2 + + ## Consider unresponsive server down after x amount of ms. + # sentinel['down_after_milliseconds'] = 10000 + + ## Specifies the failover timeout in milliseconds. It is used in many ways: + ## + ## - The time needed to re-start a failover after a previous failover was + ## already tried against the same primary by a given Sentinel, is two + ## times the failover timeout. + ## + ## - The time needed for a replica replicating to a wrong primary according + ## to a Sentinel current configuration, to be forced to replicate + ## with the right primary, is exactly the failover timeout (counting since + ## the moment a Sentinel detected the misconfiguration). + ## + ## - The time needed to cancel a failover that is already in progress but + ## did not produce any configuration change (REPLICAOF NO ONE yet not + ## acknowledged by the promoted replica). + ## + ## - The maximum time a failover in progress waits for all the replicas to be + ## reconfigured as replicas of the new primary. However even after this time + ## the replicas will be reconfigured by the Sentinels anyway, but not with + ## the exact parallel-syncs progression as specified. + # sentinel['failover_timeout'] = 60000 + ``` + +1. To prevent database migrations from running on upgrade, run: + + ```shell + sudo touch /etc/gitlab/skip-auto-reconfigure + ``` + + Only the primary GitLab application server should handle migrations. + +1. [Reconfigure Omnibus GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. +1. Go through the steps again for all the other Sentinel nodes. + +### Step 4. 
Configuring the GitLab application + +The final part is to inform the main GitLab application server of the Redis +Sentinel servers and authentication credentials. + +You can enable or disable Sentinel support at any time in new or existing +installations. From the GitLab application perspective, all it requires is +the correct credentials for the Sentinel nodes. + +While it doesn't require a list of all Sentinel nodes, in case of a failure, +it needs to access at least one of those listed. + +NOTE: **Note:** +The following steps should be performed on the [GitLab application server](../high_availability/gitlab.md) +which ideally should not have Redis or Sentinels on it for an HA setup. + +1. SSH into the server where the GitLab application is installed. +1. Edit `/etc/gitlab/gitlab.rb` and add/change the following lines: + + ```ruby + ## Must be the same in every sentinel node + redis['master_name'] = 'gitlab-redis' + + ## The same password for Redis authentication you set up for the primary node. + redis['master_password'] = 'redis-password-goes-here' + + ## A list of sentinels with `host` and `port` + gitlab_rails['redis_sentinels'] = [ + {'host' => '10.0.0.1', 'port' => 26379}, + {'host' => '10.0.0.2', 'port' => 26379}, + {'host' => '10.0.0.3', 'port' => 26379} + ] + ``` + +1. [Reconfigure Omnibus GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. + +### Step 5. Enable Monitoring + +> [Introduced](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/3786) in GitLab 12.0. + +If you enable Monitoring, it must be enabled on **all** Redis servers. + +1. Make sure to collect [`CONSUL_SERVER_NODES`](../postgresql/replication_and_failover.md#consul-information), which are the IP addresses or DNS records of the Consul server nodes, for the next step. Note they are presented as `Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z` + +1. 
Create/edit `/etc/gitlab/gitlab.rb` and add the following configuration: + + ```ruby + # Enable service discovery for Prometheus + consul['enable'] = true + consul['monitoring_service_discovery'] = true + + # Replace placeholders + # Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z + # with the addresses of the Consul server nodes + consul['configuration'] = { + retry_join: %w(Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z), + } + + # Set the network addresses that the exporters will listen on + node_exporter['listen_address'] = '0.0.0.0:9100' + redis_exporter['listen_address'] = '0.0.0.0:9121' + ``` + +1. Run `sudo gitlab-ctl reconfigure` to compile the configuration. + +## Example of a minimal configuration with 1 primary, 2 replicas and 3 Sentinels + +In this example we consider that all servers have an internal network +interface with IPs in the `10.0.0.x` range, and that they can connect +to each other using these IPs. + +In real-world usage, you would also set up firewall rules to prevent +unauthorized access from other machines and block traffic from the +outside (Internet). + +We will use the same `3` nodes with **Redis** + **Sentinel** topology +discussed in [Redis setup overview](#redis-setup-overview) and +[Sentinel setup overview](#sentinel-setup-overview) documentation. + +Here is a list and description of each **machine** and the assigned **IP**: + +- `10.0.0.1`: Redis primary + Sentinel 1 +- `10.0.0.2`: Redis Replica 1 + Sentinel 2 +- `10.0.0.3`: Redis Replica 2 + Sentinel 3 +- `10.0.0.4`: GitLab application + +Please note that after the initial configuration, if a failover is initiated +by the Sentinel nodes, the Redis nodes will be reconfigured and the **Primary** +will change permanently (including in `redis.conf`) from one node to the other, +until a new failover is initiated again. 
+ +The same thing will happen with `sentinel.conf` that will be overridden after the +initial execution, after any new sentinel node starts watching the **Primary**, +or a failover promotes a different **Primary** node. + +### Example configuration for Redis primary and Sentinel 1 + +In `/etc/gitlab/gitlab.rb`: + +```ruby +roles ['redis_sentinel_role', 'redis_master_role'] +redis['bind'] = '10.0.0.1' +redis['port'] = 6379 +redis['password'] = 'redis-password-goes-here' +redis['master_name'] = 'gitlab-redis' # must be the same in every sentinel node +redis['master_password'] = 'redis-password-goes-here' # the same value defined in redis['password'] in the primary instance +redis['master_ip'] = '10.0.0.1' # ip of the initial primary redis instance +#redis['master_port'] = 6379 # port of the initial primary redis instance, uncomment to change to non default +sentinel['bind'] = '10.0.0.1' +# sentinel['port'] = 26379 # uncomment to change default port +sentinel['quorum'] = 2 +# sentinel['down_after_milliseconds'] = 10000 +# sentinel['failover_timeout'] = 60000 +``` + +[Reconfigure Omnibus GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. 
+ +### Example configuration for Redis replica 1 and Sentinel 2 + +In `/etc/gitlab/gitlab.rb`: + +```ruby +roles ['redis_sentinel_role', 'redis_replica_role'] +redis['bind'] = '10.0.0.2' +redis['port'] = 6379 +redis['password'] = 'redis-password-goes-here' +redis['master_password'] = 'redis-password-goes-here' +redis['master_ip'] = '10.0.0.1' # IP of primary Redis server +#redis['master_port'] = 6379 # Port of primary Redis server, uncomment to change to non default +redis['master_name'] = 'gitlab-redis' # must be the same in every sentinel node +sentinel['bind'] = '10.0.0.2' +# sentinel['port'] = 26379 # uncomment to change default port +sentinel['quorum'] = 2 +# sentinel['down_after_milliseconds'] = 10000 +# sentinel['failover_timeout'] = 60000 +``` + +[Reconfigure Omnibus GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. + +### Example configuration for Redis replica 2 and Sentinel 3 + +In `/etc/gitlab/gitlab.rb`: + +```ruby +roles ['redis_sentinel_role', 'redis_replica_role'] +redis['bind'] = '10.0.0.3' +redis['port'] = 6379 +redis['password'] = 'redis-password-goes-here' +redis['master_password'] = 'redis-password-goes-here' +redis['master_ip'] = '10.0.0.1' # IP of primary Redis server +#redis['master_port'] = 6379 # Port of primary Redis server, uncomment to change to non default +redis['master_name'] = 'gitlab-redis' # must be the same in every sentinel node +sentinel['bind'] = '10.0.0.3' +# sentinel['port'] = 26379 # uncomment to change default port +sentinel['quorum'] = 2 +# sentinel['down_after_milliseconds'] = 10000 +# sentinel['failover_timeout'] = 60000 +``` + +[Reconfigure Omnibus GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. 
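The three Redis + Sentinel example configurations differ only in the Redis role and the two bind addresses; everything else is shared. As a rough sketch in plain Ruby, the shared and per-node values can be laid out as below. The `node_settings` helper and the hash keys are hypothetical, chosen only for illustration; Omnibus GitLab does not read this structure, and the real settings still go in `/etc/gitlab/gitlab.rb` on each node.

```ruby
# Hypothetical helper, for illustration only: it builds a plain Hash and is
# not read by Omnibus GitLab. Values mirror the example configurations.
PRIMARY_IP = '10.0.0.1'

def node_settings(ip:, primary: false)
  {
    roles: ['redis_sentinel_role', primary ? 'redis_master_role' : 'redis_replica_role'],
    redis_bind: ip,                    # differs per node
    sentinel_bind: ip,                 # differs per node
    redis_master_ip: PRIMARY_IP,       # always the initial primary
    redis_master_name: 'gitlab-redis', # must be the same on every Sentinel node
    sentinel_quorum: 2
  }
end

nodes = {
  '10.0.0.1' => node_settings(ip: '10.0.0.1', primary: true),
  '10.0.0.2' => node_settings(ip: '10.0.0.2'),
  '10.0.0.3' => node_settings(ip: '10.0.0.3')
}

nodes.each { |ip, settings| puts "#{ip}: #{settings[:roles].join(', ')}" }
```

Laying the values out this way makes it easier to spot a node where `redis_master_ip` or `redis_master_name` has drifted from the others.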
+ +### Example configuration for the GitLab application + +In `/etc/gitlab/gitlab.rb`: + +```ruby +redis['master_name'] = 'gitlab-redis' +redis['master_password'] = 'redis-password-goes-here' +gitlab_rails['redis_sentinels'] = [ + {'host' => '10.0.0.1', 'port' => 26379}, + {'host' => '10.0.0.2', 'port' => 26379}, + {'host' => '10.0.0.3', 'port' => 26379} +] +``` + +[Reconfigure Omnibus GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. + +## Advanced configuration + +Omnibus GitLab configures some things behind the curtains to make the sysadmins' +lives easier. If you want to know what happens underneath keep reading. + +### Running multiple Redis clusters + +GitLab supports running [separate Redis clusters for different persistent +classes](https://docs.gitlab.com/omnibus/settings/redis.html#running-with-multiple-redis-instances): +cache, queues, and shared_state. To make this work with Sentinel: + +1. Set the appropriate variable in `/etc/gitlab/gitlab.rb` for each instance you are using: + + ```ruby + gitlab_rails['redis_cache_instance'] = REDIS_CACHE_URL + gitlab_rails['redis_queues_instance'] = REDIS_QUEUES_URL + gitlab_rails['redis_shared_state_instance'] = REDIS_SHARED_STATE_URL + ``` + + **Note**: Redis URLs should be in the format: `redis://:PASSWORD@SENTINEL_PRIMARY_NAME` + + 1. PASSWORD is the plaintext password for the Redis instance + 1. SENTINEL_PRIMARY_NAME is the Sentinel primary name (e.g. `gitlab-redis-cache`) + +1. 
Include an array of hashes with host/port combinations, such as the following: + + ```ruby + gitlab_rails['redis_cache_sentinels'] = [ + { host: REDIS_CACHE_SENTINEL_HOST, port: PORT1 }, + { host: REDIS_CACHE_SENTINEL_HOST2, port: PORT2 } + ] + gitlab_rails['redis_queues_sentinels'] = [ + { host: REDIS_QUEUES_SENTINEL_HOST, port: PORT1 }, + { host: REDIS_QUEUES_SENTINEL_HOST2, port: PORT2 } + ] + gitlab_rails['redis_shared_state_sentinels'] = [ + { host: SHARED_STATE_SENTINEL_HOST, port: PORT1 }, + { host: SHARED_STATE_SENTINEL_HOST2, port: PORT2 } + ] + ``` + +1. Note that for each persistence class, GitLab will default to using the + configuration specified in `gitlab_rails['redis_sentinels']` unless + overridden by the settings above. +1. Be sure to include BOTH configuration options for each persistence class. For example, + if you choose to configure a cache instance, you must specify both `gitlab_rails['redis_cache_instance']` + and `gitlab_rails['redis_cache_sentinels']` for GitLab to generate the proper configuration files. +1. Run `gitlab-ctl reconfigure`. + +### Control running services + +In the previous example, we've used `redis_sentinel_role` and +`redis_master_role`, which reduce the number of configuration changes. 
+ +If you want more control, here is what each one sets for you automatically +when enabled: + +```ruby +## Redis Sentinel Role +redis_sentinel_role['enable'] = true + +# When Sentinel Role is enabled, the following services are also enabled +sentinel['enable'] = true + +# The following services are disabled +redis['enable'] = false +bootstrap['enable'] = false +nginx['enable'] = false +postgresql['enable'] = false +gitlab_rails['enable'] = false +mailroom['enable'] = false + +------- + +## Redis primary/replica Role +redis_master_role['enable'] = true # enable only one of them +redis_replica_role['enable'] = true # enable only one of them + +# When Redis primary or Replica role are enabled, the following services are +# enabled/disabled. Note that if Redis and Sentinel roles are combined, both +# services will be enabled. + +# The following services are disabled +sentinel['enable'] = false +bootstrap['enable'] = false +nginx['enable'] = false +postgresql['enable'] = false +gitlab_rails['enable'] = false +mailroom['enable'] = false + +# For Redis Replica role, also change this setting from default 'true' to 'false': +redis['master'] = false +``` + +You can find the relevant attributes defined in [`gitlab_rails.rb`](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/files/gitlab-cookbooks/gitlab/libraries/gitlab_rails.rb). + +## Troubleshooting + +See the [Redis troubleshooting guide](troubleshooting.md). + +## Further reading + +Read more: + +1. [Reference architectures](../reference_architectures/index.md) +1. [Configure the database](../postgresql/replication_and_failover.md) +1. [Configure NFS](../high_availability/nfs.md) +1. [Configure the GitLab application servers](../high_availability/gitlab.md) +1. 
[Configure the load balancers](../high_availability/load_balancer.md) diff --git a/doc/administration/redis/replication_and_failover_external.md b/doc/administration/redis/replication_and_failover_external.md new file mode 100644 index 00000000000..244b44dd76a --- /dev/null +++ b/doc/administration/redis/replication_and_failover_external.md @@ -0,0 +1,376 @@ +--- +type: howto +stage: Enablement +group: Distribution +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers +--- + +# Redis replication and failover providing your own instance **(CORE ONLY)** + +If you’re hosting GitLab on a cloud provider, you can optionally use a managed +service for Redis. For example, AWS offers ElastiCache that runs Redis. + +Alternatively, you may opt to manage your own Redis instance separate from the +Omnibus GitLab package. + +## Requirements + +The following are the requirements for providing your own Redis instance: + +- Redis version 5.0 or higher is recommended, as this is what ships with + Omnibus GitLab packages starting with GitLab 12.7. +- Support for Redis 3.2 is deprecated with GitLab 12.10 and will be completely + removed in GitLab 13.0. +- GitLab 12.0 and later requires Redis version 3.2 or higher. Older Redis + versions do not support an optional count argument to SPOP which is now + required for [Merge Trains](../../ci/merge_request_pipelines/pipelines_for_merged_results/merge_trains/index.md). +- In addition, if Redis 4 or later is available, GitLab makes use of certain + commands like `UNLINK` and `USAGE` which were introduced only in Redis 4. +- Standalone Redis or Redis high availability with Sentinel are supported. Redis + Cluster is not supported. +- Managed Redis from cloud providers such as AWS ElastiCache will work. If these + services support high availability, be sure it is **not** the Redis Cluster type. 
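The version requirements above can be sketched as a small check. This is a minimal illustration, not part of GitLab: the method name `redis_version_status` and the returned symbols are hypothetical, and the thresholds are simply read off the list (3.2 minimum, 3.2 deprecated, 5.0 recommended). The version string is assumed to come from the `redis_version` field reported by `redis-cli info server`.

```ruby
require 'rubygems' # provides Gem::Version (loaded by default in most Ruby setups)

# Hypothetical helper: classify a Redis version string against the
# requirements listed above. Names and return values are illustrative only.
def redis_version_status(version)
  v = Gem::Version.new(version)
  return :unsupported if v < Gem::Version.new('3.2') # GitLab 12.0+ requires 3.2 or higher
  return :deprecated  if v < Gem::Version.new('4.0') # Redis 3.2 support is deprecated in 12.10
  return :supported   if v < Gem::Version.new('5.0') # usable, but 5.0 or higher is recommended
  :recommended
end
```

For example, `redis_version_status('3.2.12')` falls in the deprecated range, while any 5.x version is classified as recommended.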
+ +Note the Redis node's IP address or hostname, port, and password (if required). + +## Redis as a managed service in a cloud provider + +1. Set up Redis according to the [requirements](#requirements). +1. Configure the GitLab application servers with the appropriate connection details + for your external Redis service in your `/etc/gitlab/gitlab.rb` file: + + ```ruby + redis['enable'] = false + + gitlab_rails['redis_host'] = 'redis.example.com' + gitlab_rails['redis_port'] = 6379 + + # Required if Redis authentication is configured on the Redis node + gitlab_rails['redis_password'] = 'Redis Password' + ``` + +1. Reconfigure for the changes to take effect: + + ```shell + sudo gitlab-ctl reconfigure + ``` + +## Redis replication and failover with your own Redis servers + +This is the documentation for configuring a scalable Redis setup when +you have installed Redis yourself and are not using the bundled one that +comes with the Omnibus packages, although using the Omnibus GitLab packages is +highly recommended as we optimize them specifically for GitLab, and we will take +care of upgrading Redis to the latest supported version. + +Note also that you may elect to override all references to +`/home/git/gitlab/config/resque.yml` in accordance with the advanced Redis +settings outlined in +[Configuration Files Documentation](https://gitlab.com/gitlab-org/gitlab/blob/master/config/README.md). + +We cannot stress enough the importance of reading the +[replication and failover](replication_and_failover.md) documentation of the +Omnibus Redis HA as it provides some invaluable information for the configuration +of Redis. Please read it before going forward with this guide. + +Before setting up the new Redis instances, here are some +requirements: + +- All Redis servers in this guide must be configured to use a TCP connection + instead of a socket. 
To configure Redis to use TCP connections you need to + define both `bind` and `port` in the Redis config file. You can bind to all + interfaces (`0.0.0.0`) or specify the IP of the desired interface + (e.g., one from an internal network). +- Since Redis 3.2, you must define a password to receive external connections + (`requirepass`). +- If you are using Redis with Sentinel, you will also need to define the same + password for the replica password definition (`masterauth`) in the same instance. + +In addition, read the prerequisites as described in the +[Omnibus Redis document](replication_and_failover.md#requirements) since they provide some +valuable information for the general setup. + +### Step 1. Configuring the primary Redis instance + +Assuming that the Redis primary instance IP is `10.0.0.1`: + +1. [Install Redis](../../install/installation.md#7-redis). +1. Edit `/etc/redis/redis.conf`: + + ```conf + ## Define a `bind` address pointing to a local IP that your other machines + ## can reach you. If you really need to bind to an external accessible IP, make + ## sure you add extra firewall rules to prevent unauthorized access: + bind 10.0.0.1 + + ## Define a `port` to force redis to listen on TCP so other machines can + ## connect to it (default port is `6379`). + port 6379 + + ## Set up password authentication (use the same password in all nodes). + ## The password should be defined equal for both `requirepass` and `masterauth` + ## when setting up Redis to use with Sentinel. + requirepass redis-password-goes-here + masterauth redis-password-goes-here + ``` + +1. Restart the Redis service for the changes to take effect. + +### Step 2. Configuring the replica Redis instances + +Assuming that the Redis replica instance IP is `10.0.0.2`: + +1. [Install Redis](../../install/installation.md#7-redis). +1. Edit `/etc/redis/redis.conf`: + + ```conf + ## Define a `bind` address pointing to a local IP that your other machines + ## can reach you. 
If you really need to bind to an external accessible IP, make + ## sure you add extra firewall rules to prevent unauthorized access: + bind 10.0.0.2 + + ## Define a `port` to force redis to listen on TCP so other machines can + ## connect to it (default port is `6379`). + port 6379 + + ## Set up password authentication (use the same password in all nodes). + ## The password should be defined equal for both `requirepass` and `masterauth` + ## when setting up Redis to use with Sentinel. + requirepass redis-password-goes-here + masterauth redis-password-goes-here + + ## Define `replicaof` pointing to the Redis primary instance with IP and port. + replicaof 10.0.0.1 6379 + ``` + +1. Restart the Redis service for the changes to take effect. +1. Go through the steps again for all the other replica nodes. + +### Step 3. Configuring the Redis Sentinel instances + +Sentinel is a special type of Redis server. It inherits most of the basic +configuration options you can define in `redis.conf`, with specific ones +starting with the `sentinel` prefix. + +Assuming that the Redis Sentinel is installed on the same instance as Redis +primary with IP `10.0.0.1` (some settings might overlap with the primary): + +1. [Install Redis Sentinel](https://redis.io/topics/sentinel). +1. Edit `/etc/redis/sentinel.conf`: + + ```conf + ## Define a `bind` address pointing to a local IP that your other machines + ## can reach you. If you really need to bind to an external accessible IP, make + ## sure you add extra firewall rules to prevent unauthorized access: + bind 10.0.0.1 + + ## Define a `port` to force Sentinel to listen on TCP so other machines can + ## connect to it (default port is `26379`). + port 26379 + + ## Set up password authentication (use the same password in all nodes). + ## The password should be defined equal for both `requirepass` and `masterauth` + ## when setting up Redis to use with Sentinel. 
+ requirepass redis-password-goes-here + masterauth redis-password-goes-here + + ## Define with `sentinel auth-pass` the same shared password you have + ## defined for both Redis primary and replica instances. + sentinel auth-pass gitlab-redis redis-password-goes-here + + ## Define with `sentinel monitor` the IP and port of the Redis + ## primary node, and the quorum required to start a failover. + sentinel monitor gitlab-redis 10.0.0.1 6379 2 + + ## Define with `sentinel down-after-milliseconds` the time in `ms` + ## that an unresponsive server will be considered down. + sentinel down-after-milliseconds gitlab-redis 10000 + + ## Define a value for `sentinel failover-timeout` in `ms`. This has multiple + ## meanings: + ## + ## * The time needed to re-start a failover after a previous failover was + ## already tried against the same primary by a given Sentinel, is two + ## times the failover timeout. + ## + ## * The time needed for a replica replicating to a wrong primary according + ## to a Sentinel current configuration, to be forced to replicate + ## with the right primary, is exactly the failover timeout (counting since + ## the moment a Sentinel detected the misconfiguration). + ## + ## * The time needed to cancel a failover that is already in progress but + ## did not produce any configuration change (REPLICAOF NO ONE yet not + ## acknowledged by the promoted replica). + ## + ## * The maximum time a failover in progress waits for all the replicas to be + ## reconfigured as replicas of the new primary. However even after this time + ## the replicas will be reconfigured by the Sentinels anyway, but not with + ## the exact parallel-syncs progression as specified. + sentinel failover-timeout gitlab-redis 30000 + ``` + +1. Restart the Redis service for the changes to take effect. +1. Go through the steps again for all the other Sentinel nodes. + +### Step 4. 
Configuring the GitLab application + +You can enable or disable Sentinel support at any time in new or existing +installations. From the GitLab application perspective, all it requires is +the correct credentials for the Sentinel nodes. + +While it doesn't require a list of all Sentinel nodes, in case of a failure, +it needs to access at least one of the listed ones. + +The following steps should be performed on the [GitLab application server](../high_availability/gitlab.md) +which ideally should not have Redis or Sentinels on the same machine: + +1. Edit `/home/git/gitlab/config/resque.yml` following the example in + [resque.yml.example](https://gitlab.com/gitlab-org/gitlab/blob/master/config/resque.yml.example), and uncomment the Sentinel lines, pointing to + the correct server credentials: + + ```yaml + # resque.yml + production: + url: redis://:redis-password-goes-here@gitlab-redis/ + sentinels: + - + host: 10.0.0.1 + port: 26379 # point to sentinel, not to redis port + - + host: 10.0.0.2 + port: 26379 # point to sentinel, not to redis port + - + host: 10.0.0.3 + port: 26379 # point to sentinel, not to redis port + ``` + +1. [Restart GitLab](../restart_gitlab.md#installations-from-source) for the changes to take effect. + +## Example of minimal configuration with 1 primary, 2 replicas and 3 sentinels + +In this example we consider that all servers have an internal network +interface with IPs in the `10.0.0.x` range, and that they can connect +to each other using these IPs. + +In real-world usage, you would also set up firewall rules to prevent +unauthorized access from other machines, and block traffic from the +outside ([Internet](https://gitlab.com/gitlab-org/gitlab-foss/uploads/c4cc8cd353604bd80315f9384035ff9e/The_Internet_IT_Crowd.png)). + +For this example, **Sentinel 1** will be configured on the same machine as the +**Redis Primary**, **Sentinel 2** and **Sentinel 3** on the same machines as +**Replica 1** and **Replica 2** respectively. 
+ +Here is a list and description of each **machine** and the assigned **IP**: + +- `10.0.0.1`: Redis Primary + Sentinel 1 +- `10.0.0.2`: Redis Replica 1 + Sentinel 2 +- `10.0.0.3`: Redis Replica 2 + Sentinel 3 +- `10.0.0.4`: GitLab application + +Please note that after the initial configuration, if a failover is initiated +by the Sentinel nodes, the Redis nodes will be reconfigured and the **Primary** +will change permanently (including in `redis.conf`) from one node to the other, +until a new failover is initiated. + +The same thing will happen with `sentinel.conf`, which will be overwritten after the +initial execution, after any new sentinel node starts watching the **Primary**, +or after a failover promotes a different **Primary** node. + +### Example configuration for Redis primary and Sentinel 1 + +1. In `/etc/redis/redis.conf`: + + ```conf + bind 10.0.0.1 + port 6379 + requirepass redis-password-goes-here + masterauth redis-password-goes-here + ``` + +1. In `/etc/redis/sentinel.conf`: + + ```conf + bind 10.0.0.1 + port 26379 + sentinel auth-pass gitlab-redis redis-password-goes-here + sentinel monitor gitlab-redis 10.0.0.1 6379 2 + sentinel down-after-milliseconds gitlab-redis 10000 + sentinel failover-timeout gitlab-redis 30000 + ``` + +1. Restart the Redis service for the changes to take effect. + +### Example configuration for Redis replica 1 and Sentinel 2 + +1. In `/etc/redis/redis.conf`: + + ```conf + bind 10.0.0.2 + port 6379 + requirepass redis-password-goes-here + masterauth redis-password-goes-here + replicaof 10.0.0.1 6379 + ``` + +1. In `/etc/redis/sentinel.conf`: + + ```conf + bind 10.0.0.2 + port 26379 + sentinel auth-pass gitlab-redis redis-password-goes-here + sentinel monitor gitlab-redis 10.0.0.1 6379 2 + sentinel down-after-milliseconds gitlab-redis 10000 + sentinel failover-timeout gitlab-redis 30000 + ``` + +1. Restart the Redis service for the changes to take effect. + +### Example configuration for Redis replica 2 and Sentinel 3 + +1. 
In `/etc/redis/redis.conf`: + + ```conf + bind 10.0.0.3 + port 6379 + requirepass redis-password-goes-here + masterauth redis-password-goes-here + replicaof 10.0.0.1 6379 + ``` + +1. In `/etc/redis/sentinel.conf`: + + ```conf + bind 10.0.0.3 + port 26379 + sentinel auth-pass gitlab-redis redis-password-goes-here + sentinel monitor gitlab-redis 10.0.0.1 6379 2 + sentinel down-after-milliseconds gitlab-redis 10000 + sentinel failover-timeout gitlab-redis 30000 + ``` + +1. Restart the Redis service for the changes to take effect. + +### Example configuration of the GitLab application + +1. Edit `/home/git/gitlab/config/resque.yml`: + + ```yaml + production: + url: redis://:redis-password-goes-here@gitlab-redis/ + sentinels: + - + host: 10.0.0.1 + port: 26379 # point to sentinel, not to redis port + - + host: 10.0.0.2 + port: 26379 # point to sentinel, not to redis port + - + host: 10.0.0.3 + port: 26379 # point to sentinel, not to redis port + ``` + +1. [Restart GitLab](../restart_gitlab.md#installations-from-source) for the changes to take effect. + +## Troubleshooting + +See the [Redis troubleshooting guide](troubleshooting.md). diff --git a/doc/administration/redis/standalone.md b/doc/administration/redis/standalone.md new file mode 100644 index 00000000000..12e932dbc5e --- /dev/null +++ b/doc/administration/redis/standalone.md @@ -0,0 +1,63 @@ +--- +type: howto +stage: Enablement +group: Distribution +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers +--- + +# Standalone Redis using Omnibus GitLab **(CORE ONLY)** + +The Omnibus GitLab package can be used to configure a standalone Redis server. +In this configuration, Redis is not scaled, and represents a single +point of failure. However, in a scaled environment the objective is to allow +the environment to handle more users or to increase throughput. Redis itself 
Redis itself +is generally stable and can handle many requests, so it is an acceptable +trade off to have only a single instance. See the [reference architectures](../reference_architectures/index.md) +page for an overview of GitLab scaling options. + +## Set up a standalone Redis instance + +The steps below are the minimum necessary to configure a Redis server with +Omnibus GitLab: + +1. SSH into the Redis server. +1. [Download and install](https://about.gitlab.com/install/) the Omnibus GitLab + package you want by using **steps 1 and 2** from the GitLab downloads page. + Do not complete any other steps on the download page. + +1. Edit `/etc/gitlab/gitlab.rb` and add the contents: + + ```ruby + ## Enable Redis + redis['enable'] = true + + ## Disable all other services + sidekiq['enable'] = false + gitlab_workhorse['enable'] = false + puma['enable'] = false + postgresql['enable'] = false + nginx['enable'] = false + prometheus['enable'] = false + alertmanager['enable'] = false + pgbouncer_exporter['enable'] = false + gitlab_exporter['enable'] = false + gitaly['enable'] = false + + redis['bind'] = '0.0.0.0' + redis['port'] = 6379 + redis['password'] = 'SECRET_PASSWORD_HERE' + + gitlab_rails['enable'] = false + ``` + +1. [Reconfigure Omnibus GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. +1. Note the Redis node's IP address or hostname, port, and + Redis password. These will be necessary when configuring the GitLab + application servers later. + +[Advanced configuration options](https://docs.gitlab.com/omnibus/settings/redis.html) +are supported and can be added if needed. + +## Troubleshooting + +See the [Redis troubleshooting guide](troubleshooting.md). 
diff --git a/doc/administration/redis/troubleshooting.md b/doc/administration/redis/troubleshooting.md new file mode 100644 index 00000000000..402b60e5b7b --- /dev/null +++ b/doc/administration/redis/troubleshooting.md @@ -0,0 +1,158 @@ +--- +type: reference +stage: Enablement +group: Distribution +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers +--- + +# Troubleshooting Redis + +There are a lot of moving parts that need to be handled carefully +for the HA setup to work as expected. + +Before proceeding with the troubleshooting below, check your firewall rules: + +- Redis machines + - Accept TCP connections on `6379` + - Connect to the other Redis machines via TCP on `6379` +- Sentinel machines + - Accept TCP connections on `26379` + - Connect to other Sentinel machines via TCP on `26379` + - Connect to the Redis machines via TCP on `6379` + +## Troubleshooting Redis replication + +You can check if everything is correct by connecting to each server using the +`redis-cli` application, and sending the `info replication` command as below. 
+ +```shell +/opt/gitlab/embedded/bin/redis-cli -h <redis-host-or-ip> -a '<redis-password>' info replication +``` + +When connected to a `Primary` Redis, you will see the number of connected +`replicas`, and a list of each with connection details: + +```plaintext +# Replication +role:master +connected_replicas:1 +replica0:ip=10.133.5.21,port=6379,state=online,offset=208037514,lag=1 +master_repl_offset:208037658 +repl_backlog_active:1 +repl_backlog_size:1048576 +repl_backlog_first_byte_offset:206989083 +repl_backlog_histlen:1048576 +``` + +When it's a `replica`, you will see details of the primary connection and whether +it's `up` or `down`: + +```plaintext +# Replication +role:replica +master_host:10.133.1.58 +master_port:6379 +master_link_status:up +master_last_io_seconds_ago:1 +master_sync_in_progress:0 +replica_repl_offset:208096498 +replica_priority:100 +replica_read_only:1 +connected_replicas:0 +master_repl_offset:0 +repl_backlog_active:0 +repl_backlog_size:1048576 +repl_backlog_first_byte_offset:0 +repl_backlog_histlen:0 +``` + +## Troubleshooting Sentinel + +If you get an error like: `Redis::CannotConnectError: No sentinels available.`, +there may be something wrong with your configuration files or it can be related +to [this issue](https://github.com/redis/redis-rb/issues/531). + +You must make sure you are defining the same value in `redis['master_name']` +and `redis['master_password']` as you defined for your sentinel node. + +The way the Redis connector `redis-rb` works with sentinel is a bit +non-intuitive. We try to hide the complexity in omnibus, but it still requires +a few extra configurations. + +--- + +To make sure your configuration is correct: + +1. SSH into your GitLab application server +1. Enter the Rails console: + + ```shell + # For Omnibus installations + sudo gitlab-rails console + + # For source installations + sudo -u git rails console -e production + ``` + +1. 
Run in the console: + + ```ruby + redis = Redis.new(Gitlab::Redis::SharedState.params) + redis.info + ``` + + Keep this screen open and try to simulate a failover below. + +1. To simulate a failover on primary Redis, SSH into the Redis server and run: + + ```shell + # port must match your primary redis port, and the sleep time must be a few seconds longer than the configured down-after-milliseconds value + redis-cli -h localhost -p 6379 DEBUG sleep 20 + ``` + +1. Then back in the Rails console from the earlier step, run: + + ```ruby + redis.info + ``` + + You should see a different port after a delay of a few seconds + (the failover/reconnect time). + +## Troubleshooting a non-bundled Redis with an installation from source + +If you get an error in GitLab like `Redis::CannotConnectError: No sentinels available.`, +there may be something wrong with your configuration files or it can be related +to [this upstream issue](https://github.com/redis/redis-rb/issues/531). + +You must make sure that `resque.yml` and `sentinel.conf` are configured correctly, +otherwise `redis-rb` will not work properly. + +The `master-group-name` (`gitlab-redis`) defined in `sentinel.conf` +**must** be used as the hostname in GitLab (`resque.yml`): + +```conf +# sentinel.conf: +sentinel monitor gitlab-redis 10.0.0.1 6379 2 +sentinel down-after-milliseconds gitlab-redis 10000 +sentinel config-epoch gitlab-redis 0 +sentinel leader-epoch gitlab-redis 0 +``` + +```yaml +# resque.yml +production: + url: redis://:myredispassword@gitlab-redis/ + sentinels: + - + host: 10.0.0.1 + port: 26379 # point to sentinel, not to redis port + - + host: 10.0.0.2 + port: 26379 # point to sentinel, not to redis port + - + host: 10.0.0.3 + port: 26379 # point to sentinel, not to redis port +``` + +When in doubt, read the [Redis Sentinel documentation](https://redis.io/topics/sentinel).
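The naming rule above (the group name in `sentinel.conf` must equal the hostname in the `resque.yml` URL) can be checked mechanically. The following is a rough sketch using only the Ruby standard library; the helper names are hypothetical and not part of GitLab or `redis-rb`, and it assumes both files have already been read into strings.

```ruby
require 'yaml'
require 'uri'

# Hypothetical helper: collect the group names from lines of the form
# `sentinel monitor <name> <ip> <port> <quorum>`.
def sentinel_master_names(sentinel_conf)
  sentinel_conf.each_line.map { |line| line[/^\s*sentinel\s+monitor\s+(\S+)/, 1] }.compact
end

# Hypothetical helper: host part of the `url` for one environment in resque.yml.
def resque_master_name(resque_yaml, env = 'production')
  URI.parse(YAML.safe_load(resque_yaml).fetch(env).fetch('url')).host
end

# True when the hostname used by GitLab is one of the monitored group names.
def master_name_matches?(sentinel_conf, resque_yaml)
  sentinel_master_names(sentinel_conf).include?(resque_master_name(resque_yaml))
end
```

A possible use is `master_name_matches?(File.read('/etc/redis/sentinel.conf'), File.read('/home/git/gitlab/config/resque.yml'))`, run on a machine where both files are readable. A `false` result points at a name mismatch, but a `true` result does not rule out other causes such as firewall rules or a wrong password.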