# Running Praefect

This document describes how to run Praefect.

## Failover

There are two ways to fail over from one internal Gitaly node to another as the primary: manually or automatically.

As an example, the following config.toml defines one virtual storage named "default" with two internal Gitaly nodes behind it.
One node is designated the primary: read and write traffic goes to `internal_storage_0`, and writes
are replicated to `internal_storage_1`.

```toml
socket_path = "/path/to/praefect.socket"

# failover_enabled will enable automatic failover
failover_enabled = false

[logging]
format = "json"
level = "info"

[[virtual_storage]]
name = "default"

[[virtual_storage.node]]
  name = "internal_storage_0"
  address = "tcp://localhost:9999"
  primary = true
  token = "supersecret"

[[virtual_storage.node]]
  name = "internal_storage_1"
  address = "tcp://localhost:9998"
  token = "supersecret"
```
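To make the two roles concrete, here is a small Python sketch of how the configuration above maps to traffic. It is illustrative only and is not Praefect's code: the dictionary mirrors the `[[virtual_storage]]` table (tokens omitted), `plan_traffic` is a hypothetical helper, and the check that exactly one node is marked primary is an assumption made for this example.

```python
# Illustrative sketch only; not Praefect's implementation.
# The dict mirrors the [[virtual_storage]] table from the config.toml above.
virtual_storage = {
    "name": "default",
    "node": [
        {"name": "internal_storage_0", "address": "tcp://localhost:9999", "primary": True},
        {"name": "internal_storage_1", "address": "tcp://localhost:9998"},
    ],
}


def plan_traffic(storage):
    """Return (primary, replication_targets) for one virtual storage.

    Assumption for this sketch: exactly one node is marked `primary = true`.
    """
    primaries = [n["name"] for n in storage["node"] if n.get("primary", False)]
    if len(primaries) != 1:
        raise ValueError(f"expected exactly one primary, found {len(primaries)}")
    secondaries = [n["name"] for n in storage["node"] if not n.get("primary", False)]
    return primaries[0], secondaries


primary, replicas = plan_traffic(virtual_storage)
print(f"reads and writes go to {primary}; writes are replicated to {replicas}")
```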

### Manual Failover

A manual failover switches the primary from one internal Gitaly node to another. Unless `failover_enabled` is set to `true`
in the config.toml, a manual failover is the only way to change the primary.


1. Edit config.toml, moving `primary = true` from the current primary's `[[virtual_storage.node]]` entry to another one:
```toml
[[virtual_storage.node]]
  name = "internal_storage_0"
  address = "tcp://localhost:9999"
  # no longer the primary
  token = "supersecret"

[[virtual_storage.node]]
  name = "internal_storage_1"
  address = "tcp://localhost:9998"
  # this is the new primary
  primary = true
  token = "supersecret"
```

1. Restart Praefect. After the restart, Praefect sends write traffic to `internal_storage_1`. `internal_storage_0` is now the secondary,
and replication jobs are created to replicate repository data **from** `internal_storage_1` to `internal_storage_0`.
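The config edit in the first step can be sketched in Python as follows. This is a hypothetical helper working on an in-memory list of node dictionaries, not a Praefect tool; the name `move_primary` and the dictionary layout are assumptions for illustration.

```python
import copy


def move_primary(nodes, new_primary):
    """Return a copy of `nodes` with the primary flag moved to `new_primary`."""
    if new_primary not in [n["name"] for n in nodes]:
        raise ValueError(f"unknown node: {new_primary}")
    updated = copy.deepcopy(nodes)
    for node in updated:
        node.pop("primary", None)      # clear the flag on every node
        if node["name"] == new_primary:
            node["primary"] = True     # set it on the chosen node only
    return updated


nodes = [
    {"name": "internal_storage_0", "address": "tcp://localhost:9999", "primary": True},
    {"name": "internal_storage_1", "address": "tcp://localhost:9998"},
]
after = move_primary(nodes, "internal_storage_1")
print([n["name"] for n in after if n.get("primary")])
```

The helper returns a new list rather than editing in place, so the previous configuration stays available if the change has to be reverted.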

### Automatic Failover

When `failover_enabled` is set to `true` in the config.toml, Praefect automatically monitors the health of the
internal Gitaly nodes. If a certain number of health checks against the primary fail, Praefect promotes one of the
secondaries to primary and demotes the old primary to secondary.

```toml
# failover_enabled turns on automatic failover
failover_enabled = true

[[virtual_storage.node]]
  name = "internal_storage_0"
  address = "tcp://localhost:9999"
  primary = true
  token = "supersecret"

[[virtual_storage.node]]
  name = "internal_storage_1"
  address = "tcp://localhost:9998"
  token = "supersecret"
```

The diagram below shows the state when Praefect starts up with the config.toml above:

```mermaid
graph TD
  A[Praefect] -->|Mutator RPC| B(internal_storage_0)
  B --> |Replication|C[internal_storage_1]
```

Suppose `internal_storage_0` suddenly goes down. Praefect detects this and
automatically switches over to `internal_storage_1`, and `internal_storage_0` becomes a secondary:

```mermaid
graph TD
  A[Praefect] -->|Mutator RPC| B(internal_storage_1)
  B --> |Replication|C[internal_storage_0]
```
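The promotion logic described above can be modeled with a toy Python class. This is a sketch under stated assumptions, not Praefect's actual algorithm: the class `FailoverElector`, the threshold of three, counting *consecutive* failures, and picking the first secondary with no recorded failures are all choices made for this example.

```python
class FailoverElector:
    """Toy model of health-check-driven failover for one virtual storage."""

    def __init__(self, nodes, primary, failure_threshold=3):
        self.nodes = list(nodes)
        self.primary = primary
        # Hypothetical value: the text only says "a certain number" of failures.
        self.failure_threshold = failure_threshold
        self.failures = {name: 0 for name in self.nodes}

    def record_check(self, name, healthy):
        """Record one health check result and fail over if the primary is down."""
        # Assumption: a successful check resets the consecutive-failure count.
        self.failures[name] = 0 if healthy else self.failures[name] + 1
        if name == self.primary and self.failures[name] >= self.failure_threshold:
            self._fail_over()

    def _fail_over(self):
        # Promote the first secondary with no recorded failures, if there is one.
        for candidate in self.nodes:
            if candidate != self.primary and self.failures[candidate] == 0:
                self.primary = candidate
                return
        # No healthy secondary found: keep the current primary.


elector = FailoverElector(
    ["internal_storage_0", "internal_storage_1"], primary="internal_storage_0"
)
for _ in range(3):
    elector.record_check("internal_storage_1", healthy=True)
    elector.record_check("internal_storage_0", healthy=False)
print(elector.primary)  # internal_storage_1
```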

NOTE: This feature is currently supported only for setups with a single Praefect instance. If two or more
Praefect instances are running, for example behind a load balancer, `failover_enabled` should be disabled.
There is currently no coordination across Praefect instances, so two instances could
each consider a different Gitaly node to be the primary.
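The following Python sketch illustrates how that disagreement can arise. It is a simplified model, not Praefect's code: `elect_primary` is a hypothetical function, and each instance's `reachable` set stands in for the results of its own health checks.

```python
def elect_primary(current_primary, nodes, reachable):
    """Keep the current primary if this instance can reach it; otherwise
    pick the first reachable node. `reachable` is this instance's own view."""
    if current_primary in reachable:
        return current_primary
    for name in nodes:
        if name in reachable:
            return name
    return current_primary


nodes = ["internal_storage_0", "internal_storage_1"]

# Praefect A briefly loses its connection to internal_storage_0; Praefect B does not.
view_a = {"internal_storage_1"}
view_b = {"internal_storage_0", "internal_storage_1"}

primary_a = elect_primary("internal_storage_0", nodes, view_a)
primary_b = elect_primary("internal_storage_0", nodes, view_b)
print(primary_a, primary_b)  # the two instances now disagree
```

Because each instance decides from its own view only, nothing stops them from sending writes to different nodes at the same time.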