Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitlab-foss.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
path: root/doc
diff options
context:
space:
mode:
authorGitLab Bot <gitlab-bot@gitlab.com>2021-03-04 15:11:11 +0300
committerGitLab Bot <gitlab-bot@gitlab.com>2021-03-04 15:11:11 +0300
commitd5d47b45ddddcef0f8fc80a35ca7a8a2a0765fd1 (patch)
tree3a595b7b17458fca99dcf3e808dc4c4c49ac0620 /doc
parentb0a5a92e8349ef7b6284f7e5571620e21bed1cad (diff)
Add latest changes from gitlab-org/gitlab@master
Diffstat (limited to 'doc')
-rw-r--r--doc/administration/reference_architectures/10k_users.md40
-rw-r--r--doc/administration/reference_architectures/25k_users.md829
-rw-r--r--doc/administration/reference_architectures/3k_users.md783
-rw-r--r--doc/administration/reference_architectures/50k_users.md843
-rw-r--r--doc/administration/reference_architectures/5k_users.md784
-rw-r--r--doc/api/graphql/reference/gitlab_schema.json2
-rw-r--r--doc/api/graphql/reference/index.md22
-rw-r--r--doc/api/vulnerability_findings.md2
-rw-r--r--doc/ci/pipelines/index.md4
-rw-r--r--doc/ci/yaml/README.md444
-rw-r--r--doc/integration/elasticsearch.md2
-rw-r--r--doc/raketasks/backup_restore.md2
-rw-r--r--doc/user/admin_area/settings/continuous_integration.md7
13 files changed, 2492 insertions, 1272 deletions
diff --git a/doc/administration/reference_architectures/10k_users.md b/doc/administration/reference_architectures/10k_users.md
index d6a38e1b713..cdc9baf7573 100644
--- a/doc/administration/reference_architectures/10k_users.md
+++ b/doc/administration/reference_architectures/10k_users.md
@@ -17,21 +17,21 @@ full list of reference architectures, see
| Service | Nodes | Configuration | GCP | AWS | Azure |
|--------------------------------------------|-------------|-------------------------|-----------------|-------------|----------|
-| External load balancing node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| Consul | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| PostgreSQL | 3 | 8 vCPU, 30 GB memory | n1-standard-8 | `m5.2xlarge` | D8s v3 |
-| PgBouncer | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| Internal load balancing node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| Redis - Cache | 3 | 4 vCPU, 15 GB memory | n1-standard-4 | `m5.xlarge` | D4s v3 |
-| Redis - Queues / Shared State | 3 | 4 vCPU, 15 GB memory | n1-standard-4 | `m5.xlarge` | D4s v3 |
-| Redis Sentinel - Cache | 3 | 1 vCPU, 1.7 GB memory | g1-small | `t3.small` | B1MS |
-| Redis Sentinel - Queues / Shared State | 3 | 1 vCPU, 1.7 GB memory | g1-small | `t3.small` | B1MS |
-| Gitaly Cluster | 3 | 16 vCPU, 60 GB memory | n1-standard-16 | `m5.4xlarge` | D16s v3 |
-| Praefect | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| Praefect PostgreSQL | 1+* | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| Sidekiq | 4 | 4 vCPU, 15 GB memory | n1-standard-4 | `m5.xlarge` | D4s v3 |
-| GitLab Rails | 3 | 32 vCPU, 28.8 GB memory | n1-highcpu-32 | `c5.9xlarge` | F32s v2 |
-| Monitoring node | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | `c5.xlarge` | F4s v2 |
+| External load balancing node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Consul | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| PostgreSQL | 3 | 8 vCPU, 30 GB memory | n1-standard-8 | m5.2xlarge | D8s v3 |
+| PgBouncer | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Internal load balancing node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Redis - Cache | 3 | 4 vCPU, 15 GB memory | n1-standard-4 | m5.xlarge | D4s v3 |
+| Redis - Queues / Shared State | 3 | 4 vCPU, 15 GB memory | n1-standard-4 | m5.xlarge | D4s v3 |
+| Redis Sentinel - Cache | 3 | 1 vCPU, 1.7 GB memory | g1-small | t3.small | B1MS |
+| Redis Sentinel - Queues / Shared State | 3 | 1 vCPU, 1.7 GB memory | g1-small | t3.small | B1MS |
+| Gitaly | 3 | 16 vCPU, 60 GB memory | n1-standard-16 | m5.4xlarge | D16s v3 |
+| Praefect | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Praefect PostgreSQL | 1+* | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Sidekiq | 4 | 4 vCPU, 15 GB memory | n1-standard-4 | m5.xlarge | D4s v3 |
+| GitLab Rails | 3 | 32 vCPU, 28.8 GB memory | n1-highcpu-32 | c5.9xlarge | F32s v2 |
+| Monitoring node | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | c5.xlarge | F4s v2 |
| Object storage | n/a | n/a | n/a | n/a | n/a |
| NFS server | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | `c5.xlarge` | F4s v2 |
@@ -206,7 +206,7 @@ The following list includes descriptions of each server and its assigned IP:
- `10.6.0.111`: GitLab application 1
- `10.6.0.112`: GitLab application 2
- `10.6.0.113`: GitLab application 3
-- `10.6.0.121`: Prometheus
+- `10.6.0.151`: Prometheus
## Configure the external load balancer
@@ -1927,7 +1927,7 @@ To configure the Sidekiq nodes, on each one:
node_exporter['listen_address'] = '0.0.0.0:9100'
# Rails Status for prometheus
- gitlab_rails['monitoring_whitelist'] = ['10.6.0.121/32', '127.0.0.0/8']
+ gitlab_rails['monitoring_whitelist'] = ['10.6.0.151/32', '127.0.0.0/8']
#############################
### Object storage ###
@@ -2055,8 +2055,8 @@ On each node perform the following:
# Add the monitoring node's IP address to the monitoring whitelist and allow it to
# scrape the NGINX metrics
- gitlab_rails['monitoring_whitelist'] = ['10.6.0.121/32', '127.0.0.0/8']
- nginx['status']['options']['allow'] = ['10.6.0.121/32', '127.0.0.0/8']
+ gitlab_rails['monitoring_whitelist'] = ['10.6.0.151/32', '127.0.0.0/8']
+ nginx['status']['options']['allow'] = ['10.6.0.151/32', '127.0.0.0/8']
#############################
### Object storage ###
@@ -2192,7 +2192,7 @@ running [Prometheus](../monitoring/prometheus/index.md) and
The following IP will be used as an example:
-- `10.6.0.121`: Prometheus
+- `10.6.0.151`: Prometheus
To configure the Monitoring node:
diff --git a/doc/administration/reference_architectures/25k_users.md b/doc/administration/reference_architectures/25k_users.md
index d02f7ea66ac..188d298c58b 100644
--- a/doc/administration/reference_architectures/25k_users.md
+++ b/doc/administration/reference_architectures/25k_users.md
@@ -12,100 +12,110 @@ full list of reference architectures, see
[Available reference architectures](index.md#available-reference-architectures).
> - **Supported users (approximate):** 25,000
-> - **High Availability:** Yes
+> - **High Availability:** Yes ([Praefect](#configure-praefect-postgresql) needs a third-party PostgreSQL solution for HA)
> - **Test requests per second (RPS) rates:** API: 500 RPS, Web: 50 RPS, Git (Pull): 50 RPS, Git (Push): 10 RPS
| Service | Nodes | Configuration | GCP | AWS | Azure |
|-----------------------------------------|-------------|-------------------------|-----------------|-------------|----------|
-| External load balancing node | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | `c5.xlarge` | F4s v2 |
-| Consul | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| PostgreSQL | 3 | 8 vCPU, 30 GB memory | n1-standard-8 | `m5.2xlarge` | D8s v3 |
-| PgBouncer | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| Internal load balancing node | 1 | 4 vCPU, 3.6GB memory | n1-highcpu-4 | `c5.large` | F2s v2 |
-| Redis - Cache | 3 | 4 vCPU, 15 GB memory | n1-standard-4 | `m5.xlarge` | D4s v3 |
-| Redis - Queues / Shared State | 3 | 4 vCPU, 15 GB memory | n1-standard-4 | `m5.xlarge` | D4s v3 |
-| Redis Sentinel - Cache | 3 | 1 vCPU, 1.7 GB memory | g1-small | `t3.small` | B1MS |
-| Redis Sentinel - Queues / Shared State | 3 | 1 vCPU, 1.7 GB memory | g1-small | `t3.small` | B1MS |
-| Gitaly | 2 (minimum) | 32 vCPU, 120 GB memory | n1-standard-32 | `m5.8xlarge` | D32s v3 |
-| Sidekiq | 4 | 4 vCPU, 15 GB memory | n1-standard-4 | `m5.xlarge` | D4s v3 |
-| GitLab Rails | 5 | 32 vCPU, 28.8 GB memory | n1-highcpu-32 | `c5.9xlarge` | F32s v2 |
-| Monitoring node | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | `c5.xlarge` | F4s v2 |
+| External load balancing node | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | c5.xlarge | F4s v2 |
+| Consul | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| PostgreSQL | 3 | 16 vCPU, 60 GB memory | n1-standard-16 | m5.4xlarge | D16s v3 |
+| PgBouncer | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Internal load balancing node | 1 | 4 vCPU, 3.6GB memory | n1-highcpu-4 | c5.large | F2s v2 |
+| Redis - Cache | 3 | 4 vCPU, 15 GB memory | n1-standard-4 | m5.xlarge | D4s v3 |
+| Redis - Queues / Shared State | 3 | 4 vCPU, 15 GB memory | n1-standard-4 | m5.xlarge | D4s v3 |
+| Redis Sentinel - Cache | 3 | 1 vCPU, 1.7 GB memory | g1-small | t3.small | B1MS |
+| Redis Sentinel - Queues / Shared State | 3 | 1 vCPU, 1.7 GB memory | g1-small | t3.small | B1MS |
+| Gitaly | 3 | 32 vCPU, 120 GB memory | n1-standard-32 | m5.8xlarge | D32s v3 |
+| Praefect | 3 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | c5.xlarge | F4s v2 |
+| Praefect PostgreSQL | 1+* | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Sidekiq | 4 | 4 vCPU, 15 GB memory | n1-standard-4 | m5.xlarge | D4s v3 |
+| GitLab Rails | 5 | 32 vCPU, 28.8 GB memory | n1-highcpu-32 | c5.9xlarge | F32s v2 |
+| Monitoring node | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | c5.xlarge | F4s v2 |
| Object storage | n/a | n/a | n/a | n/a | n/a |
-| NFS server | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | `c5.xlarge` | F4s v2 |
-
-```mermaid
-stateDiagram-v2
- [*] --> LoadBalancer
- LoadBalancer --> ApplicationServer
-
- ApplicationServer --> BackgroundJobs
- ApplicationServer --> Gitaly
- ApplicationServer --> Redis_Cache
- ApplicationServer --> Redis_Queues
- ApplicationServer --> PgBouncer
- PgBouncer --> Database
- ApplicationServer --> ObjectStorage
- BackgroundJobs --> ObjectStorage
-
- ApplicationMonitoring -->ApplicationServer
- ApplicationMonitoring -->PgBouncer
- ApplicationMonitoring -->Database
- ApplicationMonitoring -->BackgroundJobs
-
- ApplicationServer --> Consul
-
- Consul --> Database
- Consul --> PgBouncer
- Redis_Cache --> Consul
- Redis_Queues --> Consul
- BackgroundJobs --> Consul
-
- state Consul {
- "Consul_1..3"
- }
-
- state Database {
- "PG_Primary_Node"
- "PG_Secondary_Node_1..2"
- }
-
- state Redis_Cache {
- "R_Cache_Primary_Node"
- "R_Cache_Replica_Node_1..2"
- "R_Cache_Sentinel_1..3"
- }
-
- state Redis_Queues {
- "R_Queues_Primary_Node"
- "R_Queues_Replica_Node_1..2"
- "R_Queues_Sentinel_1..3"
- }
-
- state Gitaly {
- "Gitaly_1..2"
- }
-
- state BackgroundJobs {
- "Sidekiq_1..4"
- }
-
- state ApplicationServer {
- "GitLab_Rails_1..5"
- }
-
- state LoadBalancer {
- "LoadBalancer_1"
- }
-
- state ApplicationMonitoring {
- "Prometheus"
- "Grafana"
- }
-
- state PgBouncer {
- "Internal_Load_Balancer"
- "PgBouncer_1..3"
- }
+| NFS server | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | c5.xlarge | F4s v2 |
+
+```plantuml
+@startuml 25k
+card "**External Load Balancer**" as elb #6a9be7
+card "**Internal Load Balancer**" as ilb #9370DB
+
+together {
+ collections "**GitLab Rails** x5" as gitlab #32CD32
+ collections "**Sidekiq** x4" as sidekiq #ff8dd1
+}
+
+together {
+ card "**Prometheus + Grafana**" as monitor #7FFFD4
+ collections "**Consul** x3" as consul #e76a9b
+}
+
+card "Gitaly Cluster" as gitaly_cluster {
+ collections "**Praefect** x3" as praefect #FF8C00
+ collections "**Gitaly** x3" as gitaly #FF8C00
+ card "**Praefect PostgreSQL***\n//Non fault-tolerant//" as praefect_postgres #FF8C00
+
+ praefect -[#FF8C00]-> gitaly
+ praefect -[#FF8C00]> praefect_postgres
+}
+
+card "Database" as database {
+ collections "**PGBouncer** x3" as pgbouncer #4EA7FF
+ card "**PostgreSQL** (Primary)" as postgres_primary #4EA7FF
+ collections "**PostgreSQL** (Secondary) x2" as postgres_secondary #4EA7FF
+
+ pgbouncer -[#4EA7FF]-> postgres_primary
+ postgres_primary .[#4EA7FF]> postgres_secondary
+}
+
+card "redis" as redis {
+ collections "**Redis Persistent** x3" as redis_persistent #FF6347
+ collections "**Redis Cache** x3" as redis_cache #FF6347
+ collections "**Redis Persistent Sentinel** x3" as redis_persistent_sentinel #FF6347
+ collections "**Redis Cache Sentinel** x3"as redis_cache_sentinel #FF6347
+
+ redis_persistent <.[#FF6347]- redis_persistent_sentinel
+ redis_cache <.[#FF6347]- redis_cache_sentinel
+}
+
+cloud "**Object Storage**" as object_storage #white
+
+elb -[#6a9be7]-> gitlab
+elb -[#6a9be7]--> monitor
+
+gitlab -[#32CD32]> sidekiq
+gitlab -[#32CD32]--> ilb
+gitlab -[#32CD32]-> object_storage
+gitlab -[#32CD32]---> redis
+gitlab -[hidden]-> monitor
+gitlab -[hidden]-> consul
+
+sidekiq -[#ff8dd1]--> ilb
+sidekiq -[#ff8dd1]-> object_storage
+sidekiq -[#ff8dd1]---> redis
+sidekiq -[hidden]-> monitor
+sidekiq -[hidden]-> consul
+
+ilb -[#9370DB]-> gitaly_cluster
+ilb -[#9370DB]-> database
+
+consul .[#e76a9b]u-> gitlab
+consul .[#e76a9b]u-> sidekiq
+consul .[#e76a9b]> monitor
+consul .[#e76a9b]-> database
+consul .[#e76a9b]-> gitaly_cluster
+consul .[#e76a9b,norank]--> redis
+
+monitor .[#7FFFD4]u-> gitlab
+monitor .[#7FFFD4]u-> sidekiq
+monitor .[#7FFFD4]> consul
+monitor .[#7FFFD4]-> database
+monitor .[#7FFFD4]-> gitaly_cluster
+monitor .[#7FFFD4,norank]--> redis
+monitor .[#7FFFD4]> ilb
+monitor .[#7FFFD4,norank]u--> elb
+
+@enduml
```
The Google Cloud Platform (GCP) architectures were built and tested using the
@@ -120,19 +130,25 @@ uploads, or artifacts), using an [object storage service](#configure-the-object-
is recommended instead of using NFS. Using an object storage service also
doesn't require you to provision and maintain a node.
+It's also worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
+that to achieve full High Availability a third party PostgreSQL database solution will be required.
+We hope to offer a built in solutions for these restrictions in the future but in the meantime a non HA PostgreSQL server
+can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) & [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398)
+
## Setup components
To set up GitLab and its components to accommodate up to 25,000 users:
-1. [Configure the external load balancing node](#configure-the-external-load-balancer)
+1. [Configure the external load balancer](#configure-the-external-load-balancer)
to handle the load balancing of the GitLab application services nodes.
+1. [Configure the internal load balancer](#configure-the-internal-load-balancer).
+ to handle the load balancing of GitLab application internal connections.
1. [Configure Consul](#configure-consul).
1. [Configure PostgreSQL](#configure-postgresql), the database for GitLab.
1. [Configure PgBouncer](#configure-pgbouncer).
-1. [Configure the internal load balancing node](#configure-the-internal-load-balancer).
1. [Configure Redis](#configure-redis).
-1. [Configure Gitaly](#configure-gitaly),
- which provides access to the Git repositories.
+1. [Configure Gitaly Cluster](#configure-gitaly-cluster),
+ provides access to the Git repositories.
1. [Configure Sidekiq](#configure-sidekiq).
1. [Configure the main GitLab Rails application](#configure-gitlab-rails)
to run Puma/Unicorn, Workhorse, GitLab Shell, and to serve all frontend
@@ -178,6 +194,11 @@ The following list includes descriptions of each server and its assigned IP:
- `10.6.0.83`: Sentinel - Queues 3
- `10.6.0.91`: Gitaly 1
- `10.6.0.92`: Gitaly 2
+- `10.6.0.93`: Gitaly 3
+- `10.6.0.131`: Praefect 1
+- `10.6.0.132`: Praefect 2
+- `10.6.0.133`: Praefect 3
+- `10.6.0.141`: Praefect PostgreSQL 1 (non HA)
- `10.6.0.101`: Sidekiq 1
- `10.6.0.102`: Sidekiq 2
- `10.6.0.103`: Sidekiq 3
@@ -185,7 +206,9 @@ The following list includes descriptions of each server and its assigned IP:
- `10.6.0.111`: GitLab application 1
- `10.6.0.112`: GitLab application 2
- `10.6.0.113`: GitLab application 3
-- `10.6.0.121`: Prometheus
+- `10.6.0.114`: GitLab application 4
+- `10.6.0.115`: GitLab application 5
+- `10.6.0.151`: Prometheus
## Configure the external load balancer
@@ -308,6 +331,71 @@ Configure DNS for an alternate SSH hostname such as `altssh.gitlab.example.com`.
</a>
</div>
+## Configure the internal load balancer
+
+The Internal Load Balancer is used to balance any internal connections the GitLab environment requires
+such as connections to [PgBouncer](#configure-pgbouncer) and [Praefect](#configure-praefect) (Gitaly Cluster).
+
+Note that it's a separate node from the External Load Balancer and shouldn't have any access externally.
+
+The following IP will be used as an example:
+
+- `10.6.0.40`: Internal Load Balancer
+
+Here's how you could do it with [HAProxy](https://www.haproxy.org/):
+
+```plaintext
+global
+ log /dev/log local0
+ log localhost local1 notice
+ log stdout format raw local0
+
+defaults
+ log global
+ default-server inter 10s fall 3 rise 2
+ balance leastconn
+
+frontend internal-pgbouncer-tcp-in
+ bind *:6432
+ mode tcp
+ option tcplog
+
+ default_backend pgbouncer
+
+frontend internal-praefect-tcp-in
+ bind *:2305
+ mode tcp
+ option tcplog
+ option clitcpka
+
+ default_backend praefect
+
+backend pgbouncer
+ mode tcp
+ option tcp-check
+
+ server pgbouncer1 10.6.0.21:6432 check
+ server pgbouncer2 10.6.0.22:6432 check
+ server pgbouncer3 10.6.0.23:6432 check
+
+backend praefect
+ mode tcp
+ option tcp-check
+ option srvtcpka
+
+ server praefect1 10.6.0.131:2305 check
+ server praefect2 10.6.0.132:2305 check
+ server praefect3 10.6.0.133:2305 check
+```
+
+Refer to your preferred Load Balancer's documentation for further guidance.
+
+<div align="right">
+ <a type="button" class="btn btn-default" href="#setup-components">
+ Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
+ </a>
+</div>
+
## Configure Consul
The following IPs will be used as an example:
@@ -662,52 +750,6 @@ The following IPs will be used as an example:
</a>
</div>
-### Configure the internal load balancer
-
-If you're running more than one PgBouncer node as recommended, then at this time you'll need to set
-up a TCP internal load balancer to serve each correctly.
-
-The following IP will be used as an example:
-
-- `10.6.0.40`: Internal Load Balancer
-
-Here's how you could do it with [HAProxy](https://www.haproxy.org/):
-
-```plaintext
-global
- log /dev/log local0
- log localhost local1 notice
- log stdout format raw local0
-
-defaults
- log global
- default-server inter 10s fall 3 rise 2
- balance leastconn
-
-frontend internal-pgbouncer-tcp-in
- bind *:6432
- mode tcp
- option tcplog
-
- default_backend pgbouncer
-
-backend pgbouncer
- mode tcp
- option tcp-check
-
- server pgbouncer1 10.6.0.21:6432 check
- server pgbouncer2 10.6.0.22:6432 check
- server pgbouncer3 10.6.0.23:6432 check
-```
-
-Refer to your preferred Load Balancer's documentation for further guidance.
-
-<div align="right">
- <a type="button" class="btn btn-default" href="#setup-components">
- Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
- </a>
-</div>
-
## Configure Redis
Using [Redis](https://redis.io/) in scalable environment is possible using a **Primary** x **Replica**
@@ -1302,19 +1344,283 @@ To configure the Sentinel Queues server:
</a>
</div>
-## Configure Gitaly
+## Configure Gitaly Cluster
-NOTE:
-[Gitaly Cluster](../gitaly/praefect.md) support
-for the Reference Architectures is being
-worked on as a [collaborative effort](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/1) between the Quality Engineering and Gitaly teams. When this component has been verified
-some Architecture specs will likely change as a result to support the new
-and improved designed.
+[Gitaly Cluster](../gitaly/praefect.md) is a GitLab provided and recommended fault tolerant solution for storing Git repositories.
+In this configuration, every Git repository is stored on every Gitaly node in the cluster, with one being designated the primary, and failover occurs automatically if the primary node goes down.
+
+The recommended cluster setup includes the following components:
+
+- 3 Gitaly nodes: Replicated storage of Git repositories.
+- 3 Praefect nodes: Router and transaction manager for Gitaly Cluster.
+- 1 Praefect PostgreSQL node: Database server for Praefect. A third-party solution
+ is required for Praefect database connections to be made highly available.
+- 1 load balancer: A load balancer is required for Praefect. The
+ [internal load balancer](#configure-the-internal-load-balancer) will be used.
+
+This section will detail how to configure the recommended standard setup in order.
+For more advanced setups refer to the [standalone Gitaly Cluster documentation](../gitaly/praefect.md).
+
+### Configure Praefect PostgreSQL
+
+Praefect, the routing and transaction manager for Gitaly Cluster, requires its own database server to store data on Gitaly Cluster status.
+
+If you want to have a highly available setup, Praefect requires a third-party PostgreSQL database.
+A built-in solution is being [worked on](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919).
+
+#### Praefect non-HA PostgreSQL standalone using Omnibus GitLab
+
+The following IPs will be used as an example:
+
+- `10.6.0.141`: Praefect PostgreSQL
+
+First, make sure to [install](https://about.gitlab.com/install/)
+the Linux GitLab package in the Praefect PostgreSQL node. Following the steps,
+install the necessary dependencies from step 1, and add the
+GitLab package repository from step 2. When installing GitLab
+in the second step, do not supply the `EXTERNAL_URL` value.
+
+1. SSH in to the Praefect PostgreSQL node.
+1. Create a strong password to be used for the Praefect PostgreSQL user. Take note of this password as `<praefect_postgresql_password>`.
+1. Generate the password hash for the Praefect PostgreSQL username/password pair. This assumes you will use the default
+ username of `praefect` (recommended). The command will request the password `<praefect_postgresql_password>`
+ and confirmation. Use the value that is output by this command in the next
+ step as the value of `<praefect_postgresql_password_hash>`:
+
+ ```shell
+ sudo gitlab-ctl pg-password-md5 praefect
+ ```
+
+1. Edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
+
+ ```ruby
+ # Disable all components except PostgreSQL and Consul
+ roles ['postgres_role']
+ repmgr['enable'] = false
+ patroni['enable'] = false
+
+ # PostgreSQL configuration
+ postgresql['listen_address'] = '0.0.0.0'
+ postgresql['max_connections'] = 200
+
+ gitlab_rails['auto_migrate'] = false
-[Gitaly](../gitaly/index.md) server node requirements are dependent on data,
-specifically the number of projects and those projects' sizes. It's recommended
-that a Gitaly server node stores no more than 5 TB of data. Depending on your
-repository storage requirements, you may require additional Gitaly server nodes.
+ # Configure the Consul agent
+ consul['enable'] = true
+ ## Enable service discovery for Prometheus
+ consul['monitoring_service_discovery'] = true
+
+ # START user configuration
+ # Please set the real values as explained in Required Information section
+ #
+ # Replace PRAEFECT_POSTGRESQL_PASSWORD_HASH with a generated md5 value
+ postgresql['sql_user_password'] = "<praefect_postgresql_password_hash>"
+
+ # Replace XXX.XXX.XXX.XXX/YY with Network Address
+ postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24)
+
+ # Set the network addresses that the exporters will listen on for monitoring
+ node_exporter['listen_address'] = '0.0.0.0:9100'
+ postgres_exporter['listen_address'] = '0.0.0.0:9187'
+
+ ## The IPs of the Consul server nodes
+ ## You can also use FQDNs and intermix them with IPs
+ consul['configuration'] = {
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
+ }
+ #
+ # END user configuration
+ ```
+
+1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
+1. Follow the [post configuration](#praefect-postgresql-post-configuration).
+
+<div align="right">
+ <a type="button" class="btn btn-default" href="#setup-components">
+ Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
+ </a>
+</div>
+
+#### Praefect HA PostgreSQL third-party solution
+
+[As noted](#configure-praefect-postgresql), a third-party PostgreSQL solution for
+Praefect's database is recommended if aiming for full High Availability.
+
+There are many third-party solutions for PostgreSQL HA. The solution selected must have the following to work with Praefect:
+
+- A static IP for all connections that doesn't change on failover.
+- [`LISTEN`](https://www.postgresql.org/docs/12/sql-listen.html) SQL functionality must be supported.
+
+Examples of the above could include [Google's Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) or [Amazon RDS](https://aws.amazon.com/rds/).
+
+Once the database is set up, follow the [post configuration](#praefect-postgresql-post-configuration).
+
+#### Praefect PostgreSQL post-configuration
+
+After the Praefect PostgreSQL server has been set up, you'll then need to configure the user and database for Praefect to use.
+
+We recommend the user be named `praefect` and the database `praefect_production`, and these can be configured as standard in PostgreSQL.
+The password for the user is the same as the one you configured earlier as `<praefect_postgresql_password>`.
+
+This is how this would work with a Omnibus GitLab PostgreSQL setup:
+
+1. SSH in to the Praefect PostgreSQL node.
+1. Connect to the PostgreSQL server with administrative access.
+ The `gitlab-psql` user should be used here for this as it's added by default in Omnibus.
+ The database `template1` is used because it is created by default on all PostgreSQL servers.
+
+ ```shell
+ /opt/gitlab/embedded/bin/psql -U gitlab-psql -d template1 -h POSTGRESQL_SERVER_ADDRESS
+ ```
+
+1. Create the new user `praefect`, replacing `<praefect_postgresql_password>`:
+
+ ```shell
+ CREATE ROLE praefect WITH LOGIN CREATEDB PASSWORD <praefect_postgresql_password>;
+ ```
+
+1. Reconnect to the PostgreSQL server, this time as the `praefect` user:
+
+ ```shell
+ /opt/gitlab/embedded/bin/psql -U praefect -d template1 -h POSTGRESQL_SERVER_ADDRESS
+ ```
+
+1. Create a new database `praefect_production`:
+
+ ```shell
+ CREATE DATABASE praefect_production WITH ENCODING=UTF8;
+ ```
+
+<div align="right">
+ <a type="button" class="btn btn-default" href="#setup-components">
+ Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
+ </a>
+</div>
+
+### Configure Praefect
+
+Praefect is the router and transaction manager for Gitaly Cluster and all connections to Gitaly go through
+it. This section details how to configure it.
+
+Praefect requires several secret tokens to secure communications across the Cluster:
+
+- `<praefect_external_token>`: Used for repositories hosted on your Gitaly cluster and can only be accessed by Gitaly clients that carry this token.
+- `<praefect_internal_token>`: Used for replication traffic inside your Gitaly cluster. This is distinct from `praefect_external_token` because Gitaly clients must not be able to access internal nodes of the Praefect cluster directly; that could lead to data loss.
+- `<praefect_postgresql_password>`: The Praefect PostgreSQL password defined in the previous section is also required as part of this setup.
+
+Gitaly Cluster nodes are configured in Praefect via a `virtual storage`. Each storage contains
+the details of each Gitaly node that makes up the cluster. Each storage is also given a name
+and this name is used in several areas of the config. In this guide, the name of the storage will be
+`default`. Also, this guide is geared towards new installs, if upgrading an existing environment
+to use Gitaly Cluster, you may need to use a different name.
+Refer to the [Praefect documentation](../gitaly/praefect.md#praefect) for more info.
+
+The following IPs will be used as an example:
+
+- `10.6.0.131`: Praefect 1
+- `10.6.0.132`: Praefect 2
+- `10.6.0.133`: Praefect 3
+
+To configure the Praefect nodes, on each one:
+
+1. SSH in to the Praefect server.
+1. [Download and install](https://about.gitlab.com/install/) the Omnibus GitLab
+ package of your choice. Be sure to follow _only_ installation steps 1 and 2
+ on the page.
+1. Edit the `/etc/gitlab/gitlab.rb` file to configure Praefect:
+
+ ```ruby
+ # Avoid running unnecessary services on the Gitaly server
+ postgresql['enable'] = false
+ redis['enable'] = false
+ nginx['enable'] = false
+ puma['enable'] = false
+ unicorn['enable'] = false
+ sidekiq['enable'] = false
+ gitlab_workhorse['enable'] = false
+ grafana['enable'] = false
+
+ # If you run a separate monitoring node you can disable these services
+ alertmanager['enable'] = false
+ prometheus['enable'] = false
+
+ # Praefect Configuration
+ praefect['enable'] = true
+ praefect['listen_addr'] = '0.0.0.0:2305'
+
+ gitlab_rails['rake_cache_clear'] = false
+ gitlab_rails['auto_migrate'] = false
+
+ # Configure the Consul agent
+ consul['enable'] = true
+ ## Enable service discovery for Prometheus
+ consul['monitoring_service_discovery'] = true
+
+ # START user configuration
+ # Please set the real values as explained in Required Information section
+ #
+
+ # Praefect External Token
+ # This is needed by clients outside the cluster (like GitLab Shell) to communicate with the Praefect cluster
+ praefect['auth_token'] = '<praefect_external_token>'
+
+ # Praefect Database Settings
+ praefect['database_host'] = '10.6.0.141'
+ praefect['database_port'] = 5432
+ # `no_proxy` settings must always be a direct connection for caching
+ praefect['database_host_no_proxy'] = '10.6.0.141'
+ praefect['database_port_no_proxy'] = 5432
+ praefect['database_dbname'] = 'praefect_production'
+ praefect['database_user'] = 'praefect'
+ praefect['database_password'] = '<praefect_postgresql_password>'
+
+ # Praefect Virtual Storage config
+ # Name of storage hash must match storage name in git_data_dirs on GitLab
+ # server ('praefect') and in git_data_dirs on Gitaly nodes ('gitaly-1')
+ praefect['virtual_storages'] = {
+ 'default' => {
+ 'gitaly-1' => {
+ 'address' => 'tcp://10.6.0.91:8075',
+ 'token' => '<praefect_internal_token>',
+ 'primary' => true
+ },
+ 'gitaly-2' => {
+ 'address' => 'tcp://10.6.0.92:8075',
+ 'token' => '<praefect_internal_token>'
+ },
+ 'gitaly-3' => {
+ 'address' => 'tcp://10.6.0.93:8075',
+ 'token' => '<praefect_internal_token>'
+ },
+ }
+ }
+
+ # Set the network addresses that the exporters will listen on for monitoring
+ node_exporter['listen_address'] = '0.0.0.0:9100'
+ praefect['prometheus_listen_addr'] = '0.0.0.0:9652'
+
+ ## The IPs of the Consul server nodes
+ ## You can also use FQDNs and intermix them with IPs
+ consul['configuration'] = {
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
+ }
+ #
+ # END user configuration
+ ```
+
+ 1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and
+ then replace the file of the same name on this server. If that file isn't on
+ this server, add the file from your Consul server to this server.
+
+ 1. Save the file, and then [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
+
+### Configure Gitaly
+
+The [Gitaly](../gitaly/index.md) server nodes that make up the cluster have
+requirements that are dependent on data, specifically the number of projects
+and those projects' sizes. It's recommended that a Gitaly Cluster stores
+no more than 5 TB of data on each node. Depending on your
+repository storage requirements, you may require additional Gitaly Clusters.
Due to Gitaly having notable input and output requirements, we strongly
recommend that all Gitaly nodes use solid-state drives (SSDs). These SSDs
@@ -1325,36 +1631,21 @@ adjusted to greater or lesser values depending on the scale of your
environment's workload. If you're running the environment on a Cloud provider,
refer to their documentation about how to configure IOPS correctly.
-Be sure to note the following items:
+Gitaly servers must not be exposed to the public internet, as Gitaly's network
+traffic is unencrypted by default. The use of a firewall is highly recommended
+to restrict access to the Gitaly server. Another option is to
+[use TLS](#gitaly-cluster-tls-support).
-- The GitLab Rails application shards repositories into
- [repository storage paths](../repository_storage_paths.md).
-- A Gitaly server can host one or more storage paths.
-- A GitLab server can use one or more Gitaly server nodes.
-- Gitaly addresses must be specified to be correctly resolvable for all Gitaly
- clients.
-- Gitaly servers must not be exposed to the public internet, as Gitaly's network
- traffic is unencrypted by default. The use of a firewall is highly recommended
- to restrict access to the Gitaly server. Another option is to
- [use TLS](#gitaly-tls-support).
+For configuring Gitaly you should note the following:
-NOTE:
-The token referred to throughout the Gitaly documentation is an arbitrary
-password selected by the administrator. This token is unrelated to tokens
-created for the GitLab API or other similar web API tokens.
-
-This section describes how to configure two Gitaly servers, with the following
-IPs and domain names:
-
-- `10.6.0.91`: Gitaly 1 (`gitaly1.internal`)
-- `10.6.0.92`: Gitaly 2 (`gitaly2.internal`)
+- `git_data_dirs` should be configured to reflect the storage path for the specific Gitaly node
+- `auth_token` should be the same as `praefect_internal_token`
-Assumptions about your servers include having the secret token be `gitalysecret`,
-and that your GitLab installation has three repository storages:
+The following IPs will be used as an example:
-- `default` on Gitaly 1
-- `storage1` on Gitaly 1
-- `storage2` on Gitaly 2
+- `10.6.0.91`: Gitaly 1
+- `10.6.0.92`: Gitaly 2
+- `10.6.0.93`: Gitaly 3
On each node:
@@ -1364,21 +1655,9 @@ On each node:
1. Edit the Gitaly server node's `/etc/gitlab/gitlab.rb` file to configure
storage paths, enable the network listener, and to configure the token:
- <!--
- updates to following example must also be made at
- https://gitlab.com/gitlab-org/charts/gitlab/blob/master/doc/advanced/external-gitaly/external-omnibus-gitaly.md#configure-omnibus-gitlab
- -->
-
```ruby
# /etc/gitlab/gitlab.rb
- # Gitaly and GitLab use two shared secrets for authentication, one to authenticate gRPC requests
- # to Gitaly, and a second for authentication callbacks from GitLab-Shell to the GitLab internal API.
- # The following two values must be the same as their respective values
- # of the GitLab Rails application setup
- gitaly['auth_token'] = 'gitalysecret'
- gitlab_shell['secret_token'] = 'shellsecret'
-
# Avoid running unnecessary services on the Gitaly server
postgresql['enable'] = false
redis['enable'] = false
@@ -1407,36 +1686,42 @@ On each node:
# firewalls to restrict access to this address/port.
# Comment out following line if you only want to support TLS connections
gitaly['listen_addr'] = "0.0.0.0:8075"
+
+ # Gitaly Auth Token
+ # Should be the same as praefect_internal_token
+ gitaly['auth_token'] = '<praefect_internal_token>'
```
1. Append the following to `/etc/gitlab/gitlab.rb` for each respective server:
- - On `gitaly1.internal`:
+ - On Gitaly node 1:
```ruby
git_data_dirs({
- 'default' => {
- 'path' => '/var/opt/gitlab/git-data'
- },
- 'storage1' => {
- 'path' => '/mnt/gitlab/git-data'
- },
+ "gitaly-1" => {
+ "path" => "/var/opt/gitlab/git-data"
+ }
})
```
- - On `gitaly2.internal`:
+ - On Gitaly node 2:
```ruby
git_data_dirs({
- 'storage2' => {
- 'path' => '/mnt/gitlab/git-data'
- },
+ "gitaly-2" => {
+ "path" => "/var/opt/gitlab/git-data"
+ }
})
```
- <!--
- updates to following example must also be made at
- https://gitlab.com/gitlab-org/charts/gitlab/blob/master/doc/advanced/external-gitaly/external-omnibus-gitaly.md#configure-omnibus-gitlab
- -->
+ - On Gitaly node 3:
+
+ ```ruby
+ git_data_dirs({
+ "gitaly-3" => {
+ "path" => "/var/opt/gitlab/git-data"
+ }
+ })
+ ```
1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and
then replace the file of the same name on this server. If that file isn't on
@@ -1444,34 +1729,44 @@ On each node:
1. Save the file, and then [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
-### Gitaly TLS support
+### Gitaly Cluster TLS support
-Gitaly supports TLS encryption. To be able to communicate
-with a Gitaly instance that listens for secure connections you will need to use `tls://` URL
-scheme in the `gitaly_address` of the corresponding storage entry in the GitLab configuration.
+Praefect supports TLS encryption. To communicate with a Praefect instance that listens
+for secure connections, you must:
-You will need to bring your own certificates as this isn't provided automatically.
-The certificate, or its certificate authority, must be installed on all Gitaly
-nodes (including the Gitaly node using the certificate) and on all client nodes
-that communicate with it following the procedure described in
-[GitLab custom certificate configuration](https://docs.gitlab.com/omnibus/settings/ssl.html#install-custom-public-certificates).
+- Use a `tls://` URL scheme in the `gitaly_address` of the corresponding storage entry
+ in the GitLab configuration.
+- Bring your own certificates because this isn't provided automatically. The certificate
+ corresponding to each Praefect server must be installed on that Praefect server.
-NOTE:
-The self-signed certificate must specify the address you use to access the
-Gitaly server. If you are addressing the Gitaly server by a hostname, you can
-either use the Common Name field for this, or add it as a Subject Alternative
-Name. If you are addressing the Gitaly server by its IP address, you must add it
-as a Subject Alternative Name to the certificate.
-[gRPC does not support using an IP address as Common Name in a certificate](https://github.com/grpc/grpc/issues/2691).
+Additionally the certificate, or its certificate authority, must be installed on all Gitaly servers
+and on all Praefect clients that communicate with it following the procedure described in
+[GitLab custom certificate configuration](https://docs.gitlab.com/omnibus/settings/ssl.html#install-custom-public-certificates) (and repeated below).
-It's possible to configure Gitaly servers with both an unencrypted listening
-address (`listen_addr`) and an encrypted listening address (`tls_listen_addr`)
-at the same time. This allows you to do a gradual transition from unencrypted to
-encrypted traffic, if necessary.
+Note the following:
-To configure Gitaly with TLS:
+- The certificate must specify the address you use to access the Praefect server. If
+ addressing the Praefect server by:
-1. Create the `/etc/gitlab/ssl` directory and copy your key and certificate there:
+ - Hostname, you can either use the Common Name field for this, or add it as a Subject
+ Alternative Name.
+ - IP address, you must add it as a Subject Alternative Name to the certificate.
+
+- You can configure Praefect servers with both an unencrypted listening address
+ `listen_addr` and an encrypted listening address `tls_listen_addr` at the same time.
+ This allows you to do a gradual transition from unencrypted to encrypted traffic, if
+ necessary.
+
+- The Internal Load Balancer will also access to the certificates and need to be configured
+ to allow for TLS passthrough.
+ Refer to the load balancers documentation on how to configure this.
+
+To configure Praefect with TLS:
+
+1. Create certificates for Praefect servers.
+
+1. On the Praefect servers, create the `/etc/gitlab/ssl` directory and copy your key
+ and certificate there:
```shell
sudo mkdir -p /etc/gitlab/ssl
@@ -1480,27 +1775,34 @@ To configure Gitaly with TLS:
sudo chmod 644 key.pem cert.pem
```
-1. Copy the cert to `/etc/gitlab/trusted-certs` so Gitaly will trust the cert when
- calling into itself:
+1. Edit `/etc/gitlab/gitlab.rb` and add:
- ```shell
- sudo cp /etc/gitlab/ssl/cert.pem /etc/gitlab/trusted-certs/
+ ```ruby
+ praefect['tls_listen_addr'] = "0.0.0.0:3305"
+ praefect['certificate_path'] = "/etc/gitlab/ssl/cert.pem"
+ praefect['key_path'] = "/etc/gitlab/ssl/key.pem"
```
-1. Edit `/etc/gitlab/gitlab.rb` and add:
+1. Save the file and [reconfigure](../restart_gitlab.md#omnibus-gitlab-reconfigure).
- <!--
- updates to following example must also be made at
- https://gitlab.com/gitlab-org/charts/gitlab/blob/master/doc/advanced/external-gitaly/external-omnibus-gitaly.md#configure-omnibus-gitlab
- -->
+1. On the Praefect clients (including each Gitaly server), copy the certificates,
+ or their certificate authority, into `/etc/gitlab/trusted-certs`:
- ```ruby
- gitaly['tls_listen_addr'] = "0.0.0.0:9999"
- gitaly['certificate_path'] = "/etc/gitlab/ssl/cert.pem"
- gitaly['key_path'] = "/etc/gitlab/ssl/key.pem"
+ ```shell
+ sudo cp cert.pem /etc/gitlab/trusted-certs/
```
-1. Delete `gitaly['listen_addr']` to allow only encrypted connections.
+1. On the Praefect clients (except Gitaly servers), edit `git_data_dirs` in
+ `/etc/gitlab/gitlab.rb` as follows:
+
+ ```ruby
+ git_data_dirs({
+ "default" => {
+ "gitaly_address" => 'tls://LOAD_BALANCER_SERVER_ADDRESS:2305',
+ "gitaly_token" => 'PRAEFECT_EXTERNAL_TOKEN'
+ }
+ })
+ ```
1. Save the file and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
@@ -1587,12 +1889,15 @@ To configure the Sidekiq nodes, on each one:
### Gitaly ###
#######################################
+ # git_data_dirs get configured for the Praefect virtual storage
+ # Address is Internal Load Balancer for Praefect
+ # Token is praefect_external_token
git_data_dirs({
- 'default' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage1' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage2' => { 'gitaly_address' => 'tcp://gitaly2.internal:8075' },
+ "default" => {
+ "gitaly_address" => "tcp://10.6.0.40:2305", # internal load balancer IP
+ "gitaly_token" => '<praefect_external_token>'
+ }
})
- gitlab_rails['gitaly_token'] = 'YOUR_TOKEN'
#######################################
### Postgres ###
@@ -1624,7 +1929,7 @@ To configure the Sidekiq nodes, on each one:
node_exporter['listen_address'] = '0.0.0.0:9100'
# Rails Status for prometheus
- gitlab_rails['monitoring_whitelist'] = ['10.6.0.121/32', '127.0.0.0/8']
+ gitlab_rails['monitoring_whitelist'] = ['10.6.0.151/32', '127.0.0.0/8']
#############################
### Object storage ###
@@ -1671,6 +1976,8 @@ The following IPs will be used as an example:
- `10.6.0.111`: GitLab application 1
- `10.6.0.112`: GitLab application 2
- `10.6.0.113`: GitLab application 3
+- `10.6.0.114`: GitLab application 4
+- `10.6.0.115`: GitLab application 5
On each node perform the following:
@@ -1690,17 +1997,14 @@ On each node perform the following:
```ruby
external_url 'https://gitlab.example.com'
- # Gitaly and GitLab use two shared secrets for authentication, one to authenticate gRPC requests
- # to Gitaly, and a second for authentication callbacks from GitLab-Shell to the GitLab internal API.
- # The following two values must be the same as their respective values
- # of the Gitaly setup
- gitlab_rails['gitaly_token'] = 'gitalysecret'
- gitlab_shell['secret_token'] = 'shellsecret'
-
+ # git_data_dirs get configured for the Praefect virtual storage
+ # Address is Internal Load Balancer for Praefect
+ # Token is praefect_external_token
git_data_dirs({
- 'default' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage1' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage2' => { 'gitaly_address' => 'tcp://gitaly2.internal:8075' },
+ "default" => {
+ "gitaly_address" => "tcp://10.6.0.40:2305", # internal load balancer IP
+ "gitaly_token" => '<praefect_external_token>'
+ }
})
## Disable components that will not be on the GitLab application server
@@ -1755,8 +2059,8 @@ On each node perform the following:
# Add the monitoring node's IP address to the monitoring whitelist and allow it to
# scrape the NGINX metrics
- gitlab_rails['monitoring_whitelist'] = ['10.6.0.121/32', '127.0.0.0/8']
- nginx['status']['options']['allow'] = ['10.6.0.121/32', '127.0.0.0/8']
+ gitlab_rails['monitoring_whitelist'] = ['10.6.0.151/32', '127.0.0.0/8']
+ nginx['status']['options']['allow'] = ['10.6.0.151/32', '127.0.0.0/8']
#############################
### Object storage ###
@@ -1779,14 +2083,15 @@ On each node perform the following:
```
1. Save the file and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
-1. If you're using [Gitaly with TLS support](#gitaly-tls-support), make sure the
+1. If you're using [Gitaly with TLS support](#gitaly-cluster-tls-support), make sure the
`git_data_dirs` entry is configured with `tls` instead of `tcp`:
```ruby
git_data_dirs({
- 'default' => { 'gitaly_address' => 'tls://gitaly1.internal:9999' },
- 'storage1' => { 'gitaly_address' => 'tls://gitaly1.internal:9999' },
- 'storage2' => { 'gitaly_address' => 'tls://gitaly2.internal:9999' },
+ "default" => {
+ "gitaly_address" => "tls://10.6.0.40:2305", # internal load balancer IP
+ "gitaly_token" => '<praefect_external_token>'
+ }
})
```
@@ -1891,7 +2196,7 @@ running [Prometheus](../monitoring/prometheus/index.md) and
The following IP will be used as an example:
-- `10.6.0.121`: Prometheus
+- `10.6.0.151`: Prometheus
To configure the Monitoring node:
diff --git a/doc/administration/reference_architectures/3k_users.md b/doc/administration/reference_architectures/3k_users.md
index 593bfaf7282..43c93000ed6 100644
--- a/doc/administration/reference_architectures/3k_users.md
+++ b/doc/administration/reference_architectures/3k_users.md
@@ -20,78 +20,107 @@ For a full list of reference architectures, see
[Available reference architectures](index.md#available-reference-architectures).
> - **Supported users (approximate):** 3,000
-> - **High Availability:** Yes
+> - **High Availability:** Yes ([Praefect](#configure-praefect-postgresql) needs a third-party PostgreSQL solution for HA)
> - **Test requests per second (RPS) rates:** API: 60 RPS, Web: 6 RPS, Git (Pull): 6 RPS, Git (Push): 1 RPS
| Service | Nodes | Configuration | GCP | AWS | Azure |
|--------------------------------------------|-------------|-----------------------|----------------|-------------|---------|
-| External load balancing node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| Redis | 3 | 2 vCPU, 7.5 GB memory | n1-standard-2 | `m5.large` | D2s v3 |
-| Consul + Sentinel | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| PostgreSQL | 3 | 2 vCPU, 7.5 GB memory | n1-standard-2 | `m5.large` | D2s v3 |
-| PgBouncer | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| Internal load balancing node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| Gitaly | 2 (minimum) | 4 vCPU, 15 GB memory | n1-standard-4 | `m5.xlarge` | D4s v3 |
-| Sidekiq | 4 | 2 vCPU, 7.5 GB memory | n1-standard-2 | `m5.large` | D2s v3 |
-| GitLab Rails | 3 | 8 vCPU, 7.2 GB memory | n1-highcpu-8 | `c5.2xlarge` | F8s v2 |
-| Monitoring node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
+| External load balancing node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Redis | 3 | 2 vCPU, 7.5 GB memory | n1-standard-2 | m5.large | D2s v3 |
+| Consul + Sentinel | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| PostgreSQL | 3 | 2 vCPU, 7.5 GB memory | n1-standard-2 | m5.large | D2s v3 |
+| PgBouncer | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Internal load balancing node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Gitaly | 3 | 4 vCPU, 15 GB memory | n1-standard-4 | m5.xlarge | D4s v3 |
+| Praefect | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Praefect PostgreSQL | 1+* | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Sidekiq | 4 | 2 vCPU, 7.5 GB memory | n1-standard-2 | m5.large | D2s v3 |
+| GitLab Rails | 3 | 8 vCPU, 7.2 GB memory | n1-highcpu-8 | c5.2xlarge | F8s v2 |
+| Monitoring node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
| Object storage | n/a | n/a | n/a | n/a | n/a |
-| NFS server (optional, not recommended) | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | `c5.xlarge` | F4s v2 |
-
-```mermaid
-stateDiagram-v2
- [*] --> LoadBalancer
- LoadBalancer --> ApplicationServer
-
- ApplicationServer --> BackgroundJobs
- ApplicationServer --> Gitaly
- ApplicationServer --> Redis
- ApplicationServer --> PgBouncer
- PgBouncer --> Database
- ApplicationServer --> ObjectStorage
- BackgroundJobs --> ObjectStorage
-
- ApplicationMonitoring -->ApplicationServer
- ApplicationMonitoring -->Redis
- ApplicationMonitoring -->PgBouncer
- ApplicationMonitoring -->Database
- ApplicationMonitoring -->BackgroundJobs
-
- state Database {
- "PG_Primary_Node"
- "PG_Secondary_Node_1..2"
- }
- state Redis {
- "R_Primary_Node"
- "R_Replica_Node_1..2"
- "R_Consul/Sentinel_1..3"
- }
-
- state Gitaly {
- "Gitaly_1..2"
- }
-
- state BackgroundJobs {
- "Sidekiq_1..4"
- }
-
- state ApplicationServer {
- "GitLab_Rails_1..3"
- }
-
- state LoadBalancer {
- "LoadBalancer_1"
- }
-
- state ApplicationMonitoring {
- "Prometheus"
- "Grafana"
- }
-
- state PgBouncer {
- "Internal_Load_Balancer"
- "PgBouncer_1..3"
- }
+| NFS server (optional, not recommended) | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | c5.xlarge | F4s v2 |
+
+```plantuml
+@startuml 3k
+card "**External Load Balancer**" as elb #6a9be7
+card "**Internal Load Balancer**" as ilb #9370DB
+
+together {
+ collections "**GitLab Rails** x3" as gitlab #32CD32
+ collections "**Sidekiq** x4" as sidekiq #ff8dd1
+}
+
+together {
+ card "**Prometheus + Grafana**" as monitor #7FFFD4
+ collections "**Consul** x3" as consul #e76a9b
+}
+
+card "Gitaly Cluster" as gitaly_cluster {
+ collections "**Praefect** x3" as praefect #FF8C00
+ collections "**Gitaly** x3" as gitaly #FF8C00
+ card "**Praefect PostgreSQL***\n//Non fault-tolerant//" as praefect_postgres #FF8C00
+
+ praefect -[#FF8C00]-> gitaly
+ praefect -[#FF8C00]> praefect_postgres
+}
+
+card "Database" as database {
+ collections "**PGBouncer** x3" as pgbouncer #4EA7FF
+ card "**PostgreSQL** (Primary)" as postgres_primary #4EA7FF
+ collections "**PostgreSQL** (Secondary) x2" as postgres_secondary #4EA7FF
+
+ pgbouncer -[#4EA7FF]-> postgres_primary
+ postgres_primary .[#4EA7FF]> postgres_secondary
+}
+
+card "redis" as redis {
+ collections "**Redis Persistent** x3" as redis_persistent #FF6347
+ collections "**Redis Cache** x3" as redis_cache #FF6347
+ collections "**Redis Persistent Sentinel** x3" as redis_persistent_sentinel #FF6347
+ collections "**Redis Cache Sentinel** x3"as redis_cache_sentinel #FF6347
+
+ redis_persistent <.[#FF6347]- redis_persistent_sentinel
+ redis_cache <.[#FF6347]- redis_cache_sentinel
+}
+
+cloud "**Object Storage**" as object_storage #white
+
+elb -[#6a9be7]-> gitlab
+elb -[#6a9be7]--> monitor
+
+gitlab -[#32CD32]> sidekiq
+gitlab -[#32CD32]--> ilb
+gitlab -[#32CD32]-> object_storage
+gitlab -[#32CD32]---> redis
+gitlab -[hidden]-> monitor
+gitlab -[hidden]-> consul
+
+sidekiq -[#ff8dd1]--> ilb
+sidekiq -[#ff8dd1]-> object_storage
+sidekiq -[#ff8dd1]---> redis
+sidekiq -[hidden]-> monitor
+sidekiq -[hidden]-> consul
+
+ilb -[#9370DB]-> gitaly_cluster
+ilb -[#9370DB]-> database
+
+consul .[#e76a9b]u-> gitlab
+consul .[#e76a9b]u-> sidekiq
+consul .[#e76a9b]> monitor
+consul .[#e76a9b]-> database
+consul .[#e76a9b]-> gitaly_cluster
+consul .[#e76a9b,norank]--> redis
+
+monitor .[#7FFFD4]u-> gitlab
+monitor .[#7FFFD4]u-> sidekiq
+monitor .[#7FFFD4]> consul
+monitor .[#7FFFD4]-> database
+monitor .[#7FFFD4]-> gitaly_cluster
+monitor .[#7FFFD4,norank]--> redis
+monitor .[#7FFFD4]> ilb
+monitor .[#7FFFD4,norank]u--> elb
+
+@enduml
```
The Google Cloud Platform (GCP) architectures were built and tested using the
@@ -106,19 +135,25 @@ uploads, or artifacts), using an [object storage service](#configure-the-object-
is recommended instead of using NFS. Using an object storage service also
doesn't require you to provision and maintain a node.
+It's also worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
+that to achieve full High Availability a third party PostgreSQL database solution will be required.
+We hope to offer a built in solutions for these restrictions in the future but in the meantime a non HA PostgreSQL server
+can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) & [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398)
+
## Setup components
To set up GitLab and its components to accommodate up to 3,000 users:
-1. [Configure the external load balancing node](#configure-the-external-load-balancer)
+1. [Configure the external load balancer](#configure-the-external-load-balancer)
to handle the load balancing of the GitLab application services nodes.
+1. [Configure the internal load balancer](#configure-the-internal-load-balancer).
+ to handle the load balancing of GitLab application internal connections.
1. [Configure Redis](#configure-redis).
1. [Configure Consul and Sentinel](#configure-consul-and-sentinel).
1. [Configure PostgreSQL](#configure-postgresql), the database for GitLab.
1. [Configure PgBouncer](#configure-pgbouncer).
-1. [Configure the internal load balancing node](#configure-the-internal-load-balancer).
-1. [Configure Gitaly](#configure-gitaly),
- which provides access to the Git repositories.
+1. [Configure Gitaly Cluster](#configure-gitaly-cluster),
+ provides access to the Git repositories.
1. [Configure Sidekiq](#configure-sidekiq).
1. [Configure the main GitLab Rails application](#configure-gitlab-rails)
to run Puma/Unicorn, Workhorse, GitLab Shell, and to serve all frontend
@@ -155,6 +190,11 @@ The following list includes descriptions of each server and its assigned IP:
- `10.6.0.20`: Internal Load Balancer
- `10.6.0.51`: Gitaly 1
- `10.6.0.52`: Gitaly 2
+- `10.6.0.93`: Gitaly 3
+- `10.6.0.131`: Praefect 1
+- `10.6.0.132`: Praefect 2
+- `10.6.0.133`: Praefect 3
+- `10.6.0.141`: Praefect PostgreSQL 1 (non HA)
- `10.6.0.71`: Sidekiq 1
- `10.6.0.72`: Sidekiq 2
- `10.6.0.73`: Sidekiq 3
@@ -285,6 +325,71 @@ Configure DNS for an alternate SSH hostname such as `altssh.gitlab.example.com`.
</a>
</div>
+## Configure the internal load balancer
+
+The Internal Load Balancer is used to balance any internal connections the GitLab environment requires
+such as connections to [PgBouncer](#configure-pgbouncer) and [Praefect](#configure-praefect) (Gitaly Cluster).
+
+Note that it's a separate node from the External Load Balancer and shouldn't have any access externally.
+
+The following IP will be used as an example:
+
+- `10.6.0.40`: Internal Load Balancer
+
+Here's how you could do it with [HAProxy](https://www.haproxy.org/):
+
+```plaintext
+global
+ log /dev/log local0
+ log localhost local1 notice
+ log stdout format raw local0
+
+defaults
+ log global
+ default-server inter 10s fall 3 rise 2
+ balance leastconn
+
+frontend internal-pgbouncer-tcp-in
+ bind *:6432
+ mode tcp
+ option tcplog
+
+ default_backend pgbouncer
+
+frontend internal-praefect-tcp-in
+ bind *:2305
+ mode tcp
+ option tcplog
+ option clitcpka
+
+ default_backend praefect
+
+backend pgbouncer
+ mode tcp
+ option tcp-check
+
+ server pgbouncer1 10.6.0.21:6432 check
+ server pgbouncer2 10.6.0.22:6432 check
+ server pgbouncer3 10.6.0.23:6432 check
+
+backend praefect
+ mode tcp
+ option tcp-check
+ option srvtcpka
+
+ server praefect1 10.6.0.131:2305 check
+ server praefect2 10.6.0.132:2305 check
+ server praefect3 10.6.0.133:2305 check
+```
+
+Refer to your preferred Load Balancer's documentation for further guidance.
+
+<div align="right">
+ <a type="button" class="btn btn-default" href="#setup-components">
+ Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
+ </a>
+</div>
+
## Configure Redis
Using [Redis](https://redis.io/) in scalable environment is possible using a **Primary** x **Replica**
@@ -925,45 +1030,96 @@ The following IPs will be used as an example:
</a>
</div>
-### Configure the internal load balancer
+## Configure Gitaly Cluster
-If you're running more than one PgBouncer node as recommended, then at this time you'll need to set
-up a TCP internal load balancer to serve each correctly.
+[Gitaly Cluster](../gitaly/praefect.md) is a GitLab provided and recommended fault tolerant solution for storing Git repositories.
+In this configuration, every Git repository is stored on every Gitaly node in the cluster, with one being designated the primary, and failover occurs automatically if the primary node goes down.
-The following IP will be used as an example:
+The recommended cluster setup includes the following components:
-- `10.6.0.20`: Internal Load Balancer
+- 3 Gitaly nodes: Replicated storage of Git repositories.
+- 3 Praefect nodes: Router and transaction manager for Gitaly Cluster.
+- 1 Praefect PostgreSQL node: Database server for Praefect. A third-party solution
+ is required for Praefect database connections to be made highly available.
+- 1 load balancer: A load balancer is required for Praefect. The
+ [internal load balancer](#configure-the-internal-load-balancer) will be used.
-Here's how you could do it with [HAProxy](https://www.haproxy.org/):
+This section will detail how to configure the recommended standard setup in order.
+For more advanced setups refer to the [standalone Gitaly Cluster documentation](../gitaly/praefect.md).
-```plaintext
-global
- log /dev/log local0
- log localhost local1 notice
- log stdout format raw local0
+### Configure Praefect PostgreSQL
-defaults
- log global
- default-server inter 10s fall 3 rise 2
- balance leastconn
+Praefect, the routing and transaction manager for Gitaly Cluster, requires its own database server to store data on Gitaly Cluster status.
-frontend internal-pgbouncer-tcp-in
- bind *:6432
- mode tcp
- option tcplog
+If you want to have a highly available setup, Praefect requires a third-party PostgreSQL database.
+A built-in solution is being [worked on](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919).
- default_backend pgbouncer
+#### Praefect non-HA PostgreSQL standalone using Omnibus GitLab
-backend pgbouncer
- mode tcp
- option tcp-check
+The following IPs will be used as an example:
- server pgbouncer1 10.6.0.21:6432 check
- server pgbouncer2 10.6.0.22:6432 check
- server pgbouncer3 10.6.0.23:6432 check
-```
+- `10.6.0.141`: Praefect PostgreSQL
-Refer to your preferred Load Balancer's documentation for further guidance.
+First, make sure to [install](https://about.gitlab.com/install/)
+the Linux GitLab package in the Praefect PostgreSQL node. Following the steps,
+install the necessary dependencies from step 1, and add the
+GitLab package repository from step 2. When installing GitLab
+in the second step, do not supply the `EXTERNAL_URL` value.
+
+1. SSH in to the Praefect PostgreSQL node.
+1. Create a strong password to be used for the Praefect PostgreSQL user. Take note of this password as `<praefect_postgresql_password>`.
+1. Generate the password hash for the Praefect PostgreSQL username/password pair. This assumes you will use the default
+ username of `praefect` (recommended). The command will request the password `<praefect_postgresql_password>`
+ and confirmation. Use the value that is output by this command in the next
+ step as the value of `<praefect_postgresql_password_hash>`:
+
+ ```shell
+ sudo gitlab-ctl pg-password-md5 praefect
+ ```
+
+1. Edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
+
+ ```ruby
+ # Disable all components except PostgreSQL and Consul
+ roles ['postgres_role']
+ repmgr['enable'] = false
+ patroni['enable'] = false
+
+ # PostgreSQL configuration
+ postgresql['listen_address'] = '0.0.0.0'
+ postgresql['max_connections'] = 200
+
+ gitlab_rails['auto_migrate'] = false
+
+ # Configure the Consul agent
+ consul['enable'] = true
+ ## Enable service discovery for Prometheus
+ consul['monitoring_service_discovery'] = true
+
+ # START user configuration
+ # Please set the real values as explained in Required Information section
+ #
+ # Replace PRAEFECT_POSTGRESQL_PASSWORD_HASH with a generated md5 value
+ postgresql['sql_user_password'] = "<praefect_postgresql_password_hash>"
+
+ # Replace XXX.XXX.XXX.XXX/YY with Network Address
+ postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24)
+
+ # Set the network addresses that the exporters will listen on for monitoring
+ node_exporter['listen_address'] = '0.0.0.0:9100'
+ postgres_exporter['listen_address'] = '0.0.0.0:9187'
+
+ ## The IPs of the Consul server nodes
+ ## You can also use FQDNs and intermix them with IPs
+ consul['configuration'] = {
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
+ }
+ #
+ # END user configuration
+ ```
+
+1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
+1. Follow the [post configuration](#praefect-postgresql-post-configuration).
<div align="right">
<a type="button" class="btn btn-default" href="#setup-components">
@@ -971,19 +1127,186 @@ Refer to your preferred Load Balancer's documentation for further guidance.
</a>
</div>
-## Configure Gitaly
+#### Praefect HA PostgreSQL third-party solution
-NOTE:
-[Gitaly Cluster](../gitaly/praefect.md) support
-for the Reference Architectures is being
-worked on as a [collaborative effort](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/1) between the Quality Engineering and Gitaly teams. When this component has been verified
-some Architecture specs will likely change as a result to support the new
-and improved designed.
+[As noted](#configure-praefect-postgresql), a third-party PostgreSQL solution for
+Praefect's database is recommended if aiming for full High Availability.
+
+There are many third-party solutions for PostgreSQL HA. The solution selected must have the following to work with Praefect:
+
+- A static IP for all connections that doesn't change on failover.
+- [`LISTEN`](https://www.postgresql.org/docs/12/sql-listen.html) SQL functionality must be supported.
+
+Examples of the above could include [Google's Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) or [Amazon RDS](https://aws.amazon.com/rds/).
+
+Once the database is set up, follow the [post configuration](#praefect-postgresql-post-configuration).
+
+#### Praefect PostgreSQL post-configuration
+
+After the Praefect PostgreSQL server has been set up, you'll then need to configure the user and database for Praefect to use.
+
+We recommend the user be named `praefect` and the database `praefect_production`, and these can be configured as standard in PostgreSQL.
+The password for the user is the same as the one you configured earlier as `<praefect_postgresql_password>`.
+
+This is how this would work with a Omnibus GitLab PostgreSQL setup:
+
+1. SSH in to the Praefect PostgreSQL node.
+1. Connect to the PostgreSQL server with administrative access.
+ The `gitlab-psql` user should be used here for this as it's added by default in Omnibus.
+ The database `template1` is used because it is created by default on all PostgreSQL servers.
-[Gitaly](../gitaly/index.md) server node requirements are dependent on data,
-specifically the number of projects and those projects' sizes. It's recommended
-that a Gitaly server node stores no more than 5 TB of data. Depending on your
-repository storage requirements, you may require additional Gitaly server nodes.
+ ```shell
+ /opt/gitlab/embedded/bin/psql -U gitlab-psql -d template1 -h POSTGRESQL_SERVER_ADDRESS
+ ```
+
+1. Create the new user `praefect`, replacing `<praefect_postgresql_password>`:
+
+ ```shell
+ CREATE ROLE praefect WITH LOGIN CREATEDB PASSWORD <praefect_postgresql_password>;
+ ```
+
+1. Reconnect to the PostgreSQL server, this time as the `praefect` user:
+
+ ```shell
+ /opt/gitlab/embedded/bin/psql -U praefect -d template1 -h POSTGRESQL_SERVER_ADDRESS
+ ```
+
+1. Create a new database `praefect_production`:
+
+ ```shell
+ CREATE DATABASE praefect_production WITH ENCODING=UTF8;
+ ```
+
+<div align="right">
+ <a type="button" class="btn btn-default" href="#setup-components">
+ Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
+ </a>
+</div>
+
+### Configure Praefect
+
+Praefect is the router and transaction manager for Gitaly Cluster and all connections to Gitaly go through
+it. This section details how to configure it.
+
+Praefect requires several secret tokens to secure communications across the Cluster:
+
+- `<praefect_external_token>`: Used for repositories hosted on your Gitaly cluster and can only be accessed by Gitaly clients that carry this token.
+- `<praefect_internal_token>`: Used for replication traffic inside your Gitaly cluster. This is distinct from `praefect_external_token` because Gitaly clients must not be able to access internal nodes of the Praefect cluster directly; that could lead to data loss.
+- `<praefect_postgresql_password>`: The Praefect PostgreSQL password defined in the previous section is also required as part of this setup.
+
+Gitaly Cluster nodes are configured in Praefect via a `virtual storage`. Each storage contains
+the details of each Gitaly node that makes up the cluster. Each storage is also given a name
+and this name is used in several areas of the config. In this guide, the name of the storage will be
+`default`. Also, this guide is geared towards new installs, if upgrading an existing environment
+to use Gitaly Cluster, you may need to use a different name.
+Refer to the [Praefect documentation](../gitaly/praefect.md#praefect) for more info.
+
+The following IPs will be used as an example:
+
+- `10.6.0.131`: Praefect 1
+- `10.6.0.132`: Praefect 2
+- `10.6.0.133`: Praefect 3
+
+To configure the Praefect nodes, on each one:
+
+1. SSH in to the Praefect server.
+1. [Download and install](https://about.gitlab.com/install/) the Omnibus GitLab
+ package of your choice. Be sure to follow _only_ installation steps 1 and 2
+ on the page.
+1. Edit the `/etc/gitlab/gitlab.rb` file to configure Praefect:
+
+ ```ruby
+ # Avoid running unnecessary services on the Gitaly server
+ postgresql['enable'] = false
+ redis['enable'] = false
+ nginx['enable'] = false
+ puma['enable'] = false
+ unicorn['enable'] = false
+ sidekiq['enable'] = false
+ gitlab_workhorse['enable'] = false
+ grafana['enable'] = false
+
+ # If you run a separate monitoring node you can disable these services
+ alertmanager['enable'] = false
+ prometheus['enable'] = false
+
+ # Praefect Configuration
+ praefect['enable'] = true
+ praefect['listen_addr'] = '0.0.0.0:2305'
+
+ gitlab_rails['rake_cache_clear'] = false
+ gitlab_rails['auto_migrate'] = false
+
+ # Configure the Consul agent
+ consul['enable'] = true
+ ## Enable service discovery for Prometheus
+ consul['monitoring_service_discovery'] = true
+
+ # START user configuration
+ # Please set the real values as explained in Required Information section
+ #
+
+ # Praefect External Token
+ # This is needed by clients outside the cluster (like GitLab Shell) to communicate with the Praefect cluster
+ praefect['auth_token'] = '<praefect_external_token>'
+
+ # Praefect Database Settings
+ praefect['database_host'] = '10.6.0.141'
+ praefect['database_port'] = 5432
+ # `no_proxy` settings must always be a direct connection for caching
+ praefect['database_host_no_proxy'] = '10.6.0.141'
+ praefect['database_port_no_proxy'] = 5432
+ praefect['database_dbname'] = 'praefect_production'
+ praefect['database_user'] = 'praefect'
+ praefect['database_password'] = '<praefect_postgresql_password>'
+
+ # Praefect Virtual Storage config
+ # Name of storage hash must match storage name in git_data_dirs on GitLab
+ # server ('praefect') and in git_data_dirs on Gitaly nodes ('gitaly-1')
+ praefect['virtual_storages'] = {
+ 'default' => {
+ 'gitaly-1' => {
+ 'address' => 'tcp://10.6.0.91:8075',
+ 'token' => '<praefect_internal_token>',
+ 'primary' => true
+ },
+ 'gitaly-2' => {
+ 'address' => 'tcp://10.6.0.92:8075',
+ 'token' => '<praefect_internal_token>'
+ },
+ 'gitaly-3' => {
+ 'address' => 'tcp://10.6.0.93:8075',
+ 'token' => '<praefect_internal_token>'
+ },
+ }
+ }
+
+ # Set the network addresses that the exporters will listen on for monitoring
+ node_exporter['listen_address'] = '0.0.0.0:9100'
+ praefect['prometheus_listen_addr'] = '0.0.0.0:9652'
+
+ ## The IPs of the Consul server nodes
+ ## You can also use FQDNs and intermix them with IPs
+ consul['configuration'] = {
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
+ }
+ #
+ # END user configuration
+ ```
+
+ 1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and
+ then replace the file of the same name on this server. If that file isn't on
+ this server, add the file from your Consul server to this server.
+
+ 1. Save the file, and then [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
+
+### Configure Gitaly
+
+The [Gitaly](../gitaly/index.md) server nodes that make up the cluster have
+requirements that are dependent on data, specifically the number of projects
+and those projects' sizes. It's recommended that a Gitaly Cluster stores
+no more than 5 TB of data on each node. Depending on your
+repository storage requirements, you may require additional Gitaly Clusters.
Due to Gitaly having notable input and output requirements, we strongly
recommend that all Gitaly nodes use solid-state drives (SSDs). These SSDs
@@ -994,36 +1317,21 @@ adjusted to greater or lesser values depending on the scale of your
environment's workload. If you're running the environment on a Cloud provider,
refer to their documentation about how to configure IOPS correctly.
-Be sure to note the following items:
+Gitaly servers must not be exposed to the public internet, as Gitaly's network
+traffic is unencrypted by default. The use of a firewall is highly recommended
+to restrict access to the Gitaly server. Another option is to
+[use TLS](#gitaly-cluster-tls-support).
-- The GitLab Rails application shards repositories into
- [repository storage paths](../repository_storage_paths.md).
-- A Gitaly server can host one or more storage paths.
-- A GitLab server can use one or more Gitaly server nodes.
-- Gitaly addresses must be specified to be correctly resolvable for all Gitaly
- clients.
-- Gitaly servers must not be exposed to the public internet, as Gitaly's network
- traffic is unencrypted by default. The use of a firewall is highly recommended
- to restrict access to the Gitaly server. Another option is to
- [use TLS](#gitaly-tls-support).
-
-NOTE:
-The token referred to throughout the Gitaly documentation is an arbitrary
-password selected by the administrator. This token is unrelated to tokens
-created for the GitLab API or other similar web API tokens.
+For configuring Gitaly you should note the following:
-This section describes how to configure two Gitaly servers, with the following
-IPs and domain names:
+- `git_data_dirs` should be configured to reflect the storage path for the specific Gitaly node
+- `auth_token` should be the same as `praefect_internal_token`
-- `10.6.0.51`: Gitaly 1 (`gitaly1.internal`)
-- `10.6.0.52`: Gitaly 2 (`gitaly2.internal`)
-
-Assumptions about your servers include having the secret token be `gitalysecret`,
-and that your GitLab installation has three repository storages:
+The following IPs will be used as an example:
-- `default` on Gitaly 1
-- `storage1` on Gitaly 1
-- `storage2` on Gitaly 2
+- `10.6.0.91`: Gitaly 1
+- `10.6.0.92`: Gitaly 2
+- `10.6.0.93`: Gitaly 3
On each node:
@@ -1033,21 +1341,9 @@ On each node:
1. Edit the Gitaly server node's `/etc/gitlab/gitlab.rb` file to configure
storage paths, enable the network listener, and to configure the token:
- <!--
- updates to following example must also be made at
- https://gitlab.com/gitlab-org/charts/gitlab/blob/master/doc/advanced/external-gitaly/external-omnibus-gitaly.md#configure-omnibus-gitlab
- -->
-
```ruby
# /etc/gitlab/gitlab.rb
- # Gitaly and GitLab use two shared secrets for authentication, one to authenticate gRPC requests
- # to Gitaly, and a second for authentication callbacks from GitLab-Shell to the GitLab internal API.
- # The following two values must be the same as their respective values
- # of the GitLab Rails application setup
- gitaly['auth_token'] = 'gitalysecret'
- gitlab_shell['secret_token'] = 'shellsecret'
-
# Avoid running unnecessary services on the Gitaly server
postgresql['enable'] = false
redis['enable'] = false
@@ -1057,7 +1353,6 @@ On each node:
sidekiq['enable'] = false
gitlab_workhorse['enable'] = false
grafana['enable'] = false
- gitlab_exporter['enable'] = false
# If you run a separate monitoring node you can disable these services
alertmanager['enable'] = false
@@ -1078,101 +1373,86 @@ On each node:
# Comment out following line if you only want to support TLS connections
gitaly['listen_addr'] = "0.0.0.0:8075"
- ## Enable service discovery for Prometheus
- consul['enable'] = true
- consul['monitoring_service_discovery'] = true
-
- # Set the network addresses that the exporters will listen on for monitoring
- gitaly['prometheus_listen_addr'] = "0.0.0.0:9236"
- node_exporter['listen_address'] = '0.0.0.0:9100'
- gitlab_rails['prometheus_address'] = '10.6.0.81:9090'
-
- ## The IPs of the Consul server nodes
- ## You can also use FQDNs and intermix them with IPs
- consul['configuration'] = {
- retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
- }
+ # Gitaly Auth Token
+ # Should be the same as praefect_internal_token
+ gitaly['auth_token'] = '<praefect_internal_token>'
```
1. Append the following to `/etc/gitlab/gitlab.rb` for each respective server:
- - On `gitaly1.internal`:
+ - On Gitaly node 1:
```ruby
git_data_dirs({
- 'default' => {
- 'path' => '/var/opt/gitlab/git-data'
- },
- 'storage1' => {
- 'path' => '/mnt/gitlab/git-data'
- },
+ "gitaly-1" => {
+ "path" => "/var/opt/gitlab/git-data"
+ }
})
```
- - On `gitaly2.internal`:
+ - On Gitaly node 2:
```ruby
git_data_dirs({
- 'storage2' => {
- 'path' => '/mnt/gitlab/git-data'
- },
+ "gitaly-2" => {
+ "path" => "/var/opt/gitlab/git-data"
+ }
})
```
- <!--
- updates to following example must also be made at
- https://gitlab.com/gitlab-org/charts/gitlab/blob/master/doc/advanced/external-gitaly/external-omnibus-gitaly.md#configure-omnibus-gitlab
- -->
+ - On Gitaly node 3:
+
+ ```ruby
+ git_data_dirs({
+ "gitaly-3" => {
+ "path" => "/var/opt/gitlab/git-data"
+ }
+ })
+ ```
+
+1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and
+ then replace the file of the same name on this server. If that file isn't on
+ this server, add the file from your Consul server to this server.
1. Save the file, and then [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
-1. Confirm that Gitaly can perform callbacks to the internal API:
- ```shell
- sudo /opt/gitlab/embedded/bin/gitaly-hooks check /var/opt/gitlab/gitaly/config.toml
- ```
+### Gitaly Cluster TLS support
-1. Verify the GitLab services are running:
+Praefect supports TLS encryption. To communicate with a Praefect instance that listens
+for secure connections, you must:
- ```shell
- sudo gitlab-ctl status
- ```
+- Use a `tls://` URL scheme in the `gitaly_address` of the corresponding storage entry
+ in the GitLab configuration.
+- Bring your own certificates because this isn't provided automatically. The certificate
+ corresponding to each Praefect server must be installed on that Praefect server.
- The output should be similar to the following:
+Additionally the certificate, or its certificate authority, must be installed on all Gitaly servers
+and on all Praefect clients that communicate with it following the procedure described in
+[GitLab custom certificate configuration](https://docs.gitlab.com/omnibus/settings/ssl.html#install-custom-public-certificates) (and repeated below).
- ```plaintext
- run: consul: (pid 30339) 77006s; run: log: (pid 29878) 77020s
- run: gitaly: (pid 30351) 77005s; run: log: (pid 29660) 77040s
- run: logrotate: (pid 7760) 3213s; run: log: (pid 29782) 77032s
- run: node-exporter: (pid 30378) 77004s; run: log: (pid 29812) 77026s
- ```
+Note the following:
-### Gitaly TLS support
+- The certificate must specify the address you use to access the Praefect server. If
+ addressing the Praefect server by:
-Gitaly supports TLS encryption. To be able to communicate
-with a Gitaly instance that listens for secure connections you will need to use `tls://` URL
-scheme in the `gitaly_address` of the corresponding storage entry in the GitLab configuration.
+ - Hostname, you can either use the Common Name field for this, or add it as a Subject
+ Alternative Name.
+ - IP address, you must add it as a Subject Alternative Name to the certificate.
-You will need to bring your own certificates as this isn't provided automatically.
-The certificate, or its certificate authority, must be installed on all Gitaly
-nodes (including the Gitaly node using the certificate) and on all client nodes
-that communicate with it following the procedure described in
-[GitLab custom certificate configuration](https://docs.gitlab.com/omnibus/settings/ssl.html#install-custom-public-certificates).
+- You can configure Praefect servers with both an unencrypted listening address
+ `listen_addr` and an encrypted listening address `tls_listen_addr` at the same time.
+ This allows you to do a gradual transition from unencrypted to encrypted traffic, if
+ necessary.
-NOTE:
-The self-signed certificate must specify the address you use to access the
-Gitaly server. If you are addressing the Gitaly server by a hostname, you can
-either use the Common Name field for this, or add it as a Subject Alternative
-Name. If you are addressing the Gitaly server by its IP address, you must add it
-as a Subject Alternative Name to the certificate.
-[gRPC does not support using an IP address as Common Name in a certificate](https://github.com/grpc/grpc/issues/2691).
+- The Internal Load Balancer will also access to the certificates and need to be configured
+ to allow for TLS passthrough.
+ Refer to the load balancers documentation on how to configure this.
-It's possible to configure Gitaly servers with both an unencrypted listening
-address (`listen_addr`) and an encrypted listening address (`tls_listen_addr`)
-at the same time. This allows you to do a gradual transition from unencrypted to
-encrypted traffic, if necessary.
+To configure Praefect with TLS:
-To configure Gitaly with TLS:
+1. Create certificates for Praefect servers.
-1. Create the `/etc/gitlab/ssl` directory and copy your key and certificate there:
+1. On the Praefect servers, create the `/etc/gitlab/ssl` directory and copy your key
+ and certificate there:
```shell
sudo mkdir -p /etc/gitlab/ssl
@@ -1181,27 +1461,35 @@ To configure Gitaly with TLS:
sudo chmod 644 key.pem cert.pem
```
-1. Copy the cert to `/etc/gitlab/trusted-certs` so Gitaly will trust the cert when
- calling into itself:
+1. Edit `/etc/gitlab/gitlab.rb` and add:
- ```shell
- sudo cp /etc/gitlab/ssl/cert.pem /etc/gitlab/trusted-certs/
+ ```ruby
+ praefect['tls_listen_addr'] = "0.0.0.0:3305"
+ praefect['certificate_path'] = "/etc/gitlab/ssl/cert.pem"
+ praefect['key_path'] = "/etc/gitlab/ssl/key.pem"
```
-1. Edit `/etc/gitlab/gitlab.rb` and add:
+1. Save the file and [reconfigure](../restart_gitlab.md#omnibus-gitlab-reconfigure).
- <!--
- updates to following example must also be made at
- https://gitlab.com/gitlab-org/charts/gitlab/blob/master/doc/advanced/external-gitaly/external-omnibus-gitaly.md#configure-omnibus-gitlab
- -->
+1. On the Praefect clients (including each Gitaly server), copy the certificates,
+ or their certificate authority, into `/etc/gitlab/trusted-certs`:
+
+ ```shell
+ sudo cp cert.pem /etc/gitlab/trusted-certs/
+ ```
+
+1. On the Praefect clients (except Gitaly servers), edit `git_data_dirs` in
+ `/etc/gitlab/gitlab.rb` as follows:
```ruby
- gitaly['tls_listen_addr'] = "0.0.0.0:9999"
- gitaly['certificate_path'] = "/etc/gitlab/ssl/cert.pem"
- gitaly['key_path'] = "/etc/gitlab/ssl/key.pem"
+ git_data_dirs({
+ "default" => {
+ "gitaly_address" => 'tls://LOAD_BALANCER_SERVER_ADDRESS:2305',
+ "gitaly_token" => 'PRAEFECT_EXTERNAL_TOKEN'
+ }
+ })
```
-1. Delete `gitaly['listen_addr']` to allow only encrypted connections.
1. Save the file and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
<div align="right">
@@ -1272,17 +1560,20 @@ To configure the Sidekiq nodes, one each one:
### Gitaly ###
#######################################
+ # git_data_dirs get configured for the Praefect virtual storage
+ # Address is Internal Load Balancer for Praefect
+ # Token is praefect_external_token
git_data_dirs({
- 'default' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage1' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage2' => { 'gitaly_address' => 'tcp://gitaly2.internal:8075' },
+ "default" => {
+ "gitaly_address" => "tcp://10.6.0.40:2305", # internal load balancer IP
+ "gitaly_token" => '<praefect_external_token>'
+ }
})
- gitlab_rails['gitaly_token'] = 'YOUR_TOKEN'
#######################################
### Postgres ###
#######################################
- gitlab_rails['db_host'] = '10.6.0.20' # internal load balancer IP
+ gitlab_rails['db_host'] = '10.6.0.40' # internal load balancer IP
gitlab_rails['db_port'] = 6432
gitlab_rails['db_password'] = '<postgresql_user_password>'
gitlab_rails['db_adapter'] = 'postgresql'
@@ -1401,17 +1692,14 @@ On each node perform the following:
```ruby
external_url 'https://gitlab.example.com'
- # Gitaly and GitLab use two shared secrets for authentication, one to authenticate gRPC requests
- # to Gitaly, and a second for authentication callbacks from GitLab-Shell to the GitLab internal API.
- # The following two values must be the same as their respective values
- # of the Gitaly setup
- gitlab_rails['gitaly_token'] = 'gitalysecret'
- gitlab_shell['secret_token'] = 'shellsecret'
-
+ # git_data_dirs get configured for the Praefect virtual storage
+ # Address is Interal Load Balancer for Praefect
+ # Token is praefect_external_token
git_data_dirs({
- 'default' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage1' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage2' => { 'gitaly_address' => 'tcp://gitaly2.internal:8075' },
+ "default" => {
+ "gitaly_address" => "tcp://10.6.0.40:2305", # internal load balancer IP
+ "gitaly_token" => '<praefect_external_token>'
+ }
})
## Disable components that will not be on the GitLab application server
@@ -1499,14 +1787,15 @@ On each node perform the following:
gitlab_rails['object_store']['objects']['terraform_state']['bucket'] = "<gcp-terraform-state-bucket-name>"
```
-1. If you're using [Gitaly with TLS support](#gitaly-tls-support), make sure the
+1. If you're using [Gitaly with TLS support](#gitaly-cluster-tls-support), make sure the
`git_data_dirs` entry is configured with `tls` instead of `tcp`:
```ruby
git_data_dirs({
- 'default' => { 'gitaly_address' => 'tls://gitaly1.internal:9999' },
- 'storage1' => { 'gitaly_address' => 'tls://gitaly1.internal:9999' },
- 'storage2' => { 'gitaly_address' => 'tls://gitaly2.internal:9999' },
+ "default" => {
+ "gitaly_address" => "tls://10.6.0.40:2305", # internal load balancer IP
+ "gitaly_token" => '<praefect_external_token>'
+ }
})
```
diff --git a/doc/administration/reference_architectures/50k_users.md b/doc/administration/reference_architectures/50k_users.md
index 04ce39046d8..475a4944492 100644
--- a/doc/administration/reference_architectures/50k_users.md
+++ b/doc/administration/reference_architectures/50k_users.md
@@ -12,100 +12,110 @@ full list of reference architectures, see
[Available reference architectures](index.md#available-reference-architectures).
> - **Supported users (approximate):** 50,000
-> - **High Availability:** Yes
+> - **High Availability:** Yes ([Praefect](#configure-praefect-postgresql) needs a third-party PostgreSQL solution for HA)
> - **Test requests per second (RPS) rates:** API: 1000 RPS, Web: 100 RPS, Git (Pull): 100 RPS, Git (Push): 20 RPS
| Service | Nodes | Configuration | GCP | AWS | Azure |
|-----------------------------------------|-------------|-------------------------|-----------------|--------------|----------|
-| External load balancing node | 1 | 8 vCPU, 7.2 GB memory | n1-highcpu-8 | `c5.2xlarge` | F8s v2 |
-| Consul | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| PostgreSQL | 3 | 16 vCPU, 60 GB memory | n1-standard-16 | `m5.4xlarge` | D16s v3 |
-| PgBouncer | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| Internal load balancing node | 1 | 8 vCPU, 7.2 GB memory | n1-highcpu-8 | `c5.2xlarge` | F8s v2 |
-| Redis - Cache | 3 | 4 vCPU, 15 GB memory | n1-standard-4 | `m5.xlarge` | D4s v3 |
-| Redis - Queues / Shared State | 3 | 4 vCPU, 15 GB memory | n1-standard-4 | `m5.xlarge` | D4s v3 |
-| Redis Sentinel - Cache | 3 | 1 vCPU, 1.7 GB memory | g1-small | `t3.small` | B1MS |
-| Redis Sentinel - Queues / Shared State | 3 | 1 vCPU, 1.7 GB memory | g1-small | `t3.small` | B1MS |
-| Gitaly | 2 (minimum) | 64 vCPU, 240 GB memory | n1-standard-64 | `m5.16xlarge` | D64s v3 |
-| Sidekiq | 4 | 4 vCPU, 15 GB memory | n1-standard-4 | `m5.xlarge` | D4s v3 |
-| GitLab Rails | 12 | 32 vCPU, 28.8 GB memory | n1-highcpu-32 | `c5.9xlarge` | F32s v2 |
-| Monitoring node | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | `c5.xlarge` | F4s v2 |
+| External load balancing node | 1 | 8 vCPU, 7.2 GB memory | n1-highcpu-8 | c5.2xlarge | F8s v2 |
+| Consul | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| PostgreSQL | 3 | 32 vCPU, 120 GB memory | n1-standard-32 | m5.8xlarge | D32s v3 |
+| PgBouncer | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Internal load balancing node | 1 | 8 vCPU, 7.2 GB memory | n1-highcpu-8 | c5.2xlarge | F8s v2 |
+| Redis - Cache | 3 | 4 vCPU, 15 GB memory | n1-standard-4 | m5.xlarge | D4s v3 |
+| Redis - Queues / Shared State | 3 | 4 vCPU, 15 GB memory | n1-standard-4 | m5.xlarge | D4s v3 |
+| Redis Sentinel - Cache | 3 | 1 vCPU, 1.7 GB memory | g1-small | t3.small | B1MS |
+| Redis Sentinel - Queues / Shared State | 3 | 1 vCPU, 1.7 GB memory | g1-small | t3.small | B1MS |
+| Gitaly | 3 | 64 vCPU, 240 GB memory | n1-standard-64 | m5.16xlarge | D64s v3 |
+| Praefect | 3 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | c5.xlarge | F4s v2 |
+| Praefect PostgreSQL | 1+* | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Sidekiq | 4 | 4 vCPU, 15 GB memory | n1-standard-4 | m5.xlarge | D4s v3 |
+| GitLab Rails | 12 | 32 vCPU, 28.8 GB memory | n1-highcpu-32 | c5.9xlarge | F32s v2 |
+| Monitoring node | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | c5.xlarge | F4s v2 |
| Object storage | n/a | n/a | n/a | n/a | n/a |
-| NFS server | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | `c5.xlarge` | F4s v2 |
-
-```mermaid
-stateDiagram-v2
- [*] --> LoadBalancer
- LoadBalancer --> ApplicationServer
-
- ApplicationServer --> BackgroundJobs
- ApplicationServer --> Gitaly
- ApplicationServer --> Redis_Cache
- ApplicationServer --> Redis_Queues
- ApplicationServer --> PgBouncer
- PgBouncer --> Database
- ApplicationServer --> ObjectStorage
- BackgroundJobs --> ObjectStorage
-
- ApplicationMonitoring -->ApplicationServer
- ApplicationMonitoring -->PgBouncer
- ApplicationMonitoring -->Database
- ApplicationMonitoring -->BackgroundJobs
-
- ApplicationServer --> Consul
-
- Consul --> Database
- Consul --> PgBouncer
- Redis_Cache --> Consul
- Redis_Queues --> Consul
- BackgroundJobs --> Consul
-
- state Consul {
- "Consul_1..3"
- }
-
- state Database {
- "PG_Primary_Node"
- "PG_Secondary_Node_1..2"
- }
-
- state Redis_Cache {
- "R_Cache_Primary_Node"
- "R_Cache_Replica_Node_1..2"
- "R_Cache_Sentinel_1..3"
- }
-
- state Redis_Queues {
- "R_Queues_Primary_Node"
- "R_Queues_Replica_Node_1..2"
- "R_Queues_Sentinel_1..3"
- }
-
- state Gitaly {
- "Gitaly_1..2"
- }
-
- state BackgroundJobs {
- "Sidekiq_1..4"
- }
-
- state ApplicationServer {
- "GitLab_Rails_1..12"
- }
-
- state LoadBalancer {
- "LoadBalancer_1"
- }
-
- state ApplicationMonitoring {
- "Prometheus"
- "Grafana"
- }
-
- state PgBouncer {
- "Internal_Load_Balancer"
- "PgBouncer_1..3"
- }
+| NFS server | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | c5.xlarge | F4s v2 |
+
+```plantuml
+@startuml 50k
+card "**External Load Balancer**" as elb #6a9be7
+card "**Internal Load Balancer**" as ilb #9370DB
+
+together {
+ collections "**GitLab Rails** x12" as gitlab #32CD32
+ collections "**Sidekiq** x4" as sidekiq #ff8dd1
+}
+
+together {
+ card "**Prometheus + Grafana**" as monitor #7FFFD4
+ collections "**Consul** x3" as consul #e76a9b
+}
+
+card "Gitaly Cluster" as gitaly_cluster {
+ collections "**Praefect** x3" as praefect #FF8C00
+ collections "**Gitaly** x3" as gitaly #FF8C00
+ card "**Praefect PostgreSQL***\n//Non fault-tolerant//" as praefect_postgres #FF8C00
+
+ praefect -[#FF8C00]-> gitaly
+ praefect -[#FF8C00]> praefect_postgres
+}
+
+card "Database" as database {
+ collections "**PGBouncer** x3" as pgbouncer #4EA7FF
+ card "**PostgreSQL** (Primary)" as postgres_primary #4EA7FF
+ collections "**PostgreSQL** (Secondary) x2" as postgres_secondary #4EA7FF
+
+ pgbouncer -[#4EA7FF]-> postgres_primary
+ postgres_primary .[#4EA7FF]> postgres_secondary
+}
+
+card "redis" as redis {
+ collections "**Redis Persistent** x3" as redis_persistent #FF6347
+ collections "**Redis Cache** x3" as redis_cache #FF6347
+ collections "**Redis Persistent Sentinel** x3" as redis_persistent_sentinel #FF6347
+ collections "**Redis Cache Sentinel** x3"as redis_cache_sentinel #FF6347
+
+ redis_persistent <.[#FF6347]- redis_persistent_sentinel
+ redis_cache <.[#FF6347]- redis_cache_sentinel
+}
+
+cloud "**Object Storage**" as object_storage #white
+
+elb -[#6a9be7]-> gitlab
+elb -[#6a9be7]--> monitor
+
+gitlab -[#32CD32]> sidekiq
+gitlab -[#32CD32]--> ilb
+gitlab -[#32CD32]-> object_storage
+gitlab -[#32CD32]---> redis
+gitlab -[hidden]-> monitor
+gitlab -[hidden]-> consul
+
+sidekiq -[#ff8dd1]--> ilb
+sidekiq -[#ff8dd1]-> object_storage
+sidekiq -[#ff8dd1]---> redis
+sidekiq -[hidden]-> monitor
+sidekiq -[hidden]-> consul
+
+ilb -[#9370DB]-> gitaly_cluster
+ilb -[#9370DB]-> database
+
+consul .[#e76a9b]u-> gitlab
+consul .[#e76a9b]u-> sidekiq
+consul .[#e76a9b]> monitor
+consul .[#e76a9b]-> database
+consul .[#e76a9b]-> gitaly_cluster
+consul .[#e76a9b,norank]--> redis
+
+monitor .[#7FFFD4]u-> gitlab
+monitor .[#7FFFD4]u-> sidekiq
+monitor .[#7FFFD4]> consul
+monitor .[#7FFFD4]-> database
+monitor .[#7FFFD4]-> gitaly_cluster
+monitor .[#7FFFD4,norank]--> redis
+monitor .[#7FFFD4]> ilb
+monitor .[#7FFFD4,norank]u--> elb
+
+@enduml
```
The Google Cloud Platform (GCP) architectures were built and tested using the
@@ -120,19 +130,25 @@ uploads, or artifacts), using an [object storage service](#configure-the-object-
is recommended instead of using NFS. Using an object storage service also
doesn't require you to provision and maintain a node.
+It's also worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
+that to achieve full High Availability a third party PostgreSQL database solution will be required.
+We hope to offer a built in solutions for these restrictions in the future but in the meantime a non HA PostgreSQL server
+can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) & [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398)
+
## Setup components
To set up GitLab and its components to accommodate up to 50,000 users:
-1. [Configure the external load balancing node](#configure-the-external-load-balancer)
+1. [Configure the external load balancer](#configure-the-external-load-balancer)
to handle the load balancing of the GitLab application services nodes.
+1. [Configure the internal load balancer](#configure-the-internal-load-balancer).
+ to handle the loa
1. [Configure Consul](#configure-consul).
1. [Configure PostgreSQL](#configure-postgresql), the database for GitLab.
1. [Configure PgBouncer](#configure-pgbouncer).
-1. [Configure the internal load balancing node](#configure-the-internal-load-balancer).
1. [Configure Redis](#configure-redis).
-1. [Configure Gitaly](#configure-gitaly),
- which provides access to the Git repositories.
+1. [Configure Gitaly Cluster](#configure-gitaly-cluster),
+ provides access to the Git repositories.
1. [Configure Sidekiq](#configure-sidekiq).
1. [Configure the main GitLab Rails application](#configure-gitlab-rails)
to run Puma/Unicorn, Workhorse, GitLab Shell, and to serve all frontend
@@ -178,6 +194,11 @@ The following list includes descriptions of each server and its assigned IP:
- `10.6.0.83`: Sentinel - Queues 3
- `10.6.0.91`: Gitaly 1
- `10.6.0.92`: Gitaly 2
+- `10.6.0.93`: Gitaly 3
+- `10.6.0.131`: Praefect 1
+- `10.6.0.132`: Praefect 2
+- `10.6.0.133`: Praefect 3
+- `10.6.0.141`: Praefect PostgreSQL 1 (non HA)
- `10.6.0.101`: Sidekiq 1
- `10.6.0.102`: Sidekiq 2
- `10.6.0.103`: Sidekiq 3
@@ -185,7 +206,16 @@ The following list includes descriptions of each server and its assigned IP:
- `10.6.0.111`: GitLab application 1
- `10.6.0.112`: GitLab application 2
- `10.6.0.113`: GitLab application 3
-- `10.6.0.121`: Prometheus
+- `10.6.0.114`: GitLab application 4
+- `10.6.0.115`: GitLab application 5
+- `10.6.0.116`: GitLab application 6
+- `10.6.0.117`: GitLab application 7
+- `10.6.0.118`: GitLab application 8
+- `10.6.0.119`: GitLab application 9
+- `10.6.0.120`: GitLab application 10
+- `10.6.0.121`: GitLab application 11
+- `10.6.0.122`: GitLab application 12
+- `10.6.0.151`: Prometheus
## Configure the external load balancer
@@ -308,6 +338,71 @@ Configure DNS for an alternate SSH hostname such as `altssh.gitlab.example.com`.
</a>
</div>
+## Configure the internal load balancer
+
+The Internal Load Balancer is used to balance any internal connections the GitLab environment requires
+such as connections to [PgBouncer](#configure-pgbouncer) and [Praefect](#configure-praefect) (Gitaly Cluster).
+
+Note that it's a separate node from the External Load Balancer and shouldn't have any access externally.
+
+The following IP will be used as an example:
+
+- `10.6.0.40`: Internal Load Balancer
+
+Here's how you could do it with [HAProxy](https://www.haproxy.org/):
+
+```plaintext
+global
+ log /dev/log local0
+ log localhost local1 notice
+ log stdout format raw local0
+
+defaults
+ log global
+ default-server inter 10s fall 3 rise 2
+ balance leastconn
+
+frontend internal-pgbouncer-tcp-in
+ bind *:6432
+ mode tcp
+ option tcplog
+
+ default_backend pgbouncer
+
+frontend internal-praefect-tcp-in
+ bind *:2305
+ mode tcp
+ option tcplog
+ option clitcpka
+
+ default_backend praefect
+
+backend pgbouncer
+ mode tcp
+ option tcp-check
+
+ server pgbouncer1 10.6.0.21:6432 check
+ server pgbouncer2 10.6.0.22:6432 check
+ server pgbouncer3 10.6.0.23:6432 check
+
+backend praefect
+ mode tcp
+ option tcp-check
+ option srvtcpka
+
+ server praefect1 10.6.0.131:2305 check
+ server praefect2 10.6.0.132:2305 check
+ server praefect3 10.6.0.133:2305 check
+```
+
+Refer to your preferred Load Balancer's documentation for further guidance.
+
+<div align="right">
+ <a type="button" class="btn btn-default" href="#setup-components">
+ Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
+ </a>
+</div>
+
## Configure Consul
The following IPs will be used as an example:
@@ -662,52 +757,6 @@ The following IPs will be used as an example:
</a>
</div>
-### Configure the internal load balancer
-
-If you're running more than one PgBouncer node as recommended, then at this time you'll need to set
-up a TCP internal load balancer to serve each correctly.
-
-The following IP will be used as an example:
-
-- `10.6.0.40`: Internal Load Balancer
-
-Here's how you could do it with [HAProxy](https://www.haproxy.org/):
-
-```plaintext
-global
- log /dev/log local0
- log localhost local1 notice
- log stdout format raw local0
-
-defaults
- log global
- default-server inter 10s fall 3 rise 2
- balance leastconn
-
-frontend internal-pgbouncer-tcp-in
- bind *:6432
- mode tcp
- option tcplog
-
- default_backend pgbouncer
-
-backend pgbouncer
- mode tcp
- option tcp-check
-
- server pgbouncer1 10.6.0.21:6432 check
- server pgbouncer2 10.6.0.22:6432 check
- server pgbouncer3 10.6.0.23:6432 check
-```
-
-Refer to your preferred Load Balancer's documentation for further guidance.
-
-<div align="right">
- <a type="button" class="btn btn-default" href="#setup-components">
- Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
- </a>
-</div>
-
## Configure Redis
Using [Redis](https://redis.io/) in scalable environment is possible using a **Primary** x **Replica**
@@ -1302,19 +1351,283 @@ To configure the Sentinel Queues server:
</a>
</div>
-## Configure Gitaly
+## Configure Gitaly Cluster
-NOTE:
-[Gitaly Cluster](../gitaly/praefect.md) support
-for the Reference Architectures is being
-worked on as a [collaborative effort](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/1) between the Quality Engineering and Gitaly teams. When this component has been verified
-some Architecture specs will likely change as a result to support the new
-and improved designed.
+[Gitaly Cluster](../gitaly/praefect.md) is a GitLab provided and recommended fault tolerant solution for storing Git repositories.
+In this configuration, every Git repository is stored on every Gitaly node in the cluster, with one being designated the primary, and failover occurs automatically if the primary node goes down.
+
+The recommended cluster setup includes the following components:
+
+- 3 Gitaly nodes: Replicated storage of Git repositories.
+- 3 Praefect nodes: Router and transaction manager for Gitaly Cluster.
+- 1 Praefect PostgreSQL node: Database server for Praefect. A third-party solution
+ is required for Praefect database connections to be made highly available.
+- 1 load balancer: A load balancer is required for Praefect. The
+ [internal load balancer](#configure-the-internal-load-balancer) will be used.
+
+This section will detail how to configure the recommended standard setup in order.
+For more advanced setups refer to the [standalone Gitaly Cluster documentation](../gitaly/praefect.md).
+
+### Configure Praefect PostgreSQL
+
+Praefect, the routing and transaction manager for Gitaly Cluster, requires its own database server to store data on Gitaly Cluster status.
+
+If you want to have a highly available setup, Praefect requires a third-party PostgreSQL database.
+A built-in solution is being [worked on](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919).
+
+#### Praefect non-HA PostgreSQL standalone using Omnibus GitLab
+
+The following IPs will be used as an example:
+
+- `10.6.0.141`: Praefect PostgreSQL
+
+First, make sure to [install](https://about.gitlab.com/install/)
+the Linux GitLab package in the Praefect PostgreSQL node. Following the steps,
+install the necessary dependencies from step 1, and add the
+GitLab package repository from step 2. When installing GitLab
+in the second step, do not supply the `EXTERNAL_URL` value.
+
+1. SSH in to the Praefect PostgreSQL node.
+1. Create a strong password to be used for the Praefect PostgreSQL user. Take note of this password as `<praefect_postgresql_password>`.
+1. Generate the password hash for the Praefect PostgreSQL username/password pair. This assumes you will use the default
+ username of `praefect` (recommended). The command will request the password `<praefect_postgresql_password>`
+ and confirmation. Use the value that is output by this command in the next
+ step as the value of `<praefect_postgresql_password_hash>`:
+
+ ```shell
+ sudo gitlab-ctl pg-password-md5 praefect
+ ```
+
+1. Edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
+
+ ```ruby
+ # Disable all components except PostgreSQL and Consul
+ roles ['postgres_role']
+ repmgr['enable'] = false
+ patroni['enable'] = false
+
+ # PostgreSQL configuration
+ postgresql['listen_address'] = '0.0.0.0'
+ postgresql['max_connections'] = 200
+
+ gitlab_rails['auto_migrate'] = false
-[Gitaly](../gitaly/index.md) server node requirements are dependent on data,
-specifically the number of projects and those projects' sizes. It's recommended
-that a Gitaly server node stores no more than 5 TB of data. Depending on your
-repository storage requirements, you may require additional Gitaly server nodes.
+ # Configure the Consul agent
+ consul['enable'] = true
+ ## Enable service discovery for Prometheus
+ consul['monitoring_service_discovery'] = true
+
+ # START user configuration
+ # Please set the real values as explained in Required Information section
+ #
+ # Replace PRAEFECT_POSTGRESQL_PASSWORD_HASH with a generated md5 value
+ postgresql['sql_user_password'] = "<praefect_postgresql_password_hash>"
+
+ # Replace XXX.XXX.XXX.XXX/YY with Network Address
+ postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24)
+
+ # Set the network addresses that the exporters will listen on for monitoring
+ node_exporter['listen_address'] = '0.0.0.0:9100'
+ postgres_exporter['listen_address'] = '0.0.0.0:9187'
+
+ ## The IPs of the Consul server nodes
+ ## You can also use FQDNs and intermix them with IPs
+ consul['configuration'] = {
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
+ }
+ #
+ # END user configuration
+ ```
+
+1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
+1. Follow the [post configuration](#praefect-postgresql-post-configuration).
+
+<div align="right">
+ <a type="button" class="btn btn-default" href="#setup-components">
+ Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
+ </a>
+</div>
+
+#### Praefect HA PostgreSQL third-party solution
+
+[As noted](#configure-praefect-postgresql), a third-party PostgreSQL solution for
+Praefect's database is recommended if aiming for full High Availability.
+
+There are many third-party solutions for PostgreSQL HA. The solution selected must have the following to work with Praefect:
+
+- A static IP for all connections that doesn't change on failover.
+- [`LISTEN`](https://www.postgresql.org/docs/12/sql-listen.html) SQL functionality must be supported.
+
+Examples of the above could include [Google's Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) or [Amazon RDS](https://aws.amazon.com/rds/).
+
+Once the database is set up, follow the [post configuration](#praefect-postgresql-post-configuration).
+
+#### Praefect PostgreSQL post-configuration
+
+After the Praefect PostgreSQL server has been set up, you'll then need to configure the user and database for Praefect to use.
+
+We recommend the user be named `praefect` and the database `praefect_production`, and these can be configured as standard in PostgreSQL.
+The password for the user is the same as the one you configured earlier as `<praefect_postgresql_password>`.
+
+This is how this would work with a Omnibus GitLab PostgreSQL setup:
+
+1. SSH in to the Praefect PostgreSQL node.
+1. Connect to the PostgreSQL server with administrative access.
+ The `gitlab-psql` user should be used here for this as it's added by default in Omnibus.
+ The database `template1` is used because it is created by default on all PostgreSQL servers.
+
+ ```shell
+ /opt/gitlab/embedded/bin/psql -U gitlab-psql -d template1 -h POSTGRESQL_SERVER_ADDRESS
+ ```
+
+1. Create the new user `praefect`, replacing `<praefect_postgresql_password>`:
+
+ ```shell
+ CREATE ROLE praefect WITH LOGIN CREATEDB PASSWORD <praefect_postgresql_password>;
+ ```
+
+1. Reconnect to the PostgreSQL server, this time as the `praefect` user:
+
+ ```shell
+ /opt/gitlab/embedded/bin/psql -U praefect -d template1 -h POSTGRESQL_SERVER_ADDRESS
+ ```
+
+1. Create a new database `praefect_production`:
+
+ ```shell
+ CREATE DATABASE praefect_production WITH ENCODING=UTF8;
+ ```
+
+<div align="right">
+ <a type="button" class="btn btn-default" href="#setup-components">
+ Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
+ </a>
+</div>
+
+### Configure Praefect
+
+Praefect is the router and transaction manager for Gitaly Cluster and all connections to Gitaly go through
+it. This section details how to configure it.
+
+Praefect requires several secret tokens to secure communications across the Cluster:
+
+- `<praefect_external_token>`: Used for repositories hosted on your Gitaly cluster and can only be accessed by Gitaly clients that carry this token.
+- `<praefect_internal_token>`: Used for replication traffic inside your Gitaly cluster. This is distinct from `praefect_external_token` because Gitaly clients must not be able to access internal nodes of the Praefect cluster directly; that could lead to data loss.
+- `<praefect_postgresql_password>`: The Praefect PostgreSQL password defined in the previous section is also required as part of this setup.
+
+Gitaly Cluster nodes are configured in Praefect via a `virtual storage`. Each storage contains
+the details of each Gitaly node that makes up the cluster. Each storage is also given a name
+and this name is used in several areas of the config. In this guide, the name of the storage will be
+`default`. Also, this guide is geared towards new installs, if upgrading an existing environment
+to use Gitaly Cluster, you may need to use a different name.
+Refer to the [Praefect documentation](../gitaly/praefect.md#praefect) for more info.
+
+The following IPs will be used as an example:
+
+- `10.6.0.131`: Praefect 1
+- `10.6.0.132`: Praefect 2
+- `10.6.0.133`: Praefect 3
+
+To configure the Praefect nodes, on each one:
+
+1. SSH in to the Praefect server.
+1. [Download and install](https://about.gitlab.com/install/) the Omnibus GitLab
+ package of your choice. Be sure to follow _only_ installation steps 1 and 2
+ on the page.
+1. Edit the `/etc/gitlab/gitlab.rb` file to configure Praefect:
+
+ ```ruby
+ # Avoid running unnecessary services on the Gitaly server
+ postgresql['enable'] = false
+ redis['enable'] = false
+ nginx['enable'] = false
+ puma['enable'] = false
+ unicorn['enable'] = false
+ sidekiq['enable'] = false
+ gitlab_workhorse['enable'] = false
+ grafana['enable'] = false
+
+ # If you run a separate monitoring node you can disable these services
+ alertmanager['enable'] = false
+ prometheus['enable'] = false
+
+ # Praefect Configuration
+ praefect['enable'] = true
+ praefect['listen_addr'] = '0.0.0.0:2305'
+
+ gitlab_rails['rake_cache_clear'] = false
+ gitlab_rails['auto_migrate'] = false
+
+ # Configure the Consul agent
+ consul['enable'] = true
+ ## Enable service discovery for Prometheus
+ consul['monitoring_service_discovery'] = true
+
+ # START user configuration
+ # Please set the real values as explained in Required Information section
+ #
+
+ # Praefect External Token
+ # This is needed by clients outside the cluster (like GitLab Shell) to communicate with the Praefect cluster
+ praefect['auth_token'] = '<praefect_external_token>'
+
+ # Praefect Database Settings
+ praefect['database_host'] = '10.6.0.141'
+ praefect['database_port'] = 5432
+ # `no_proxy` settings must always be a direct connection for caching
+ praefect['database_host_no_proxy'] = '10.6.0.141'
+ praefect['database_port_no_proxy'] = 5432
+ praefect['database_dbname'] = 'praefect_production'
+ praefect['database_user'] = 'praefect'
+ praefect['database_password'] = '<praefect_postgresql_password>'
+
+ # Praefect Virtual Storage config
+ # Name of storage hash must match storage name in git_data_dirs on GitLab
+ # server ('praefect') and in git_data_dirs on Gitaly nodes ('gitaly-1')
+ praefect['virtual_storages'] = {
+ 'default' => {
+ 'gitaly-1' => {
+ 'address' => 'tcp://10.6.0.91:8075',
+ 'token' => '<praefect_internal_token>',
+ 'primary' => true
+ },
+ 'gitaly-2' => {
+ 'address' => 'tcp://10.6.0.92:8075',
+ 'token' => '<praefect_internal_token>'
+ },
+ 'gitaly-3' => {
+ 'address' => 'tcp://10.6.0.93:8075',
+ 'token' => '<praefect_internal_token>'
+ },
+ }
+ }
+
+ # Set the network addresses that the exporters will listen on for monitoring
+ node_exporter['listen_address'] = '0.0.0.0:9100'
+ praefect['prometheus_listen_addr'] = '0.0.0.0:9652'
+
+ ## The IPs of the Consul server nodes
+ ## You can also use FQDNs and intermix them with IPs
+ consul['configuration'] = {
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
+ }
+ #
+ # END user configuration
+ ```
+
+ 1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and
+ then replace the file of the same name on this server. If that file isn't on
+ this server, add the file from your Consul server to this server.
+
+ 1. Save the file, and then [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
+
+### Configure Gitaly
+
+The [Gitaly](../gitaly/index.md) server nodes that make up the cluster have
+requirements that are dependent on data, specifically the number of projects
+and those projects' sizes. It's recommended that a Gitaly Cluster stores
+no more than 5 TB of data on each node. Depending on your
+repository storage requirements, you may require additional Gitaly Clusters.
Due to Gitaly having notable input and output requirements, we strongly
recommend that all Gitaly nodes use solid-state drives (SSDs). These SSDs
@@ -1325,36 +1638,21 @@ adjusted to greater or lesser values depending on the scale of your
environment's workload. If you're running the environment on a Cloud provider,
refer to their documentation about how to configure IOPS correctly.
-Be sure to note the following items:
+Gitaly servers must not be exposed to the public internet, as Gitaly's network
+traffic is unencrypted by default. The use of a firewall is highly recommended
+to restrict access to the Gitaly server. Another option is to
+[use TLS](#gitaly-cluster-tls-support).
-- The GitLab Rails application shards repositories into
- [repository storage paths](../repository_storage_paths.md).
-- A Gitaly server can host one or more storage paths.
-- A GitLab server can use one or more Gitaly server nodes.
-- Gitaly addresses must be specified to be correctly resolvable for all Gitaly
- clients.
-- Gitaly servers must not be exposed to the public internet, as Gitaly's network
- traffic is unencrypted by default. The use of a firewall is highly recommended
- to restrict access to the Gitaly server. Another option is to
- [use TLS](#gitaly-tls-support).
+For configuring Gitaly you should note the following:
-NOTE:
-The token referred to throughout the Gitaly documentation is an arbitrary
-password selected by the administrator. This token is unrelated to tokens
-created for the GitLab API or other similar web API tokens.
-
-This section describes how to configure two Gitaly servers, with the following
-IPs and domain names:
-
-- `10.6.0.91`: Gitaly 1 (`gitaly1.internal`)
-- `10.6.0.92`: Gitaly 2 (`gitaly2.internal`)
+- `git_data_dirs` should be configured to reflect the storage path for the specific Gitaly node
+- `auth_token` should be the same as `praefect_internal_token`
-Assumptions about your servers include having the secret token be `gitalysecret`,
-and that your GitLab installation has three repository storages:
+The following IPs will be used as an example:
-- `default` on Gitaly 1
-- `storage1` on Gitaly 1
-- `storage2` on Gitaly 2
+- `10.6.0.91`: Gitaly 1
+- `10.6.0.92`: Gitaly 2
+- `10.6.0.93`: Gitaly 3
On each node:
@@ -1364,21 +1662,9 @@ On each node:
1. Edit the Gitaly server node's `/etc/gitlab/gitlab.rb` file to configure
storage paths, enable the network listener, and to configure the token:
- <!--
- updates to following example must also be made at
- https://gitlab.com/gitlab-org/charts/gitlab/blob/master/doc/advanced/external-gitaly/external-omnibus-gitaly.md#configure-omnibus-gitlab
- -->
-
```ruby
# /etc/gitlab/gitlab.rb
- # Gitaly and GitLab use two shared secrets for authentication, one to authenticate gRPC requests
- # to Gitaly, and a second for authentication callbacks from GitLab-Shell to the GitLab internal API.
- # The following two values must be the same as their respective values
- # of the GitLab Rails application setup
- gitaly['auth_token'] = 'gitalysecret'
- gitlab_shell['secret_token'] = 'shellsecret'
-
# Avoid running unnecessary services on the Gitaly server
postgresql['enable'] = false
redis['enable'] = false
@@ -1407,36 +1693,42 @@ On each node:
# firewalls to restrict access to this address/port.
# Comment out following line if you only want to support TLS connections
gitaly['listen_addr'] = "0.0.0.0:8075"
+
+ # Gitaly Auth Token
+ # Should be the same as praefect_internal_token
+ gitaly['auth_token'] = '<praefect_internal_token>'
```
1. Append the following to `/etc/gitlab/gitlab.rb` for each respective server:
- - On `gitaly1.internal`:
+ - On Gitaly node 1:
```ruby
git_data_dirs({
- 'default' => {
- 'path' => '/var/opt/gitlab/git-data'
- },
- 'storage1' => {
- 'path' => '/mnt/gitlab/git-data'
- },
+ "gitaly-1" => {
+ "path" => "/var/opt/gitlab/git-data"
+ }
})
```
- - On `gitaly2.internal`:
+ - On Gitaly node 2:
```ruby
git_data_dirs({
- 'storage2' => {
- 'path' => '/mnt/gitlab/git-data'
- },
+ "gitaly-2" => {
+ "path" => "/var/opt/gitlab/git-data"
+ }
})
```
- <!--
- updates to following example must also be made at
- https://gitlab.com/gitlab-org/charts/gitlab/blob/master/doc/advanced/external-gitaly/external-omnibus-gitaly.md#configure-omnibus-gitlab
- -->
+ - On Gitaly node 3:
+
+ ```ruby
+ git_data_dirs({
+ "gitaly-3" => {
+ "path" => "/var/opt/gitlab/git-data"
+ }
+ })
+ ```
1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and
then replace the file of the same name on this server. If that file isn't on
@@ -1444,34 +1736,44 @@ On each node:
1. Save the file, and then [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
-### Gitaly TLS support
+### Gitaly Cluster TLS support
-Gitaly supports TLS encryption. To be able to communicate
-with a Gitaly instance that listens for secure connections you will need to use `tls://` URL
-scheme in the `gitaly_address` of the corresponding storage entry in the GitLab configuration.
+Praefect supports TLS encryption. To communicate with a Praefect instance that listens
+for secure connections, you must:
-You will need to bring your own certificates as this isn't provided automatically.
-The certificate, or its certificate authority, must be installed on all Gitaly
-nodes (including the Gitaly node using the certificate) and on all client nodes
-that communicate with it following the procedure described in
-[GitLab custom certificate configuration](https://docs.gitlab.com/omnibus/settings/ssl.html#install-custom-public-certificates).
+- Use a `tls://` URL scheme in the `gitaly_address` of the corresponding storage entry
+ in the GitLab configuration.
+- Bring your own certificates because this isn't provided automatically. The certificate
+ corresponding to each Praefect server must be installed on that Praefect server.
-NOTE:
-The self-signed certificate must specify the address you use to access the
-Gitaly server. If you are addressing the Gitaly server by a hostname, you can
-either use the Common Name field for this, or add it as a Subject Alternative
-Name. If you are addressing the Gitaly server by its IP address, you must add it
-as a Subject Alternative Name to the certificate.
-[gRPC does not support using an IP address as Common Name in a certificate](https://github.com/grpc/grpc/issues/2691).
+Additionally the certificate, or its certificate authority, must be installed on all Gitaly servers
+and on all Praefect clients that communicate with it following the procedure described in
+[GitLab custom certificate configuration](https://docs.gitlab.com/omnibus/settings/ssl.html#install-custom-public-certificates) (and repeated below).
-It's possible to configure Gitaly servers with both an unencrypted listening
-address (`listen_addr`) and an encrypted listening address (`tls_listen_addr`)
-at the same time. This allows you to do a gradual transition from unencrypted to
-encrypted traffic, if necessary.
+Note the following:
-To configure Gitaly with TLS:
+- The certificate must specify the address you use to access the Praefect server. If
+ addressing the Praefect server by:
-1. Create the `/etc/gitlab/ssl` directory and copy your key and certificate there:
+ - Hostname, you can either use the Common Name field for this, or add it as a Subject
+ Alternative Name.
+ - IP address, you must add it as a Subject Alternative Name to the certificate.
+
+- You can configure Praefect servers with both an unencrypted listening address
+ `listen_addr` and an encrypted listening address `tls_listen_addr` at the same time.
+ This allows you to do a gradual transition from unencrypted to encrypted traffic, if
+ necessary.
+
+- The Internal Load Balancer will also access to the certificates and need to be configured
+ to allow for TLS passthrough.
+ Refer to the load balancers documentation on how to configure this.
+
+To configure Praefect with TLS:
+
+1. Create certificates for Praefect servers.
+
+1. On the Praefect servers, create the `/etc/gitlab/ssl` directory and copy your key
+ and certificate there:
```shell
sudo mkdir -p /etc/gitlab/ssl
@@ -1480,27 +1782,34 @@ To configure Gitaly with TLS:
sudo chmod 644 key.pem cert.pem
```
-1. Copy the cert to `/etc/gitlab/trusted-certs` so Gitaly will trust the cert when
- calling into itself:
+1. Edit `/etc/gitlab/gitlab.rb` and add:
- ```shell
- sudo cp /etc/gitlab/ssl/cert.pem /etc/gitlab/trusted-certs/
+ ```ruby
+ praefect['tls_listen_addr'] = "0.0.0.0:3305"
+ praefect['certificate_path'] = "/etc/gitlab/ssl/cert.pem"
+ praefect['key_path'] = "/etc/gitlab/ssl/key.pem"
```
-1. Edit `/etc/gitlab/gitlab.rb` and add:
+1. Save the file and [reconfigure](../restart_gitlab.md#omnibus-gitlab-reconfigure).
- <!--
- updates to following example must also be made at
- https://gitlab.com/gitlab-org/charts/gitlab/blob/master/doc/advanced/external-gitaly/external-omnibus-gitaly.md#configure-omnibus-gitlab
- -->
+1. On the Praefect clients (including each Gitaly server), copy the certificates,
+ or their certificate authority, into `/etc/gitlab/trusted-certs`:
- ```ruby
- gitaly['tls_listen_addr'] = "0.0.0.0:9999"
- gitaly['certificate_path'] = "/etc/gitlab/ssl/cert.pem"
- gitaly['key_path'] = "/etc/gitlab/ssl/key.pem"
+ ```shell
+ sudo cp cert.pem /etc/gitlab/trusted-certs/
```
-1. Delete `gitaly['listen_addr']` to allow only encrypted connections.
+1. On the Praefect clients (except Gitaly servers), edit `git_data_dirs` in
+ `/etc/gitlab/gitlab.rb` as follows:
+
+ ```ruby
+ git_data_dirs({
+ "default" => {
+ "gitaly_address" => 'tls://LOAD_BALANCER_SERVER_ADDRESS:2305',
+ "gitaly_token" => 'PRAEFECT_EXTERNAL_TOKEN'
+ }
+ })
+ ```
1. Save the file and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
@@ -1587,12 +1896,15 @@ To configure the Sidekiq nodes, on each one:
### Gitaly ###
#######################################
+ # git_data_dirs get configured for the Praefect virtual storage
+ # Address is Internal Load Balancer for Praefect
+ # Token is praefect_external_token
git_data_dirs({
- 'default' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage1' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage2' => { 'gitaly_address' => 'tcp://gitaly2.internal:8075' },
+ "default" => {
+ "gitaly_address" => "tcp://10.6.0.40:2305", # internal load balancer IP
+ "gitaly_token" => '<praefect_external_token>'
+ }
})
- gitlab_rails['gitaly_token'] = 'YOUR_TOKEN'
#######################################
### Postgres ###
@@ -1624,7 +1936,7 @@ To configure the Sidekiq nodes, on each one:
node_exporter['listen_address'] = '0.0.0.0:9100'
# Rails Status for prometheus
- gitlab_rails['monitoring_whitelist'] = ['10.6.0.121/32', '127.0.0.0/8']
+ gitlab_rails['monitoring_whitelist'] = ['10.6.0.151/32', '127.0.0.0/8']
#############################
### Object storage ###
@@ -1671,6 +1983,15 @@ The following IPs will be used as an example:
- `10.6.0.111`: GitLab application 1
- `10.6.0.112`: GitLab application 2
- `10.6.0.113`: GitLab application 3
+- `10.6.0.114`: GitLab application 4
+- `10.6.0.115`: GitLab application 5
+- `10.6.0.116`: GitLab application 6
+- `10.6.0.117`: GitLab application 7
+- `10.6.0.118`: GitLab application 8
+- `10.6.0.119`: GitLab application 9
+- `10.6.0.120`: GitLab application 10
+- `10.6.0.121`: GitLab application 11
+- `10.6.0.122`: GitLab application 12
On each node perform the following:
@@ -1690,17 +2011,14 @@ On each node perform the following:
```ruby
external_url 'https://gitlab.example.com'
- # Gitaly and GitLab use two shared secrets for authentication, one to authenticate gRPC requests
- # to Gitaly, and a second for authentication callbacks from GitLab-Shell to the GitLab internal API.
- # The following two values must be the same as their respective values
- # of the Gitaly setup
- gitlab_rails['gitaly_token'] = 'gitalysecret'
- gitlab_shell['secret_token'] = 'shellsecret'
-
+ # git_data_dirs get configured for the Praefect virtual storage
+ # Address is Interal Load Balancer for Praefect
+ # Token is praefect_external_token
git_data_dirs({
- 'default' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage1' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage2' => { 'gitaly_address' => 'tcp://gitaly2.internal:8075' },
+ "default" => {
+ "gitaly_address" => "tcp://10.6.0.40:2305", # internal load balancer IP
+ "gitaly_token" => '<praefect_external_token>'
+ }
})
## Disable components that will not be on the GitLab application server
@@ -1755,8 +2073,8 @@ On each node perform the following:
# Add the monitoring node's IP address to the monitoring whitelist and allow it to
# scrape the NGINX metrics
- gitlab_rails['monitoring_whitelist'] = ['10.6.0.121/32', '127.0.0.0/8']
- nginx['status']['options']['allow'] = ['10.6.0.121/32', '127.0.0.0/8']
+ gitlab_rails['monitoring_whitelist'] = ['10.6.0.151/32', '127.0.0.0/8']
+ nginx['status']['options']['allow'] = ['10.6.0.151/32', '127.0.0.0/8']
#############################
### Object storage ###
@@ -1779,14 +2097,15 @@ On each node perform the following:
```
1. Save the file and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
-1. If you're using [Gitaly with TLS support](#gitaly-tls-support), make sure the
+1. If you're using [Gitaly with TLS support](#gitaly-cluster-tls-support), make sure the
`git_data_dirs` entry is configured with `tls` instead of `tcp`:
```ruby
git_data_dirs({
- 'default' => { 'gitaly_address' => 'tls://gitaly1.internal:9999' },
- 'storage1' => { 'gitaly_address' => 'tls://gitaly1.internal:9999' },
- 'storage2' => { 'gitaly_address' => 'tls://gitaly2.internal:9999' },
+ "default" => {
+ "gitaly_address" => "tls://10.6.0.40:2305", # internal load balancer IP
+ "gitaly_token" => '<praefect_external_token>'
+ }
})
```
@@ -1891,7 +2210,7 @@ running [Prometheus](../monitoring/prometheus/index.md) and
The following IP will be used as an example:
-- `10.6.0.121`: Prometheus
+- `10.6.0.151`: Prometheus
To configure the Monitoring node:
diff --git a/doc/administration/reference_architectures/5k_users.md b/doc/administration/reference_architectures/5k_users.md
index 37e67b0ab73..fb9fdfcc3f1 100644
--- a/doc/administration/reference_architectures/5k_users.md
+++ b/doc/administration/reference_architectures/5k_users.md
@@ -19,79 +19,107 @@ costly-to-operate environment by using the
[2,000-user reference architecture](2k_users.md).
> - **Supported users (approximate):** 5,000
-> - **High Availability:** Yes
+> - **High Availability:** Yes ([Praefect](#configure-praefect-postgresql) needs a third-party PostgreSQL solution for HA)
> - **Test requests per second (RPS) rates:** API: 100 RPS, Web: 10 RPS, Git (Pull): 10 RPS, Git (Push): 2 RPS
| Service | Nodes | Configuration | GCP | AWS | Azure |
|--------------------------------------------|-------------|-------------------------|----------------|-------------|----------|
-| External load balancing node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| Redis | 3 | 2 vCPU, 7.5 GB memory | n1-standard-2 | `m5.large` | D2s v3 |
-| Consul + Sentinel | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| PostgreSQL | 3 | 2 vCPU, 7.5 GB memory | n1-standard-2 | `m5.large` | D2s v3 |
-| PgBouncer | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| Internal load balancing node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
-| Gitaly | 2 (minimum) | 8 vCPU, 30 GB memory | n1-standard-8 | `m5.2xlarge` | D8s v3 |
-| Sidekiq | 4 | 2 vCPU, 7.5 GB memory | n1-standard-2 | `m5.large` | D2s v3 |
-| GitLab Rails | 3 | 16 vCPU, 14.4 GB memory | n1-highcpu-16 | `c5.4xlarge` | F16s v2 |
-| Monitoring node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | `c5.large` | F2s v2 |
+| External load balancing node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Redis | 3 | 2 vCPU, 7.5 GB memory | n1-standard-2 | m5.large | D2s v3 |
+| Consul + Sentinel | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| PostgreSQL | 3 | 4 vCPU, 15 GB memory | n1-standard-4 | m5.xlarge | D4s v3 |
+| PgBouncer | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Internal load balancing node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Gitaly | 3 | 8 vCPU, 30 GB memory | n1-standard-8 | m5.2xlarge | D8s v3 |
+| Praefect | 3 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Praefect PostgreSQL | 1+* | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
+| Sidekiq | 4 | 2 vCPU, 7.5 GB memory | n1-standard-2 | m5.large | D2s v3 |
+| GitLab Rails | 3 | 16 vCPU, 14.4 GB memory | n1-highcpu-16 | c5.4xlarge | F16s v2 |
+| Monitoring node | 1 | 2 vCPU, 1.8 GB memory | n1-highcpu-2 | c5.large | F2s v2 |
| Object storage | n/a | n/a | n/a | n/a | n/a |
-| NFS server (optional, not recommended) | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | `c5.xlarge` | F4s v2 |
-
-```mermaid
-stateDiagram-v2
- [*] --> LoadBalancer
- LoadBalancer --> ApplicationServer
-
- ApplicationServer --> BackgroundJobs
- ApplicationServer --> Gitaly
- ApplicationServer --> Redis
- ApplicationServer --> PgBouncer
- PgBouncer --> Database
- ApplicationServer --> ObjectStorage
- BackgroundJobs --> ObjectStorage
-
- ApplicationMonitoring -->ApplicationServer
- ApplicationMonitoring -->Redis
- ApplicationMonitoring -->PgBouncer
- ApplicationMonitoring -->Database
- ApplicationMonitoring -->BackgroundJobs
-
- state Database {
- "PG_Primary_Node"
- "PG_Secondary_Node_1..2"
- }
-
- state Redis {
- "R_Primary_Node"
- "R_Replica_Node_1..2"
- "R_Consul/Sentinel_1..3"
- }
-
- state Gitaly {
- "Gitaly_1..2"
- }
-
- state BackgroundJobs {
- "Sidekiq_1..4"
- }
-
- state ApplicationServer {
- "GitLab_Rails_1..3"
- }
-
- state LoadBalancer {
- "LoadBalancer_1"
- }
-
- state ApplicationMonitoring {
- "Prometheus"
- "Grafana"
- }
-
- state PgBouncer {
- "Internal_Load_Balancer"
- "PgBouncer_1..3"
- }
+| NFS server (optional, not recommended) | 1 | 4 vCPU, 3.6 GB memory | n1-highcpu-4 | c5.xlarge | F4s v2 |
+
+```plantuml
+@startuml 5k
+card "**External Load Balancer**" as elb #6a9be7
+card "**Internal Load Balancer**" as ilb #9370DB
+
+together {
+ collections "**GitLab Rails** x3" as gitlab #32CD32
+ collections "**Sidekiq** x4" as sidekiq #ff8dd1
+}
+
+together {
+ card "**Prometheus + Grafana**" as monitor #7FFFD4
+ collections "**Consul** x3" as consul #e76a9b
+}
+
+card "Gitaly Cluster" as gitaly_cluster {
+ collections "**Praefect** x3" as praefect #FF8C00
+ collections "**Gitaly** x3" as gitaly #FF8C00
+ card "**Praefect PostgreSQL***\n//Non fault-tolerant//" as praefect_postgres #FF8C00
+
+ praefect -[#FF8C00]-> gitaly
+ praefect -[#FF8C00]> praefect_postgres
+}
+
+card "Database" as database {
+ collections "**PGBouncer** x3" as pgbouncer #4EA7FF
+ card "**PostgreSQL** (Primary)" as postgres_primary #4EA7FF
+ collections "**PostgreSQL** (Secondary) x2" as postgres_secondary #4EA7FF
+
+ pgbouncer -[#4EA7FF]-> postgres_primary
+ postgres_primary .[#4EA7FF]> postgres_secondary
+}
+
+card "redis" as redis {
+ collections "**Redis Persistent** x3" as redis_persistent #FF6347
+ collections "**Redis Cache** x3" as redis_cache #FF6347
+ collections "**Redis Persistent Sentinel** x3" as redis_persistent_sentinel #FF6347
+ collections "**Redis Cache Sentinel** x3"as redis_cache_sentinel #FF6347
+
+ redis_persistent <.[#FF6347]- redis_persistent_sentinel
+ redis_cache <.[#FF6347]- redis_cache_sentinel
+}
+
+cloud "**Object Storage**" as object_storage #white
+
+elb -[#6a9be7]-> gitlab
+elb -[#6a9be7]--> monitor
+
+gitlab -[#32CD32]> sidekiq
+gitlab -[#32CD32]--> ilb
+gitlab -[#32CD32]-> object_storage
+gitlab -[#32CD32]---> redis
+gitlab -[hidden]-> monitor
+gitlab -[hidden]-> consul
+
+sidekiq -[#ff8dd1]--> ilb
+sidekiq -[#ff8dd1]-> object_storage
+sidekiq -[#ff8dd1]---> redis
+sidekiq -[hidden]-> monitor
+sidekiq -[hidden]-> consul
+
+ilb -[#9370DB]-> gitaly_cluster
+ilb -[#9370DB]-> database
+
+consul .[#e76a9b]u-> gitlab
+consul .[#e76a9b]u-> sidekiq
+consul .[#e76a9b]> monitor
+consul .[#e76a9b]-> database
+consul .[#e76a9b]-> gitaly_cluster
+consul .[#e76a9b,norank]--> redis
+
+monitor .[#7FFFD4]u-> gitlab
+monitor .[#7FFFD4]u-> sidekiq
+monitor .[#7FFFD4]> consul
+monitor .[#7FFFD4]-> database
+monitor .[#7FFFD4]-> gitaly_cluster
+monitor .[#7FFFD4,norank]--> redis
+monitor .[#7FFFD4]> ilb
+monitor .[#7FFFD4,norank]u--> elb
+
+@enduml
```
The Google Cloud Platform (GCP) architectures were built and tested using the
@@ -106,19 +134,25 @@ uploads, or artifacts), using an [object storage service](#configure-the-object-
is recommended instead of using NFS. Using an object storage service also
doesn't require you to provision and maintain a node.
+It's also worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
+that to achieve full High Availability a third party PostgreSQL database solution will be required.
+We hope to offer a built in solutions for these restrictions in the future but in the meantime a non HA PostgreSQL server
+can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) & [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398)
+
## Setup components
To set up GitLab and its components to accommodate up to 5,000 users:
-1. [Configure the external load balancing node](#configure-the-external-load-balancer)
+1. [Configure the external load balancer](#configure-the-external-load-balancer)
to handle the load balancing of the GitLab application services nodes.
+1. [Configure the internal load balancer](#configure-the-internal-load-balancer).
+ to handle the load balancing of GitLab application internal connections.
1. [Configure Redis](#configure-redis).
1. [Configure Consul and Sentinel](#configure-consul-and-sentinel).
1. [Configure PostgreSQL](#configure-postgresql), the database for GitLab.
1. [Configure PgBouncer](#configure-pgbouncer).
-1. [Configure the internal load balancing node](#configure-the-internal-load-balancer).
-1. [Configure Gitaly](#configure-gitaly),
- which provides access to the Git repositories.
+1. [Configure Gitaly Cluster](#configure-gitaly-cluster),
+ provides access to the Git repositories.
1. [Configure Sidekiq](#configure-sidekiq).
1. [Configure the main GitLab Rails application](#configure-gitlab-rails)
to run Puma/Unicorn, Workhorse, GitLab Shell, and to serve all frontend
@@ -155,6 +189,11 @@ The following list includes descriptions of each server and its assigned IP:
- `10.6.0.20`: Internal Load Balancer
- `10.6.0.51`: Gitaly 1
- `10.6.0.52`: Gitaly 2
+- `10.6.0.93`: Gitaly 3
+- `10.6.0.131`: Praefect 1
+- `10.6.0.132`: Praefect 2
+- `10.6.0.133`: Praefect 3
+- `10.6.0.141`: Praefect PostgreSQL 1 (non HA)
- `10.6.0.71`: Sidekiq 1
- `10.6.0.72`: Sidekiq 2
- `10.6.0.73`: Sidekiq 3
@@ -285,6 +324,71 @@ Configure DNS for an alternate SSH hostname such as `altssh.gitlab.example.com`.
</a>
</div>
+## Configure the internal load balancer
+
+The Internal Load Balancer is used to balance any internal connections the GitLab environment requires
+such as connections to [PgBouncer](#configure-pgbouncer) and [Praefect](#configure-praefect) (Gitaly Cluster).
+
+Note that it's a separate node from the External Load Balancer and shouldn't have any access externally.
+
+The following IP will be used as an example:
+
+- `10.6.0.40`: Internal Load Balancer
+
+Here's how you could do it with [HAProxy](https://www.haproxy.org/):
+
+```plaintext
+global
+ log /dev/log local0
+ log localhost local1 notice
+ log stdout format raw local0
+
+defaults
+ log global
+ default-server inter 10s fall 3 rise 2
+ balance leastconn
+
+frontend internal-pgbouncer-tcp-in
+ bind *:6432
+ mode tcp
+ option tcplog
+
+ default_backend pgbouncer
+
+frontend internal-praefect-tcp-in
+ bind *:2305
+ mode tcp
+ option tcplog
+ option clitcpka
+
+ default_backend praefect
+
+backend pgbouncer
+ mode tcp
+ option tcp-check
+
+ server pgbouncer1 10.6.0.21:6432 check
+ server pgbouncer2 10.6.0.22:6432 check
+ server pgbouncer3 10.6.0.23:6432 check
+
+backend praefect
+ mode tcp
+ option tcp-check
+ option srvtcpka
+
+ server praefect1 10.6.0.131:2305 check
+ server praefect2 10.6.0.132:2305 check
+ server praefect3 10.6.0.133:2305 check
+```
+
+Refer to your preferred Load Balancer's documentation for further guidance.
+
+<div align="right">
+ <a type="button" class="btn btn-default" href="#setup-components">
+ Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
+ </a>
+</div>
+
## Configure Redis
Using [Redis](https://redis.io/) in scalable environment is possible using a **Primary** x **Replica**
@@ -924,45 +1028,96 @@ The following IPs will be used as an example:
</a>
</div>
-### Configure the internal load balancer
+## Configure Gitaly Cluster
-If you're running more than one PgBouncer node as recommended, then at this time you'll need to set
-up a TCP internal load balancer to serve each correctly.
+[Gitaly Cluster](../gitaly/praefect.md) is a GitLab provided and recommended fault tolerant solution for storing Git repositories.
+In this configuration, every Git repository is stored on every Gitaly node in the cluster, with one being designated the primary, and failover occurs automatically if the primary node goes down.
-The following IP will be used as an example:
+The recommended cluster setup includes the following components:
-- `10.6.0.20`: Internal Load Balancer
+- 3 Gitaly nodes: Replicated storage of Git repositories.
+- 3 Praefect nodes: Router and transaction manager for Gitaly Cluster.
+- 1 Praefect PostgreSQL node: Database server for Praefect. A third-party solution
+ is required for Praefect database connections to be made highly available.
+- 1 load balancer: A load balancer is required for Praefect. The
+ [internal load balancer](#configure-the-internal-load-balancer) will be used.
-Here's how you could do it with [HAProxy](https://www.haproxy.org/):
+This section will detail how to configure the recommended standard setup in order.
+For more advanced setups refer to the [standalone Gitaly Cluster documentation](../gitaly/praefect.md).
-```plaintext
-global
- log /dev/log local0
- log localhost local1 notice
- log stdout format raw local0
+### Configure Praefect PostgreSQL
-defaults
- log global
- default-server inter 10s fall 3 rise 2
- balance leastconn
+Praefect, the routing and transaction manager for Gitaly Cluster, requires its own database server to store data on Gitaly Cluster status.
-frontend internal-pgbouncer-tcp-in
- bind *:6432
- mode tcp
- option tcplog
+If you want to have a highly available setup, Praefect requires a third-party PostgreSQL database.
+A built-in solution is being [worked on](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919).
- default_backend pgbouncer
+#### Praefect non-HA PostgreSQL standalone using Omnibus GitLab
-backend pgbouncer
- mode tcp
- option tcp-check
+The following IPs will be used as an example:
- server pgbouncer1 10.6.0.21:6432 check
- server pgbouncer2 10.6.0.22:6432 check
- server pgbouncer3 10.6.0.23:6432 check
-```
+- `10.6.0.141`: Praefect PostgreSQL
-Refer to your preferred Load Balancer's documentation for further guidance.
+First, make sure to [install](https://about.gitlab.com/install/)
+the Linux GitLab package in the Praefect PostgreSQL node. Following the steps,
+install the necessary dependencies from step 1, and add the
+GitLab package repository from step 2. When installing GitLab
+in the second step, do not supply the `EXTERNAL_URL` value.
+
+1. SSH in to the Praefect PostgreSQL node.
+1. Create a strong password to be used for the Praefect PostgreSQL user. Take note of this password as `<praefect_postgresql_password>`.
+1. Generate the password hash for the Praefect PostgreSQL username/password pair. This assumes you will use the default
+ username of `praefect` (recommended). The command will request the password `<praefect_postgresql_password>`
+ and confirmation. Use the value that is output by this command in the next
+ step as the value of `<praefect_postgresql_password_hash>`:
+
+ ```shell
+ sudo gitlab-ctl pg-password-md5 praefect
+ ```
+
+1. Edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
+
+ ```ruby
+ # Disable all components except PostgreSQL and Consul
+ roles ['postgres_role']
+ repmgr['enable'] = false
+ patroni['enable'] = false
+
+ # PostgreSQL configuration
+ postgresql['listen_address'] = '0.0.0.0'
+ postgresql['max_connections'] = 200
+
+ gitlab_rails['auto_migrate'] = false
+
+ # Configure the Consul agent
+ consul['enable'] = true
+ ## Enable service discovery for Prometheus
+ consul['monitoring_service_discovery'] = true
+
+ # START user configuration
+ # Please set the real values as explained in Required Information section
+ #
+ # Replace PRAEFECT_POSTGRESQL_PASSWORD_HASH with a generated md5 value
+ postgresql['sql_user_password'] = "<praefect_postgresql_password_hash>"
+
+ # Replace XXX.XXX.XXX.XXX/YY with Network Address
+ postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24)
+
+ # Set the network addresses that the exporters will listen on for monitoring
+ node_exporter['listen_address'] = '0.0.0.0:9100'
+ postgres_exporter['listen_address'] = '0.0.0.0:9187'
+
+ ## The IPs of the Consul server nodes
+ ## You can also use FQDNs and intermix them with IPs
+ consul['configuration'] = {
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
+ }
+ #
+ # END user configuration
+ ```
+
+1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
+1. Follow the [post configuration](#praefect-postgresql-post-configuration).
<div align="right">
<a type="button" class="btn btn-default" href="#setup-components">
@@ -970,19 +1125,186 @@ Refer to your preferred Load Balancer's documentation for further guidance.
</a>
</div>
-## Configure Gitaly
+#### Praefect HA PostgreSQL third-party solution
-NOTE:
-[Gitaly Cluster](../gitaly/praefect.md) support
-for the Reference Architectures is being
-worked on as a [collaborative effort](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/1) between the Quality Engineering and Gitaly teams. When this component has been verified
-some Architecture specs will likely change as a result to support the new
-and improved designed.
+[As noted](#configure-praefect-postgresql), a third-party PostgreSQL solution for
+Praefect's database is recommended if aiming for full High Availability.
+
+There are many third-party solutions for PostgreSQL HA. The solution selected must have the following to work with Praefect:
+
+- A static IP for all connections that doesn't change on failover.
+- [`LISTEN`](https://www.postgresql.org/docs/12/sql-listen.html) SQL functionality must be supported.
+
+Examples of the above could include [Google's Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) or [Amazon RDS](https://aws.amazon.com/rds/).
+
+Once the database is set up, follow the [post configuration](#praefect-postgresql-post-configuration).
+
+#### Praefect PostgreSQL post-configuration
+
+After the Praefect PostgreSQL server has been set up, you'll then need to configure the user and database for Praefect to use.
+
+We recommend the user be named `praefect` and the database `praefect_production`, and these can be configured as standard in PostgreSQL.
+The password for the user is the same as the one you configured earlier as `<praefect_postgresql_password>`.
+
+This is how this would work with a Omnibus GitLab PostgreSQL setup:
+
+1. SSH in to the Praefect PostgreSQL node.
+1. Connect to the PostgreSQL server with administrative access.
+ The `gitlab-psql` user should be used here for this as it's added by default in Omnibus.
+ The database `template1` is used because it is created by default on all PostgreSQL servers.
-[Gitaly](../gitaly/index.md) server node requirements are dependent on data,
-specifically the number of projects and those projects' sizes. It's recommended
-that a Gitaly server node stores no more than 5 TB of data. Depending on your
-repository storage requirements, you may require additional Gitaly server nodes.
+ ```shell
+ /opt/gitlab/embedded/bin/psql -U gitlab-psql -d template1 -h POSTGRESQL_SERVER_ADDRESS
+ ```
+
+1. Create the new user `praefect`, replacing `<praefect_postgresql_password>`:
+
+ ```shell
+ CREATE ROLE praefect WITH LOGIN CREATEDB PASSWORD <praefect_postgresql_password>;
+ ```
+
+1. Reconnect to the PostgreSQL server, this time as the `praefect` user:
+
+ ```shell
+ /opt/gitlab/embedded/bin/psql -U praefect -d template1 -h POSTGRESQL_SERVER_ADDRESS
+ ```
+
+1. Create a new database `praefect_production`:
+
+ ```shell
+ CREATE DATABASE praefect_production WITH ENCODING=UTF8;
+ ```
+
+<div align="right">
+ <a type="button" class="btn btn-default" href="#setup-components">
+ Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
+ </a>
+</div>
+
+### Configure Praefect
+
+Praefect is the router and transaction manager for Gitaly Cluster and all connections to Gitaly go through
+it. This section details how to configure it.
+
+Praefect requires several secret tokens to secure communications across the Cluster:
+
+- `<praefect_external_token>`: Used for repositories hosted on your Gitaly cluster and can only be accessed by Gitaly clients that carry this token.
+- `<praefect_internal_token>`: Used for replication traffic inside your Gitaly cluster. This is distinct from `praefect_external_token` because Gitaly clients must not be able to access internal nodes of the Praefect cluster directly; that could lead to data loss.
+- `<praefect_postgresql_password>`: The Praefect PostgreSQL password defined in the previous section is also required as part of this setup.
+
+Gitaly Cluster nodes are configured in Praefect via a `virtual storage`. Each storage contains
+the details of each Gitaly node that makes up the cluster. Each storage is also given a name
+and this name is used in several areas of the config. In this guide, the name of the storage will be
+`default`. Also, this guide is geared towards new installs, if upgrading an existing environment
+to use Gitaly Cluster, you may need to use a different name.
+Refer to the [Praefect documentation](../gitaly/praefect.md#praefect) for more info.
+
+The following IPs will be used as an example:
+
+- `10.6.0.131`: Praefect 1
+- `10.6.0.132`: Praefect 2
+- `10.6.0.133`: Praefect 3
+
+To configure the Praefect nodes, on each one:
+
+1. SSH in to the Praefect server.
+1. [Download and install](https://about.gitlab.com/install/) the Omnibus GitLab
+ package of your choice. Be sure to follow _only_ installation steps 1 and 2
+ on the page.
+1. Edit the `/etc/gitlab/gitlab.rb` file to configure Praefect:
+
+ ```ruby
+ # Avoid running unnecessary services on the Gitaly server
+ postgresql['enable'] = false
+ redis['enable'] = false
+ nginx['enable'] = false
+ puma['enable'] = false
+ unicorn['enable'] = false
+ sidekiq['enable'] = false
+ gitlab_workhorse['enable'] = false
+ grafana['enable'] = false
+
+ # If you run a separate monitoring node you can disable these services
+ alertmanager['enable'] = false
+ prometheus['enable'] = false
+
+ # Praefect Configuration
+ praefect['enable'] = true
+ praefect['listen_addr'] = '0.0.0.0:2305'
+
+ gitlab_rails['rake_cache_clear'] = false
+ gitlab_rails['auto_migrate'] = false
+
+ # Configure the Consul agent
+ consul['enable'] = true
+ ## Enable service discovery for Prometheus
+ consul['monitoring_service_discovery'] = true
+
+ # START user configuration
+ # Please set the real values as explained in Required Information section
+ #
+
+ # Praefect External Token
+ # This is needed by clients outside the cluster (like GitLab Shell) to communicate with the Praefect cluster
+ praefect['auth_token'] = '<praefect_external_token>'
+
+ # Praefect Database Settings
+ praefect['database_host'] = '10.6.0.141'
+ praefect['database_port'] = 5432
+ # `no_proxy` settings must always be a direct connection for caching
+ praefect['database_host_no_proxy'] = '10.6.0.141'
+ praefect['database_port_no_proxy'] = 5432
+ praefect['database_dbname'] = 'praefect_production'
+ praefect['database_user'] = 'praefect'
+ praefect['database_password'] = '<praefect_postgresql_password>'
+
+ # Praefect Virtual Storage config
+ # Name of storage hash must match storage name in git_data_dirs on GitLab
+ # server ('praefect') and in git_data_dirs on Gitaly nodes ('gitaly-1')
+ praefect['virtual_storages'] = {
+ 'default' => {
+ 'gitaly-1' => {
+ 'address' => 'tcp://10.6.0.91:8075',
+ 'token' => '<praefect_internal_token>',
+ 'primary' => true
+ },
+ 'gitaly-2' => {
+ 'address' => 'tcp://10.6.0.92:8075',
+ 'token' => '<praefect_internal_token>'
+ },
+ 'gitaly-3' => {
+ 'address' => 'tcp://10.6.0.93:8075',
+ 'token' => '<praefect_internal_token>'
+ },
+ }
+ }
+
+ # Set the network addresses that the exporters will listen on for monitoring
+ node_exporter['listen_address'] = '0.0.0.0:9100'
+ praefect['prometheus_listen_addr'] = '0.0.0.0:9652'
+
+ ## The IPs of the Consul server nodes
+ ## You can also use FQDNs and intermix them with IPs
+ consul['configuration'] = {
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
+ }
+ #
+ # END user configuration
+ ```
+
+ 1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and
+ then replace the file of the same name on this server. If that file isn't on
+ this server, add the file from your Consul server to this server.
+
+ 1. Save the file, and then [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
+
+### Configure Gitaly
+
+The [Gitaly](../gitaly/index.md) server nodes that make up the cluster have
+requirements that are dependent on data, specifically the number of projects
+and those projects' sizes. It's recommended that a Gitaly Cluster stores
+no more than 5 TB of data on each node. Depending on your
+repository storage requirements, you may require additional Gitaly Clusters.
Due to Gitaly having notable input and output requirements, we strongly
recommend that all Gitaly nodes use solid-state drives (SSDs). These SSDs
@@ -993,36 +1315,21 @@ adjusted to greater or lesser values depending on the scale of your
environment's workload. If you're running the environment on a Cloud provider,
refer to their documentation about how to configure IOPS correctly.
-Be sure to note the following items:
+Gitaly servers must not be exposed to the public internet, as Gitaly's network
+traffic is unencrypted by default. The use of a firewall is highly recommended
+to restrict access to the Gitaly server. Another option is to
+[use TLS](#gitaly-cluster-tls-support).
-- The GitLab Rails application shards repositories into
- [repository storage paths](../repository_storage_paths.md).
-- A Gitaly server can host one or more storage paths.
-- A GitLab server can use one or more Gitaly server nodes.
-- Gitaly addresses must be specified to be correctly resolvable for all Gitaly
- clients.
-- Gitaly servers must not be exposed to the public internet, as Gitaly's network
- traffic is unencrypted by default. The use of a firewall is highly recommended
- to restrict access to the Gitaly server. Another option is to
- [use TLS](#gitaly-tls-support).
-
-NOTE:
-The token referred to throughout the Gitaly documentation is an arbitrary
-password selected by the administrator. This token is unrelated to tokens
-created for the GitLab API or other similar web API tokens.
+For configuring Gitaly you should note the following:
-This section describes how to configure two Gitaly servers, with the following
-IPs and domain names:
+- `git_data_dirs` should be configured to reflect the storage path for the specific Gitaly node
+- `auth_token` should be the same as `praefect_internal_token`
-- `10.6.0.51`: Gitaly 1 (`gitaly1.internal`)
-- `10.6.0.52`: Gitaly 2 (`gitaly2.internal`)
-
-Assumptions about your servers include having the secret token be `gitalysecret`,
-and that your GitLab installation has three repository storages:
+The following IPs will be used as an example:
-- `default` on Gitaly 1
-- `storage1` on Gitaly 1
-- `storage2` on Gitaly 2
+- `10.6.0.91`: Gitaly 1
+- `10.6.0.92`: Gitaly 2
+- `10.6.0.93`: Gitaly 3
On each node:
@@ -1032,21 +1339,9 @@ On each node:
1. Edit the Gitaly server node's `/etc/gitlab/gitlab.rb` file to configure
storage paths, enable the network listener, and to configure the token:
- <!--
- updates to following example must also be made at
- https://gitlab.com/gitlab-org/charts/gitlab/blob/master/doc/advanced/external-gitaly/external-omnibus-gitaly.md#configure-omnibus-gitlab
- -->
-
```ruby
# /etc/gitlab/gitlab.rb
- # Gitaly and GitLab use two shared secrets for authentication, one to authenticate gRPC requests
- # to Gitaly, and a second for authentication callbacks from GitLab-Shell to the GitLab internal API.
- # The following two values must be the same as their respective values
- # of the GitLab Rails application setup
- gitaly['auth_token'] = 'gitalysecret'
- gitlab_shell['secret_token'] = 'shellsecret'
-
# Avoid running unnecessary services on the Gitaly server
postgresql['enable'] = false
redis['enable'] = false
@@ -1056,7 +1351,6 @@ On each node:
sidekiq['enable'] = false
gitlab_workhorse['enable'] = false
grafana['enable'] = false
- gitlab_exporter['enable'] = false
# If you run a separate monitoring node you can disable these services
alertmanager['enable'] = false
@@ -1077,101 +1371,86 @@ On each node:
# Comment out following line if you only want to support TLS connections
gitaly['listen_addr'] = "0.0.0.0:8075"
- ## Enable service discovery for Prometheus
- consul['enable'] = true
- consul['monitoring_service_discovery'] = true
-
- # Set the network addresses that the exporters will listen on for monitoring
- gitaly['prometheus_listen_addr'] = "0.0.0.0:9236"
- node_exporter['listen_address'] = '0.0.0.0:9100'
- gitlab_rails['prometheus_address'] = '10.6.0.81:9090'
-
- ## The IPs of the Consul server nodes
- ## You can also use FQDNs and intermix them with IPs
- consul['configuration'] = {
- retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
- }
+ # Gitaly Auth Token
+ # Should be the same as praefect_internal_token
+ gitaly['auth_token'] = '<praefect_internal_token>'
```
1. Append the following to `/etc/gitlab/gitlab.rb` for each respective server:
- - On `gitaly1.internal`:
+ - On Gitaly node 1:
```ruby
git_data_dirs({
- 'default' => {
- 'path' => '/var/opt/gitlab/git-data'
- },
- 'storage1' => {
- 'path' => '/mnt/gitlab/git-data'
- },
+ "gitaly-1" => {
+ "path" => "/var/opt/gitlab/git-data"
+ }
})
```
- - On `gitaly2.internal`:
+ - On Gitaly node 2:
```ruby
git_data_dirs({
- 'storage2' => {
- 'path' => '/mnt/gitlab/git-data'
- },
+ "gitaly-2" => {
+ "path" => "/var/opt/gitlab/git-data"
+ }
})
```
- <!--
- updates to following example must also be made at
- https://gitlab.com/gitlab-org/charts/gitlab/blob/master/doc/advanced/external-gitaly/external-omnibus-gitaly.md#configure-omnibus-gitlab
- -->
+ - On Gitaly node 3:
+
+ ```ruby
+ git_data_dirs({
+ "gitaly-3" => {
+ "path" => "/var/opt/gitlab/git-data"
+ }
+ })
+ ```
+
+1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and
+ then replace the file of the same name on this server. If that file isn't on
+ this server, add the file from your Consul server to this server.
1. Save the file, and then [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
-1. Confirm that Gitaly can perform callbacks to the internal API:
- ```shell
- sudo /opt/gitlab/embedded/bin/gitaly-hooks check /var/opt/gitlab/gitaly/config.toml
- ```
+### Gitaly Cluster TLS support
-1. Verify the GitLab services are running:
+Praefect supports TLS encryption. To communicate with a Praefect instance that listens
+for secure connections, you must:
- ```shell
- sudo gitlab-ctl status
- ```
+- Use a `tls://` URL scheme in the `gitaly_address` of the corresponding storage entry
+ in the GitLab configuration.
+- Bring your own certificates because this isn't provided automatically. The certificate
+ corresponding to each Praefect server must be installed on that Praefect server.
- The output should be similar to the following:
+Additionally the certificate, or its certificate authority, must be installed on all Gitaly servers
+and on all Praefect clients that communicate with it following the procedure described in
+[GitLab custom certificate configuration](https://docs.gitlab.com/omnibus/settings/ssl.html#install-custom-public-certificates) (and repeated below).
- ```plaintext
- run: consul: (pid 30339) 77006s; run: log: (pid 29878) 77020s
- run: gitaly: (pid 30351) 77005s; run: log: (pid 29660) 77040s
- run: logrotate: (pid 7760) 3213s; run: log: (pid 29782) 77032s
- run: node-exporter: (pid 30378) 77004s; run: log: (pid 29812) 77026s
- ```
+Note the following:
-### Gitaly TLS support
+- The certificate must specify the address you use to access the Praefect server. If
+ addressing the Praefect server by:
-Gitaly supports TLS encryption. To be able to communicate
-with a Gitaly instance that listens for secure connections you will need to use `tls://` URL
-scheme in the `gitaly_address` of the corresponding storage entry in the GitLab configuration.
+ - Hostname, you can either use the Common Name field for this, or add it as a Subject
+ Alternative Name.
+ - IP address, you must add it as a Subject Alternative Name to the certificate.
-You will need to bring your own certificates as this isn't provided automatically.
-The certificate, or its certificate authority, must be installed on all Gitaly
-nodes (including the Gitaly node using the certificate) and on all client nodes
-that communicate with it following the procedure described in
-[GitLab custom certificate configuration](https://docs.gitlab.com/omnibus/settings/ssl.html#install-custom-public-certificates).
+- You can configure Praefect servers with both an unencrypted listening address
+ `listen_addr` and an encrypted listening address `tls_listen_addr` at the same time.
+ This allows you to do a gradual transition from unencrypted to encrypted traffic, if
+ necessary.
-NOTE:
-The self-signed certificate must specify the address you use to access the
-Gitaly server. If you are addressing the Gitaly server by a hostname, you can
-either use the Common Name field for this, or add it as a Subject Alternative
-Name. If you are addressing the Gitaly server by its IP address, you must add it
-as a Subject Alternative Name to the certificate.
-[gRPC does not support using an IP address as Common Name in a certificate](https://github.com/grpc/grpc/issues/2691).
+- The Internal Load Balancer will also access to the certificates and need to be configured
+ to allow for TLS passthrough.
+ Refer to the load balancers documentation on how to configure this.
-It's possible to configure Gitaly servers with both an unencrypted listening
-address (`listen_addr`) and an encrypted listening address (`tls_listen_addr`)
-at the same time. This allows you to do a gradual transition from unencrypted to
-encrypted traffic, if necessary.
+To configure Praefect with TLS:
-To configure Gitaly with TLS:
+1. Create certificates for Praefect servers.
-1. Create the `/etc/gitlab/ssl` directory and copy your key and certificate there:
+1. On the Praefect servers, create the `/etc/gitlab/ssl` directory and copy your key
+ and certificate there:
```shell
sudo mkdir -p /etc/gitlab/ssl
@@ -1180,27 +1459,35 @@ To configure Gitaly with TLS:
sudo chmod 644 key.pem cert.pem
```
-1. Copy the cert to `/etc/gitlab/trusted-certs` so Gitaly will trust the cert when
- calling into itself:
+1. Edit `/etc/gitlab/gitlab.rb` and add:
- ```shell
- sudo cp /etc/gitlab/ssl/cert.pem /etc/gitlab/trusted-certs/
+ ```ruby
+ praefect['tls_listen_addr'] = "0.0.0.0:3305"
+ praefect['certificate_path'] = "/etc/gitlab/ssl/cert.pem"
+ praefect['key_path'] = "/etc/gitlab/ssl/key.pem"
```
-1. Edit `/etc/gitlab/gitlab.rb` and add:
+1. Save the file and [reconfigure](../restart_gitlab.md#omnibus-gitlab-reconfigure).
- <!--
- updates to following example must also be made at
- https://gitlab.com/gitlab-org/charts/gitlab/blob/master/doc/advanced/external-gitaly/external-omnibus-gitaly.md#configure-omnibus-gitlab
- -->
+1. On the Praefect clients (including each Gitaly server), copy the certificates,
+ or their certificate authority, into `/etc/gitlab/trusted-certs`:
+
+ ```shell
+ sudo cp cert.pem /etc/gitlab/trusted-certs/
+ ```
+
+1. On the Praefect clients (except Gitaly servers), edit `git_data_dirs` in
+ `/etc/gitlab/gitlab.rb` as follows:
```ruby
- gitaly['tls_listen_addr'] = "0.0.0.0:9999"
- gitaly['certificate_path'] = "/etc/gitlab/ssl/cert.pem"
- gitaly['key_path'] = "/etc/gitlab/ssl/key.pem"
+ git_data_dirs({
+ "default" => {
+ "gitaly_address" => 'tls://LOAD_BALANCER_SERVER_ADDRESS:2305',
+ "gitaly_token" => 'PRAEFECT_EXTERNAL_TOKEN'
+ }
+ })
```
-1. Delete `gitaly['listen_addr']` to allow only encrypted connections.
1. Save the file and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
<div align="right">
@@ -1269,17 +1556,20 @@ To configure the Sidekiq nodes, one each one:
### Gitaly ###
#######################################
+ # git_data_dirs get configured for the Praefect virtual storage
+ # Address is Internal Load Balancer for Praefect
+ # Token is praefect_external_token
git_data_dirs({
- 'default' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage1' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage2' => { 'gitaly_address' => 'tcp://gitaly2.internal:8075' },
+ "default" => {
+ "gitaly_address" => "tcp://10.6.0.40:2305", # internal load balancer IP
+ "gitaly_token" => '<praefect_external_token>'
+ }
})
- gitlab_rails['gitaly_token'] = 'YOUR_TOKEN'
#######################################
### Postgres ###
#######################################
- gitlab_rails['db_host'] = '10.6.0.20' # internal load balancer IP
+ gitlab_rails['db_host'] = '10.6.0.40' # internal load balancer IP
gitlab_rails['db_port'] = 6432
gitlab_rails['db_password'] = '<postgresql_user_password>'
gitlab_rails['db_adapter'] = 'postgresql'
@@ -1397,17 +1687,14 @@ On each node perform the following:
```ruby
external_url 'https://gitlab.example.com'
- # Gitaly and GitLab use two shared secrets for authentication, one to authenticate gRPC requests
- # to Gitaly, and a second for authentication callbacks from GitLab-Shell to the GitLab internal API.
- # The following two values must be the same as their respective values
- # of the Gitaly setup
- gitlab_rails['gitaly_token'] = 'gitalysecret'
- gitlab_shell['secret_token'] = 'shellsecret'
-
+ # git_data_dirs get configured for the Praefect virtual storage
+ # Address is Interal Load Balancer for Praefect
+ # Token is praefect_external_token
git_data_dirs({
- 'default' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage1' => { 'gitaly_address' => 'tcp://gitaly1.internal:8075' },
- 'storage2' => { 'gitaly_address' => 'tcp://gitaly2.internal:8075' },
+ "default" => {
+ "gitaly_address" => "tcp://10.6.0.40:2305", # internal load balancer IP
+ "gitaly_token" => '<praefect_external_token>'
+ }
})
## Disable components that will not be on the GitLab application server
@@ -1495,14 +1782,15 @@ On each node perform the following:
#registry['gid'] = 9002
```
-1. If you're using [Gitaly with TLS support](#gitaly-tls-support), make sure the
+1. If you're using [Gitaly with TLS support](#gitaly-cluster-tls-support), make sure the
`git_data_dirs` entry is configured with `tls` instead of `tcp`:
```ruby
git_data_dirs({
- 'default' => { 'gitaly_address' => 'tls://gitaly1.internal:9999' },
- 'storage1' => { 'gitaly_address' => 'tls://gitaly1.internal:9999' },
- 'storage2' => { 'gitaly_address' => 'tls://gitaly2.internal:9999' },
+ "default" => {
+ "gitaly_address" => "tls://10.6.0.40:2305", # internal load balancer IP
+ "gitaly_token" => '<praefect_external_token>'
+ }
})
```
diff --git a/doc/api/graphql/reference/gitlab_schema.json b/doc/api/graphql/reference/gitlab_schema.json
index fc89798a131..800866e9eca 100644
--- a/doc/api/graphql/reference/gitlab_schema.json
+++ b/doc/api/graphql/reference/gitlab_schema.json
@@ -85890,4 +85890,4 @@
]
}
}
-} \ No newline at end of file
+}
diff --git a/doc/api/graphql/reference/index.md b/doc/api/graphql/reference/index.md
index 8fbd425d118..e329da46509 100644
--- a/doc/api/graphql/reference/index.md
+++ b/doc/api/graphql/reference/index.md
@@ -757,6 +757,16 @@ Autogenerated return type of BoardListUpdateLimitMetrics.
| `commit` | Commit | Commit for the branch. |
| `name` | String! | Name of the branch. |
+### BulkFindOrCreateDevopsAdoptionSegmentsPayload
+
+Autogenerated return type of BulkFindOrCreateDevopsAdoptionSegments.
+
+| Field | Type | Description |
+| ----- | ---- | ----------- |
+| `clientMutationId` | String | A unique identifier for the client performing the mutation. |
+| `errors` | String! => Array | Errors encountered during execution of the mutation. |
+| `segments` | DevopsAdoptionSegment! => Array | Created segments after mutation. |
+
### BurnupChartDailyTotals
Represents the total number of issues and their weights for a particular day.
@@ -1906,6 +1916,7 @@ Represents an epic board.
| `hideBacklogList` | Boolean | Whether or not backlog list is hidden. |
| `hideClosedList` | Boolean | Whether or not closed list is hidden. |
| `id` | BoardsEpicBoardID! | Global ID of the epic board. |
+| `labels` | LabelConnection | Labels of the board. |
| `lists` | EpicListConnection | Epic board lists. |
| `name` | String | Name of the epic board. |
| `webPath` | String! | Web path of the epic board. |
@@ -1931,6 +1942,16 @@ Autogenerated return type of EpicBoardListCreate.
| `errors` | String! => Array | Errors encountered during execution of the mutation. |
| `list` | EpicList | Epic list in the epic board. |
+### EpicBoardUpdatePayload
+
+Autogenerated return type of EpicBoardUpdate.
+
+| Field | Type | Description |
+| ----- | ---- | ----------- |
+| `clientMutationId` | String | A unique identifier for the client performing the mutation. |
+| `epicBoard` | EpicBoard | The updated epic board. |
+| `errors` | String! => Array | Errors encountered during execution of the mutation. |
+
### EpicDescendantCount
Counts of descendent epics.
@@ -2715,6 +2736,7 @@ Autogenerated return type of MarkAsSpamSnippet.
| `reference` | String! | Internal reference of the merge request. Returned in shortened format by default. |
| `reviewers` | UserConnection | Users from whom a review has been requested. |
| `securityAutoFix` | Boolean | Indicates if the merge request is created by @GitLab-Security-Bot. |
+| `securityReportsUpToDateOnTargetBranch` | Boolean! | Indicates if the target branch security reports are out of date. |
| `shouldBeRebased` | Boolean! | Indicates if the merge request will be rebased. |
| `shouldRemoveSourceBranch` | Boolean | Indicates if the source branch of the merge request will be deleted after merge. |
| `sourceBranch` | String! | Source branch of the merge request. |
diff --git a/doc/api/vulnerability_findings.md b/doc/api/vulnerability_findings.md
index 95e4774ae96..4144a912617 100644
--- a/doc/api/vulnerability_findings.md
+++ b/doc/api/vulnerability_findings.md
@@ -49,7 +49,6 @@ GET /projects/:id/vulnerability_findings?scope=all
GET /projects/:id/vulnerability_findings?scope=dismissed
GET /projects/:id/vulnerability_findings?severity=high
GET /projects/:id/vulnerability_findings?confidence=unknown,experimental
-GET /projects/:id/vulnerability_findings?scanner=bandit,find_sec_bugs
GET /projects/:id/vulnerability_findings?pipeline_id=42
```
@@ -63,7 +62,6 @@ Beginning with GitLab 12.9, the `undefined` severity and confidence level is no
| `scope` | string | no | Returns vulnerability findings for the given scope: `all` or `dismissed`. Defaults to `dismissed`. |
| `severity` | string array | no | Returns vulnerability findings belonging to specified severity level: `info`, `unknown`, `low`, `medium`, `high`, or `critical`. Defaults to all. |
| `confidence` | string array | no | Returns vulnerability findings belonging to specified confidence level: `ignore`, `unknown`, `experimental`, `low`, `medium`, `high`, or `confirmed`. Defaults to all. |
-| `scanner` | string array | no | Returns vulnerability findings detected by specified scanner.
| `pipeline_id` | integer/string | no | Returns vulnerability findings belonging to specified pipeline. |
```shell
diff --git a/doc/ci/pipelines/index.md b/doc/ci/pipelines/index.md
index c8e82de651c..d5ccb8d020c 100644
--- a/doc/ci/pipelines/index.md
+++ b/doc/ci/pipelines/index.md
@@ -137,10 +137,10 @@ To execute a pipeline manually:
1. Navigate to your project's **CI/CD > Pipelines**.
1. Select the **Run Pipeline** button.
1. On the **Run Pipeline** page:
- 1. Select the branch to run the pipeline for in the **Create for** field.
+ 1. Select the branch or tag to run the pipeline for in the **Run for branch name or tag** field.
1. Enter any [environment variables](../variables/README.md) required for the pipeline run.
You can set specific variables to have their [values prefilled in the form](#prefill-variables-in-manual-pipelines).
- 1. Click the **Create pipeline** button.
+ 1. Click the **Run pipeline** button.
The pipeline now executes the jobs as configured.
diff --git a/doc/ci/yaml/README.md b/doc/ci/yaml/README.md
index 7c42233fdef..ed829b7f0ce 100644
--- a/doc/ci/yaml/README.md
+++ b/doc/ci/yaml/README.md
@@ -46,7 +46,7 @@ The keywords available for jobs are:
| [`only`](#onlyexcept-basic) | Limit when jobs are created. |
| [`pages`](#pages) | Upload the result of a job to use with GitLab Pages. |
| [`parallel`](#parallel) | How many instances of a job should be run in parallel. |
-| [`release`](#release) | Instructs the runner to generate a [Release](../../user/project/releases/index.md) object. |
+| [`release`](#release) | Instructs the runner to generate a [release](../../user/project/releases/index.md) object. |
| [`resource_group`](#resource_group) | Limit job concurrency. |
| [`retry`](#retry) | When and how many times a job can be auto-retried in case of a failure. |
| [`rules`](#rules) | List of conditions to evaluate and determine selected attributes of a job, and whether or not it's created. |
@@ -79,7 +79,7 @@ You can't use these keywords as job names:
You can set global defaults for some keywords. Jobs that do not define one or more
of the listed keywords use the value defined in the `default:` section.
-The following job keywords can be defined inside a `default:` section:
+These job keywords can be defined inside a `default:` section:
- [`after_script`](#after_script)
- [`artifacts`](#artifacts)
@@ -92,7 +92,7 @@ The following job keywords can be defined inside a `default:` section:
- [`tags`](#tags)
- [`timeout`](#timeout)
-This example sets the `ruby:2.5` image as the default for all jobs in the pipeline.
+The following example sets the `ruby:2.5` image as the default for all jobs in the pipeline.
The `rspec 2.6` job does not use the default, because it overrides the default with
a job-specific `image:` section:
@@ -159,9 +159,9 @@ the [`needs`](#needs) keyword.
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/29654) in GitLab 12.5
-The top-level `workflow:` keyword determines whether or not a pipeline is created.
-It accepts a single `rules:` keyword that is similar to [`rules:` defined in jobs](#rules).
-Use it to define what can trigger a new pipeline.
+Use `workflow:` to determine whether or not a pipeline is created.
+Define this keyword at the top level, with a single `rules:` keyword that
+is similar to [`rules:` defined in jobs](#rules).
You can use the [`workflow:rules` templates](#workflowrules-templates) to import
a preconfigured `workflow: rules` entry.
@@ -186,7 +186,7 @@ Some example `if` clauses for `workflow: rules`:
See the [common `if` clauses for `rules`](#common-if-clauses-for-rules) for more examples.
-For example, in the following configuration, pipelines run for all `push` events (changes to
+In the following example, pipelines run for all `push` events (changes to
branches and new tags). Pipelines for push events with `-draft` in the commit message
don't run, because they are set to `when: never`. Pipelines for schedules or merge requests
don't run either, because no rules evaluate to true for them:
@@ -226,7 +226,7 @@ If your rules match both branch pipelines and merge request pipelines,
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/217732) in GitLab 13.0.
-We provide templates that set up `workflow: rules`
+GitLab provides templates that set up `workflow: rules`
for common scenarios. These templates help prevent duplicate pipelines.
The [`Branch-Pipelines` template](https://gitlab.com/gitlab-org/gitlab/-/tree/master/lib/gitlab/ci/templates/Workflows/Branch-Pipelines.gitlab-ci.yml)
@@ -234,12 +234,12 @@ makes your pipelines run for branches and tags.
Branch pipeline status is displayed in merge requests that use the branch
as a source. However, this pipeline type does not support any features offered by
-[Merge Request Pipelines](../merge_request_pipelines/), like
-[Pipelines for Merge Results](../merge_request_pipelines/#pipelines-for-merged-results)
-or [Merge Trains](../merge_request_pipelines/pipelines_for_merged_results/merge_trains/).
-Use this template if you are intentionally avoiding those features.
+[merge request pipelines](../merge_request_pipelines/), like
+[pipelines for merge results](../merge_request_pipelines/#pipelines-for-merged-results)
+or [merge trains](../merge_request_pipelines/pipelines_for_merged_results/merge_trains/).
+This template intentionally avoids those features.
-It is [included](#include) as follows:
+To [include](#include) it:
```yaml
include:
@@ -249,10 +249,9 @@ include:
The [`MergeRequest-Pipelines` template](https://gitlab.com/gitlab-org/gitlab/-/tree/master/lib/gitlab/ci/templates/Workflows/MergeRequest-Pipelines.gitlab-ci.yml)
makes your pipelines run for the default branch, tags, and
all types of merge request pipelines. Use this template if you use any of the
-the [Pipelines for Merge Requests features](../merge_request_pipelines/), as mentioned
-above.
+the [pipelines for merge requests features](../merge_request_pipelines/).
-It is [included](#include) as follows:
+To [include](#include) it:
```yaml
include:
@@ -317,7 +316,7 @@ does not block triggered pipelines.
> - Available for Starter, Premium, and Ultimate in GitLab 10.6 and later.
> - [Moved](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/42861) to GitLab Free in 11.4.
-Use the `include` keyword to include external YAML files in your CI/CD configuration.
+Use `include` to include external YAML files in your CI/CD configuration.
You can break down one long `gitlab-ci.yml` file into multiple files to increase readability,
or reduce duplication of the same configuration in multiple places.
@@ -339,11 +338,11 @@ YAML files, use [`!reference` tags](#reference-tags) or the [`extends` keyword](
| [`remote`](#includeremote) | Include a file from a remote URL. Must be publicly accessible. |
| [`template`](#includetemplate) | Include templates that are provided by GitLab. |
-The `.gitlab-ci.yml` file configuration included by all methods is evaluated when the pipeline is created.
-The configuration is a snapshot in time and persisted in the database. Any changes to
-the referenced `.gitlab-ci.yml` file configuration is not reflected in GitLab until the next pipeline is created.
+When the pipeline starts, the `.gitlab-ci.yml` file configuration included by all methods is evaluated.
+The configuration is a snapshot in time and persists in the database. GitLab does not reflect any changes to
+the referenced `.gitlab-ci.yml` file configuration until the next pipeline starts.
-The files defined by `include` are:
+The `include` files are:
- Deep merged with those in the `.gitlab-ci.yml` file.
- Always evaluated first and merged with the content of the `.gitlab-ci.yml` file,
@@ -367,13 +366,13 @@ include:
file: '.compliance-gitlab-ci.yml'
```
-For an example of how you can include these predefined variables, and their impact on CI jobs,
-see the following [CI/CD variable demo](https://youtu.be/4XR8gw3Pkos).
+For an example of how you can include these predefined variables, and the variables' impact on CI/CD jobs,
+see this [CI/CD variable demo](https://youtu.be/4XR8gw3Pkos).
#### `include:local`
-`include:local` includes a file that is in the same repository as the `.gitlab-ci.yml` file.
-It's referenced with full paths relative to the root directory (`/`).
+Use `include:local` to include a file that is in the same repository as the `.gitlab-ci.yml` file.
+Use a full path relative to the root directory (`/`).
If you use `include:local`, make sure that both the `.gitlab-ci.yml` file and the local file
are on the same branch.
@@ -390,7 +389,7 @@ include:
- local: '/templates/.gitlab-ci-template.yml'
```
-This can be defined as a short local include:
+You can also use shorter syntax to define the path:
```yaml
include: '.gitlab-ci-production.yml'
@@ -404,8 +403,9 @@ Use local includes instead of symbolic links.
To include files from another private project on the same GitLab instance,
use `include:file`. You can use `include:file` in combination with `include:project` only.
+Use a full path, relative to the root directory (`/`).
-The included file is referenced with a full path, relative to the root directory (`/`). For example:
+For example:
```yaml
include:
@@ -413,7 +413,7 @@ include:
file: '/templates/.gitlab-ci-template.yml'
```
-You can also specify a `ref`. If not specified, it defaults to the `HEAD` of the project:
+You can also specify a `ref`. If you do not specify a value, the ref defaults to the `HEAD` of the project:
```yaml
include:
@@ -502,15 +502,15 @@ to resolve all files is 30 seconds.
#### Additional `includes` examples
-There is a list of [additional `includes` examples](includes.md) available.
+View [additional `includes` examples](includes.md).
## Keyword details
-The following are detailed explanations for keywords used to configure CI/CD pipelines.
+The following topics explain how to use keywords to configure CI/CD pipelines.
### `image`
-Used to specify [a Docker image](../docker/using_docker_images.md#what-is-an-image) to use for the job.
+Use `image` to specify [a Docker image](../docker/using_docker_images.md#what-is-an-image) to use for the job.
For:
@@ -531,13 +531,13 @@ For more information, see [Available settings for `image`](../docker/using_docke
#### `services`
-Used to specify a [service Docker image](../docker/using_docker_images.md#what-is-a-service), linked to a base image specified in [`image`](#image).
+Use `services` to specify a [service Docker image](../docker/using_docker_images.md#what-is-a-service), linked to a base image specified in [`image`](#image).
For:
- Usage examples, see [Define `image` and `services` from `.gitlab-ci.yml`](../docker/using_docker_images.md#define-image-and-services-from-gitlab-ciyml).
- Detailed usage information, refer to [Docker integration](../docker/index.md) documentation.
-- For example services, see [GitLab CI/CD Services](../services/index.md).
+- Example services, see [GitLab CI/CD Services](../services/index.md).
##### `services:name`
@@ -565,8 +565,11 @@ For more information, see [Available settings for `services`](../docker/using_do
### `script`
-`script` is the only required keyword that a job needs. It's a shell script
-that is executed by the runner. For example:
+Use `script` to specify a shell script for the runner to execute.
+
+All jobs except [trigger jobs](#trigger) require a `script` keyword.
+
+For example:
```yaml
job:
@@ -575,7 +578,7 @@ job:
You can use [YAML anchors with `script`](#yaml-anchors-for-scripts).
-This keyword can also contain several commands in an array:
+The `script` keyword can also contain several commands in an array:
```yaml
job:
@@ -609,7 +612,7 @@ job:
You can verify the syntax is valid with the [CI Lint](../lint.md) tool.
-Be careful when using these special characters as well:
+Be careful when using these characters as well:
- `{`, `}`, `[`, `]`, `,`, `&`, `*`, `#`, `?`, `|`, `-`, `<`, `>`, `=`, `!`, `%`, `@`, `` ` ``.
@@ -629,10 +632,10 @@ job:
Use `before_script` to define an array of commands that should run before each job,
but after [artifacts](#artifacts) are restored.
-Scripts specified in `before_script` are concatenated with any scripts specified
-in the main [`script`](#script), and executed together in a single shell.
+Scripts you specify in `before_script` are concatenated with any scripts you specify
+in the main [`script`](#script). The combine scripts execute together in a single shell.
-It's possible to overwrite a globally defined `before_script` if you define it in a job:
+You can overwrite a globally-defined `before_script` if you define it in a job:
```yaml
default:
@@ -657,11 +660,11 @@ You can use [YAML anchors with `before_script`](#yaml-anchors-for-scripts).
Use `after_script` to define an array of commands that run after each job,
including failed jobs.
-If a job times out or is cancelled, the `after_script` commands are not executed.
-Support for executing `after_script` commands for timed-out or cancelled jobs
-[is planned](https://gitlab.com/gitlab-org/gitlab/-/issues/15603).
+If a job times out or is cancelled, the `after_script` commands do not execute.
+An [issue](https://gitlab.com/gitlab-org/gitlab/-/issues/15603) exists to support
+executing `after_script` commands for timed-out or cancelled jobs.
-Scripts specified in `after_script` are executed in a new shell, separate from any
+Scripts you specify in `after_script` execute in a new shell, separate from any
`before_script` or `script` scripts. As a result, they:
- Have a current working directory set back to the default.
@@ -694,7 +697,7 @@ You can use [YAML anchors with `after_script`](#yaml-anchors-for-scripts).
#### Script syntax
-You can use special syntax in [`script`](README.md#script) sections to:
+You can use syntax in [`script`](README.md#script) sections to:
- [Split long commands](script.md#split-long-commands) into multiline commands.
- [Use color codes](script.md#add-color-codes-to-script-output) to make job logs easier to review.
@@ -703,9 +706,19 @@ You can use special syntax in [`script`](README.md#script) sections to:
### `stage`
-`stage` is defined per-job and relies on [`stages`](#stages), which is defined
-globally. Use `stage` to define which stage a job runs in, and jobs of the same
-`stage` are executed in parallel (subject to [certain conditions](#use-your-own-runners)). For example:
+Use `stage` to define which stage a job runs in. Jobs in the same
+`stage` can execute in parallel (subject to [certain conditions](#use-your-own-runners)).
+
+Jobs without a `stage` entry use the `test` stage by default. If [`stages`](#stages)
+is not defined in the pipeline, you can use the 5 default stages, which execute in
+this order:
+
+- [`.pre`](#pre-and-post)
+- `build`
+- `test`
+- `deploy`
+- [`.post`](#pre-and-post)
+For example:
```yaml
stages:
@@ -751,45 +764,39 @@ is greater than `1`.
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/31441) in GitLab 12.4.
-The following stages are available to every pipeline:
+Use `pre` and `post` for jobs that need to run first or last in a pipeline.
-- `.pre`, which is guaranteed to always be the first stage in a pipeline.
-- `.post`, which is guaranteed to always be the last stage in a pipeline.
+- `.pre` is guaranteed to always be the first stage in a pipeline.
+- `.post` is guaranteed to always be the last stage in a pipeline.
User-defined stages are executed after `.pre` and before `.post`.
-A pipeline is not created if all jobs are in `.pre` or `.post` stages.
-
-The order of `.pre` and `.post` can't be changed, even if defined out of order in the `.gitlab-ci.yml` file.
-For example, the following are equivalent configuration:
-
-- Configured in order:
+You must have a job in at least one stage other than `.pre` or `.post`.
- ```yaml
- stages:
- - .pre
- - a
- - b
- - .post
- ```
-
-- Configured out of order:
+You can't change the order of `.pre` and `.post`, even if you define them out of order in the `.gitlab-ci.yml` file.
+For example, the following configurations are equivalent:
- ```yaml
- stages:
- - a
- - .pre
- - b
- - .post
- ```
+```yaml
+stages:
+ - .pre
+ - a
+ - b
+ - .post
+```
-- Not explicitly configured:
+```yaml
+stages:
+ - a
+ - .pre
+ - b
+ - .post
+```
- ```yaml
- stages:
- - a
- - b
- ```
+```yaml
+stages:
+ - a
+ - b
+```
### `extends`
@@ -799,7 +806,7 @@ Use `extends` to reuse configuration sections. It's an alternative to [YAML anch
and is a little more flexible and readable. You can use `extends` to reuse configuration
from [included configuration files](#use-extends-and-include-together).
-In this example, the `rspec` job uses the configuration from the `.tests` template job.
+In the following example, the `rspec` job uses the configuration from the `.tests` template job.
GitLab:
- Performs a reverse deep merge based on the keys.
@@ -870,8 +877,8 @@ In GitLab 12.0 and later, it's also possible to use multiple parents for
#### Merge details
-`extends` is able to merge hashes but not arrays.
-The algorithm used for merge is "closest scope wins", so
+You can use `extends` to merge hashes but not arrays.
+The algorithm used for merge is "closest scope wins," so
keys from the last member always override anything defined on other
levels. For example:
@@ -923,7 +930,7 @@ rspec:
- rake rspec
```
-Note that in the example above:
+In this example:
- The `variables` sections merge, but `URL: "http://docker-url.internal"` overwrites `URL: "http://my-url.internal"`.
- `tags: ['docker']` overwrites `tags: ['production']`.
@@ -935,8 +942,8 @@ Note that in the example above:
To reuse configuration from different configuration files,
combine `extends` and [`include`](#include).
-In this example, a `script` is defined in the `included.yml` file.
-Then, in the `.gitlab-ci.yml` file, you use `extends` to refer
+In the following example, a `script` is defined in the `included.yml` file.
+Then, in the `.gitlab-ci.yml` file, `extends` refers
to the contents of the `script`:
- `included.yml`:
@@ -961,11 +968,11 @@ to the contents of the `script`:
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/27863) in GitLab 12.3.
-Use the `rules` keyword to include or exclude jobs in pipelines.
+Use `rules` to include or exclude jobs in pipelines.
-Rules are evaluated *in order* until the first match. When matched, the job
+Rules are evaluated *in order* until the first match. When a match is found, the job
is either included or excluded from the pipeline, depending on the configuration.
-If included, the job also has [certain attributes](#rules-attributes)
+The job can also have [certain attributes](#rules-attributes)
added to it.
`rules` replaces [`only/except`](#onlyexcept-basic) and they can't be used together
@@ -1024,7 +1031,7 @@ The job is not added to the pipeline:
`when: always`.
- If a rule matches, and has `when: never` as the attribute.
-This example uses `if` to strictly limit when jobs run:
+The following example uses `if` to strictly limit when jobs run:
```yaml
job:
@@ -1101,7 +1108,7 @@ other pipelines, including **both** push (branch) and merge request pipelines. W
this configuration, every push to an open merge request's source branch
causes duplicated pipelines.
-There are multiple ways to avoid duplicate pipelines:
+To avoid duplicate pipelines, you can:
- Use [`workflow`](#workflow) to specify which types of pipelines
can run.
@@ -1164,12 +1171,11 @@ runs in all cases except merge requests.
#### `rules:if`
-`rules:if` clauses determine whether or not jobs are added to a pipeline by evaluating
-an `if` statement. If the `if` statement is true, the job is either included
-or excluded from a pipeline. In plain English, `if` rules can be interpreted as one of:
+Use `rules:if` clauses to specify when to add a job to a pipeline:
-- "If this rule evaluates to true, add the job" (default).
-- "If this rule evaluates to true, do not add the job" (by adding `when: never`).
+- If an `if` statement is true, add the job to the pipeline.
+- If an `if` statement is true, but it's combined with `when: never`, do not add the job to the pipeline.
+- If no `if` statements are true, do not add the job to the pipeline.
`rules:if` differs slightly from `only:variables` by accepting only a single
expression string per rule, rather than an array of them. Any set of expressions to be
@@ -1227,7 +1233,9 @@ check the value of the `$CI_PIPELINE_SOURCE` variable:
| `web` | For pipelines created by using **Run pipeline** button in the GitLab UI, from the project's **CI/CD > Pipelines** section. |
| `webide` | For pipelines created by using the [WebIDE](../../user/project/web_ide/index.md). |
-For example:
+The following example runs the job as a manual job in scheduled pipelines or in push
+pipelines (to branches or tags), with `when: on_success` (default). It does not
+add the job to any other pipeline type.
```yaml
job:
@@ -1239,11 +1247,8 @@ job:
- if: '$CI_PIPELINE_SOURCE == "push"'
```
-This example runs the job as a manual job in scheduled pipelines or in push
-pipelines (to branches or tags), with `when: on_success` (default). It does not
-add the job to any other pipeline type.
-
-Another example:
+The following example runs the job as a `when: on_success` job in [merge request pipelines](../merge_request_pipelines/index.md)
+and scheduled pipelines. It does not run in any other pipeline type.
```yaml
job:
@@ -1253,9 +1258,6 @@ job:
- if: '$CI_PIPELINE_SOURCE == "schedule"'
```
-This example runs the job as a `when: on_success` job in [merge request pipelines](../merge_request_pipelines/index.md)
-and scheduled pipelines. It does not run in any other pipeline type.
-
Other commonly used variables for `if` clauses:
- `if: $CI_COMMIT_TAG`: If changes are pushed for a tag.
@@ -1272,11 +1274,11 @@ Other commonly used variables for `if` clauses:
#### `rules:changes`
-`rules:changes` determines whether or not to add jobs to a pipeline by checking for
+Use `rules:changes` to specify when to add a job to a pipeline by checking for
changes to specific files.
-`rules: changes` works exactly the same way as [`only: changes` and `except: changes`](#onlychangesexceptchanges),
-accepting an array of paths. It's recommended to only use `rules: changes` with branch
+`rules: changes` works the same way as [`only: changes` and `except: changes`](#onlychangesexceptchanges).
+It accepts an array of paths. You should use `rules: changes` only with branch
pipelines or merge request pipelines. For example, it's common to use `rules: changes`
with merge request pipelines:
@@ -1299,7 +1301,7 @@ In this example:
- If `Dockerfile` has not changed, do not add job to any pipeline (same as `when: never`).
To use `rules: changes` with branch pipelines instead of merge request pipelines,
-change the `if:` clause in the example above to:
+change the `if:` clause in the previous example to:
```yaml
rules:
@@ -1321,7 +1323,7 @@ if there is no `if:` statement that limits the job to branch or merge request pi
> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/34272) in GitLab 13.6.
> - [Feature flag removed](https://gitlab.com/gitlab-org/gitlab/-/issues/267192) in GitLab 13.7.
-CI/CD variables can be used in `rules:changes` expressions to determine when
+You can use CI/CD variables in `rules:changes` expressions to determine when
to add jobs to a pipeline:
```yaml
@@ -1334,7 +1336,7 @@ docker build:
- $DOCKERFILES_DIR/*
```
-You can use The `$` character for both variables and paths. For example, if the
+You can use the `$` character for both variables and paths. For example, if the
`$DOCKERFILES_DIR` variable exists, its value is used. If it does not exist, the
`$` is interpreted as being part of a path.
@@ -1342,10 +1344,10 @@ You can use The `$` character for both variables and paths. For example, if the
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/24021) in GitLab 12.4.
-`exists` accepts an array of paths and matches if any of these paths exist
-as files in the repository.
+Use `exists` to run a job when certain files exist in the repository.
+You can use an array of paths.
-In this example, `job` runs if a `Dockerfile` exists anywhere in the repository:
+In the following example, `job` runs if a `Dockerfile` exists anywhere in the repository:
```yaml
job:
@@ -1378,11 +1380,11 @@ For performance reasons, GitLab matches a maximum of 10,000 `exists` patterns. A
You can use [`allow_failure: true`](#allow_failure) in `rules:` to allow a job to fail, or a manual job to
wait for action, without stopping the pipeline itself. All jobs that use `rules:` default to `allow_failure: false`
-if `allow_failure:` is not defined.
+if you do not define `allow_failure:`.
The rule-level `rules:allow_failure` option overrides the job-level
-[`allow_failure`](#allow_failure) option, and is only applied when the job is
-triggered by the particular rule.
+[`allow_failure`](#allow_failure) option, and is only applied when
+the particular rule triggers the job.
```yaml
job:
@@ -1400,7 +1402,7 @@ In this example, if the first rule matches, then the job has `when: manual` and
> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/209864) in GitLab 13.7.
> - [Feature flag removed](https://gitlab.com/gitlab-org/gitlab/-/issues/289803) in GitLab 13.10.
-You can use [`variables`](#variables) in `rules:` to define variables for specific conditions.
+Use [`variables`](#variables) in `rules:` to define variables for specific conditions.
For example:
@@ -1476,14 +1478,14 @@ to add jobs to pipelines, use [`rules`](#rules).
1. `except` defines the names of branches and tags the job does
**not** run for.
-There are a few rules that apply to the usage of job policy:
+A few rules apply to the usage of job policy:
- `only` and `except` are inclusive. If both `only` and `except` are defined
in a job specification, the ref is filtered by `only` and `except`.
- `only` and `except` can use regular expressions ([supported regexp syntax](#supported-onlyexcept-regexp-syntax)).
- `only` and `except` can specify a repository path to filter jobs for forks.
-In addition, `only` and `except` can use special keywords:
+In addition, `only` and `except` can use these keywords:
| **Value** | **Description** |
|--------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
@@ -1504,8 +1506,8 @@ Scheduled pipelines run on specific branches, so jobs configured with `only: bra
run on scheduled pipelines too. Add `except: schedules` to prevent jobs with `only: branches`
from running on scheduled pipelines.
-In the example below, `job` runs only for refs that start with `issue-`,
-whereas all branches are skipped:
+In the following example, `job` runs only for refs that start with `issue-`.
+All branches are skipped:
```yaml
job:
@@ -1517,8 +1519,8 @@ job:
- branches
```
-Pattern matching is case-sensitive by default. Use `i` flag modifier, like
-`/pattern/i` to make a pattern case-insensitive:
+Pattern matching is case-sensitive by default. Use the `i` flag modifier, like
+`/pattern/i`, to make a pattern case-insensitive:
```yaml
job:
@@ -1530,8 +1532,8 @@ job:
- branches
```
-In this example, `job` runs only for refs that are tagged, or if a build is
-explicitly requested by an API trigger or a [Pipeline Schedule](../pipelines/schedules.md):
+In the following example, `job` runs only for refs that are tagged, or if a build is
+explicitly requested by an API trigger or a [pipeline schedule](../pipelines/schedules.md):
```yaml
job:
@@ -1542,8 +1544,7 @@ job:
- schedules
```
-Use the repository path to have jobs executed only for the parent
-repository and not forks:
+To execute jobs only for the parent repository and not forks:
```yaml
job:
@@ -1554,11 +1555,11 @@ job:
- /^release/.*$/@gitlab-org/gitlab
```
-The above example runs `job` for all branches on `gitlab-org/gitlab`,
-except `master` and those with names prefixed with `release/`.
+This example runs `job` for all branches on `gitlab-org/gitlab`,
+except `master` and branches that start with `release/`.
If a job does not have an `only` rule, `only: ['branches', 'tags']` is set by
-default. If it does not have an `except` rule, it's empty.
+default. If the job does not have an `except` rule, it's empty.
For example, `job1` and `job2` are essentially the same:
@@ -1634,7 +1635,7 @@ the pipeline if the following is true:
- `(any listed refs are true) AND (any listed variables are true) AND (any listed changes are true) AND (any chosen Kubernetes status matches)`
-In the example below, the `test` job is `only` created when **all** of the following are true:
+In the following example, the `test` job is `only` created when **all** of the following are true:
- The pipeline is [scheduled](../pipelines/schedules.md) **or** runs for `master`.
- The `variables` keyword matches.
@@ -1657,7 +1658,7 @@ added if the following is true:
- `(any listed refs are true) OR (any listed variables are true) OR (any listed changes are true) OR (a chosen Kubernetes status matches)`
-In the example below, the `test` job is **not** created when **any** of the following are true:
+In the following example, the `test` job is **not** created when **any** of the following are true:
- The pipeline runs for the `master` branch.
- There are changes to the `README.md` file in the root directory of the repository.
@@ -1679,7 +1680,7 @@ test:
The `refs` strategy can take the same values as the
[simplified only/except configuration](#onlyexcept-basic).
-In the example below, the `deploy` job is created only when the
+In the following example, the `deploy` job is created only when the
pipeline is [scheduled](../pipelines/schedules.md) or runs for the `master` branch:
```yaml
@@ -1696,7 +1697,7 @@ deploy:
The `kubernetes` strategy accepts only the `active` keyword.
-In the example below, the `deploy` job is created only when the
+In the following example, the `deploy` job is created only when the
Kubernetes service is active in the project:
```yaml
@@ -1767,7 +1768,7 @@ In pipelines with [sources other than the three above](../variables/predefined_v
You can configure jobs to use `only: changes` with other `only: refs` keywords. However,
those jobs ignore the changes and always run.
-In this example, when you push commits to an existing branch, the `docker build` job
+In the following example, when you push commits to an existing branch, the `docker build` job
runs only if any of these files change:
- The `Dockerfile` file.
@@ -1864,7 +1865,7 @@ docker build service one:
- service-one/**/*
```
-In the example above, the pipeline might fail because of changes to a file in `service-one/**/*`.
+In this example, the pipeline might fail because of changes to a file in `service-one/**/*`.
A later commit that doesn't have changes in `service-one/**/*`
but does have changes to the `Dockerfile` can pass. The job
@@ -1898,13 +1899,22 @@ All files are considered to have changed when a scheduled pipeline runs.
> - In GitLab 12.3, maximum number of jobs in `needs` array raised from five to 50.
> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/30631) in GitLab 12.8, `needs: []` lets jobs start immediately.
-Use the `needs:` keyword to execute jobs out-of-order. Relationships between jobs
+Use `needs:` to execute jobs out-of-order. Relationships between jobs
that use `needs` can be visualized as a [directed acyclic graph](../directed_acyclic_graph/index.md).
You can ignore stage ordering and run some jobs without waiting for others to complete.
Jobs in multiple stages can run concurrently.
-Let's consider the following example:
+The following example creates four paths of execution:
+
+- Linter: the `lint` job runs immediately without waiting for the `build` stage
+ to complete because it has no needs (`needs: []`).
+- Linux path: the `linux:rspec` and `linux:rubocop` jobs runs as soon as the `linux:build`
+ job finishes without waiting for `mac:build` to finish.
+- macOS path: the `mac:rspec` and `mac:rubocop` jobs runs as soon as the `mac:build`
+ job finishes, without waiting for `linux:build` to finish.
+- The `production` job runs as soon as all previous jobs finish; in this case:
+ `linux:build`, `linux:rspec`, `linux:rubocop`, `mac:build`, `mac:rspec`, `mac:rubocop`.
```yaml
linux:build:
@@ -1937,20 +1947,6 @@ production:
stage: deploy
```
-This example creates four paths of execution:
-
-- Linter: the `lint` job runs immediately without waiting for the `build` stage to complete because it has no needs (`needs: []`).
-
-- Linux path: the `linux:rspec` and `linux:rubocop` jobs runs as soon
- as the `linux:build` job finishes without waiting for `mac:build` to finish.
-
-- macOS path: the `mac:rspec` and `mac:rubocop` jobs runs as soon
- as the `mac:build` job finishes, without waiting for `linux:build` to finish.
-
-- The `production` job runs as soon as all previous jobs
- finish; in this case: `linux:build`, `linux:rspec`, `linux:rubocop`,
- `mac:build`, `mac:rspec`, `mac:rubocop`.
-
#### Requirements and limitations
- If `needs:` is set to point to a job that is not instantiated
@@ -1967,7 +1963,7 @@ This example creates four paths of execution:
- `needs:` is similar to `dependencies:` in that it must use jobs from prior stages,
meaning it's impossible to create circular dependencies. Depending on jobs in the
current stage is not possible either, but support [is planned](https://gitlab.com/gitlab-org/gitlab/-/issues/30632).
-- Related to the above, stages must be explicitly defined for all jobs
+- Stages must be explicitly defined for all jobs
that have the keyword `needs:` or are referred to by one.
##### Changing the `needs:` job limit **(FREE SELF)**
@@ -1990,7 +1986,7 @@ To disable directed acyclic graphs (DAG), set the limit to `0`.
Use `artifacts: true` (default) or `artifacts: false` to control when artifacts are
downloaded in jobs that use `needs`.
-In this example, the `rspec` job downloads the `build_job` artifacts, but the
+In the following example, the `rspec` job downloads the `build_job` artifacts, but the
`rubocop` job does not:
```yaml
@@ -2013,7 +2009,7 @@ rubocop:
artifacts: false
```
-In this example, the `rspec` job downloads the artifacts from all three `build_jobs`.
+In the following example, the `rspec` job downloads the artifacts from all three `build_jobs`.
`artifacts` is:
- Set to true for `build_job_1`.
@@ -2064,7 +2060,7 @@ The user running the pipeline must have at least `reporter` access to the group
Use `needs` to download artifacts from different pipelines in the current project.
Set the `project` keyword as the current project's name, and specify a ref.
-In this example, `build_job` downloads the artifacts for the latest successful
+In the following example, `build_job` downloads the artifacts for the latest successful
`build-1` job with the `other-ref` ref:
```yaml
@@ -2153,7 +2149,7 @@ available for the project.
When you register a runner, you can specify the runner's tags, for
example `ruby`, `postgres`, `development`.
-In this example, the job is run by a runner that
+In the following example, the job is run by a runner that
has both `ruby` and `postgres` tags defined.
```yaml
@@ -2202,7 +2198,7 @@ Assuming all other jobs are successful, the job's stage and its pipeline
show the same orange warning. However, the associated commit is marked as
"passed", without warnings.
-In the example below, `job1` and `job2` run in parallel, but if `job1`
+In the following example, `job1` and `job2` run in parallel. If `job1`
fails, it doesn't stop the next stage from running, because it's marked with
`allow_failure: true`:
@@ -2269,7 +2265,12 @@ The valid values of `when` are:
- With [`rules`](#rules), don't execute job.
- With [`workflow`](#workflow), don't run pipeline.
-For example:
+In the following example, the script:
+
+1. Executes `cleanup_build_job` only when `build_job` fails.
+1. Always executes `cleanup_job` as the last step in pipeline regardless of
+ success or failure.
+1. Executes `deploy_job` when you run it manually in the GitLab UI.
```yaml
stages:
@@ -2308,13 +2309,6 @@ cleanup_job:
when: always
```
-The above script:
-
-1. Executes `cleanup_build_job` only when `build_job` fails.
-1. Always executes `cleanup_job` as the last step in pipeline regardless of
- success or failure.
-1. Executes `deploy_job` when you run it manually in the GitLab UI.
-
#### `when:manual`
A manual job is a type of job that is not executed automatically and must be explicitly
@@ -2377,7 +2371,7 @@ To protect a manual job:
```
1. In the [protected environments settings](../environments/protected_environments.md#protecting-environments),
- select the environment (`production` in the example above) and add the users, roles or groups
+ select the environment (`production` in this example) and add the users, roles or groups
that are authorized to trigger the manual job to the **Allowed to Deploy** list. Only those in
this list can trigger this manual job, as well as GitLab administrators
who are always able to use protected environments.
@@ -2670,7 +2664,7 @@ as Review Apps. You can see an example that uses Review Apps at
### `cache`
-Use the `cache` keyword to specify a list of files and directories to
+Use `cache` to specify a list of files and directories to
cache between jobs. You can only use paths that are in the local working copy.
If `cache` is defined outside the scope of jobs, it's set
@@ -2771,7 +2765,7 @@ to download cache that's tagged with `test`.
If a cache with this tag is not found, you can use `CACHE_FALLBACK_KEY` to
specify a cache to use when none exists.
-In this example, if the `$CI_COMMIT_REF_SLUG` is not found, the job uses the key defined
+In the following example, if the `$CI_COMMIT_REF_SLUG` is not found, the job uses the key defined
by the `CACHE_FALLBACK_KEY` variable:
```yaml
@@ -2808,7 +2802,7 @@ cache:
- node_modules
```
-In this example we're creating a cache for Ruby and Node.js dependencies that
+This example creates a cache for Ruby and Node.js dependencies that
is tied to current versions of the `Gemfile.lock` and `package.json` files. Whenever one of
these files changes, a new cache key is computed and a new cache is created. Any future
job runs that use the same `Gemfile.lock` and `package.json` with `cache:key:files`
@@ -2942,7 +2936,7 @@ To do so, add `policy: push` to the job.
### `artifacts`
-Use the `artifacts` keyword to specify a list of files and directories that are
+Use `artifacts` to specify a list of files and directories that are
attached to the job when it [succeeds, fails, or always](#artifactswhen).
The artifacts are sent to GitLab after the job finishes. They are
@@ -3422,7 +3416,7 @@ If `retry` is set to `2`, and a job succeeds in a second run (first retry), it i
The `retry` value must be a positive integer, from `0` to `2`
(two retries maximum, three runs in total).
-This example retries all failure cases:
+The following example retries all failure cases:
```yaml
test:
@@ -3584,7 +3578,7 @@ deploystacks:
STACK: [data, processing]
```
-This example generates 10 parallel `deploystacks` jobs, each with different values
+The following example generates 10 parallel `deploystacks` jobs, each with different values
for `PROVIDER` and `STACK`:
```plaintext
@@ -3769,6 +3763,10 @@ To force the `trigger` job to wait for the downstream (multi-project or child) p
pipeline completes. At that point, the `trigger` job completes and displays the same status as
the downstream job.
+This setting can help keep your pipeline execution linear. In the following example, jobs from
+subsequent stages wait for the triggered pipeline to successfully complete before
+starting, which reduces parallelization.
+
```yaml
trigger_job:
trigger:
@@ -3776,10 +3774,6 @@ trigger_job:
strategy: depend
```
-This setting can help keep your pipeline execution linear. In the example above, jobs from
-subsequent stages wait for the triggered pipeline to successfully complete before
-starting, which reduces parallelization.
-
#### Trigger a pipeline by API call
To force a rebuild of a specific branch, tag, or commit, you can use an API call
@@ -3806,7 +3800,12 @@ When enabled, a pipeline is immediately canceled when a new pipeline starts on t
Set jobs as interruptible that can be safely canceled once started (for instance, a build job).
-For example:
+In the following example, a new pipeline run causes an existing running pipeline to be:
+
+- Canceled, if only `step-1` is running or pending.
+- Not canceled, once `step-2` starts running.
+
+After an uninterruptible job starts running, the pipeline cannot be canceled.
```yaml
stages:
@@ -3832,13 +3831,6 @@ step-3:
interruptible: true
```
-In the example above, a new pipeline run causes an existing running pipeline to be:
-
-- Canceled, if only `step-1` is running or pending.
-- Not canceled, once `step-2` starts running.
-
-When an uninterruptible job is running, the pipeline cannot be canceled, regardless of the final job's state.
-
### `resource_group`
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/15536) in GitLab 12.7.
@@ -3885,7 +3877,7 @@ executions. The [`trigger` keyword](#trigger) can trigger downstream pipelines.
[`resource_group` keyword](#resource_group) can co-exist with it. This is useful to control the
concurrency for deployment pipelines, while running non-sensitive jobs concurrently.
-This example has two pipeline configurations in a project. When a pipeline starts running,
+The following example has two pipeline configurations in a project. When a pipeline starts running,
non-sensitive jobs are executed first and aren't affected by concurrent executions in other
pipelines. However, GitLab ensures that there are no other deployment pipelines running before
triggering a deployment (child) pipeline. If other deployment pipelines are running, GitLab waits
@@ -3926,7 +3918,7 @@ deployment:
script: echo "Deploying..."
```
-Note that you must define [`strategy: depend`](#linking-pipelines-with-triggerstrategy)
+You must define [`strategy: depend`](#linking-pipelines-with-triggerstrategy)
with the `trigger` keyword. This ensures that the lock isn't released until the downstream pipeline
finishes.
@@ -3934,7 +3926,7 @@ finishes.
> [Introduced](https://gitlab.com/gitlab-org/gitlab/merge_requests/19298) in GitLab 13.2.
-`release` indicates that the job creates a [Release](../../user/project/releases/index.md).
+Use `release` to create a [release](../../user/project/releases/index.md).
These keywords are supported:
@@ -3945,18 +3937,18 @@ These keywords are supported:
- [`milestones`](#releasemilestones) (optional)
- [`released_at`](#releasereleased_at) (optional)
-The Release is created only if the job processes without error. If the Rails API
-returns an error during Release creation, the `release` job fails.
+The release is created only if the job processes without error. If the Rails API
+returns an error during release creation, the `release` job fails.
#### `release-cli` Docker image
-The Docker image to use for the `release-cli` must be specified, using the following directive:
+You must specify the Docker image to use for the `release-cli`:
```yaml
image: registry.gitlab.com/gitlab-org/release-cli:latest
```
-#### Script
+#### `script`
All jobs except [trigger](#trigger) jobs must have the `script` keyword. A `release`
job can use the output from script commands, but you can use a placeholder script if
@@ -3989,11 +3981,12 @@ android-release:
#### `release:tag_name`
-The `tag_name` must be specified. It can refer to an existing Git tag or can be specified by the user.
+You must specify a `tag_name` for the release. The tag can refer to an existing Git tag or
+you can specify a new tag.
When the specified tag doesn't exist in the repository, a new tag is created from the associated SHA of the pipeline.
-For example, when creating a Release from a Git tag:
+For example, when creating a release from a Git tag:
```yaml
job:
@@ -4012,17 +4005,17 @@ job:
description: 'Release description'
```
-- The Release is created only if the job's main script succeeds.
-- If the Release already exists, it is not updated and the job with the `release` keyword fails.
+- The release is created only if the job's main script succeeds.
+- If the release already exists, it is not updated and the job with the `release` keyword fails.
- The `release` section executes after the `script` tag and before the `after_script`.
#### `release:name`
-The Release name. If omitted, it is populated with the value of `release: tag_name`.
+The release name. If omitted, it is populated with the value of `release: tag_name`.
#### `release:description`
-Specifies the long description of the Release. You can also specify a file that contains the
+Specifies the long description of the release. You can also specify a file that contains the
description.
##### Read description from a file
@@ -4060,8 +4053,7 @@ released_at: '2021-03-15T08:00:00Z'
#### Complete example for `release`
-Combining the individual examples given above for `release` results in the following
-code snippets. There are two options, depending on how you generate the
+If you combine the previous examples for `release`, you get two options, depending on how you generate the
tags. You can't use these options together, so choose one:
- To create a release when you push a Git tag, or when you add a Git tag
@@ -4145,7 +4137,7 @@ The entries under the `release` node are transformed into a `bash` command line
to the Docker container, which contains the [release-cli](https://gitlab.com/gitlab-org/release-cli).
You can also call the `release-cli` directly from a `script` entry.
-For example, using the YAML described above:
+For example, if you use the YAML described previously:
```shell
release-cli create --name "Release $CI_COMMIT_SHA" --description "Created using the release-cli $EXTRA_DESCRIPTION" --tag-name "v${MAJOR}.${MINOR}.${REVISION}" --ref "$CI_COMMIT_SHA" --released-at "2020-07-15T08:00:00Z" --milestone "m1" --milestone "m2" --milestone "m3"
@@ -4155,7 +4147,7 @@ release-cli create --name "Release $CI_COMMIT_SHA" --description "Created using
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/33014) in GitLab 13.4.
-`secrets` indicates the [CI/CD Secrets](../secrets/index.md) this job needs. It should be a hash,
+Use `secrets` to specify the [CI/CD Secrets](../secrets/index.md) the job needs. It should be a hash,
and the keys should be the names of the variables that are made available to the job.
The value of each secret is saved in a temporary file. This file's path is stored in these
variables.
@@ -4164,7 +4156,8 @@ variables.
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/28321) in GitLab 13.4.
-`vault` keyword specifies secrets provided by [Hashicorp's Vault](https://www.vaultproject.io/).
+Use `vault` to specify secrets provided by [Hashicorp's Vault](https://www.vaultproject.io/).
+
This syntax has multiple forms. The shortest form assumes the use of the
[KV-V2](https://www.vaultproject.io/docs/secrets/kv/kv-v2) secrets engine,
mounted at the default path `kv-v2`. The last part of the secret's path is the
@@ -4202,14 +4195,13 @@ job:
### `pages`
-`pages` is a special job that uploads static content to GitLab that
-is then published as a website. It has a special syntax, so the two
-requirements below must be met:
+Use `pages` to upload static content to GitLab. The content
+is then published as a website. You must:
-- Any static content must be placed under a `public/` directory.
-- `artifacts` with a path to the `public/` directory must be defined.
+- Place any static content in a `public/` directory.
+- Define [`artifacts`](#artifacts) with a path to the `public/` directory.
-The example below moves all files from the root of the project to the
+The following example moves all files from the root of the project to the
`public/` directory. The `.public` workaround is so `cp` does not also copy
`public/` to itself in an infinite loop:
@@ -4227,14 +4219,14 @@ pages:
- master
```
-Read more on [GitLab Pages user documentation](../../user/project/pages/index.md).
+View the [GitLab Pages user documentation](../../user/project/pages/index.md).
### `inherit`
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/207484) in GitLab 12.9.
-You can disable inheritance of globally defined defaults
-and variables with the `inherit:` keyword.
+Use `inherit:` to control inheritance of globally-defined defaults
+and variables.
To enable or disable the inheritance of all `default:` or `variables:` keywords, use:
@@ -4263,7 +4255,7 @@ inherit:
- VARIABLE2
```
-In the example below:
+In the following example:
- `rubocop`:
- inherits: Nothing.
@@ -4367,7 +4359,7 @@ You can use [YAML anchors for variables](#yaml-anchors-for-variables).
> [Introduced in](https://gitlab.com/gitlab-org/gitlab/-/issues/30101) GitLab 13.7.
-You can use the `value` and `description` keywords to define [variables that are prefilled](../pipelines/index.md#prefill-variables-in-manual-pipelines)
+Use the `value` and `description` keywords to define [variables that are prefilled](../pipelines/index.md#prefill-variables-in-manual-pipelines)
when [running a pipeline manually](../pipelines/index.md#run-a-pipeline-manually):
```yaml
@@ -4379,7 +4371,7 @@ variables:
### Configure runner behavior with variables
-You can use [CI/CD variables](../variables/README.md) to configure runner Git behavior:
+You can use [CI/CD variables](../variables/README.md) to configure how the runner processes Git requests:
- [`GIT_STRATEGY`](../runners/README.md#git-strategy)
- [`GIT_SUBMODULE_STRATEGY`](../runners/README.md#git-submodule-strategy)
@@ -4395,16 +4387,16 @@ You can use [CI/CD variables](../variables/README.md) to configure runner Git be
You can also use variables to configure how many times a runner
[attempts certain stages of job execution](../runners/README.md#job-stages-attempts).
-## Special YAML features
+## YAML-specific features
-It's possible to use special YAML features like anchors (`&`), aliases (`*`)
+In your `.gitlab-ci.yml` file, you can use YAML-specific features like anchors (`&`), aliases (`*`),
and map merging (`<<`). Use these features to reduce the complexity
of the code in the `.gitlab-ci.yml` file.
Read more about the various [YAML features](https://learnxinyminutes.com/docs/yaml/).
-In most cases, the [`extends` keyword](#extends) is more user friendly and should
-be used over these special YAML features.
+In most cases, the [`extends` keyword](#extends) is more user friendly and you should
+use it when possible.
You can use YAML anchors to merge YAML arrays.
@@ -4445,8 +4437,8 @@ test2:
```
`&` sets up the name of the anchor (`job_configuration`), `<<` means "merge the
-given hash into the current one", and `*` includes the named anchor
-(`job_configuration` again). The expanded version of the example above is:
+given hash into the current one," and `*` includes the named anchor
+(`job_configuration` again). The expanded version of this example is:
```yaml
.job_template:
@@ -4586,7 +4578,7 @@ Use [YAML anchors](#anchors) with `variables` to repeat assignment
of variables across multiple jobs. You can also use YAML anchors when a job
requires a specific `variables` block that would otherwise override the global variables.
-In the example below, we override the `GIT_STRATEGY` variable without affecting
+The following example shows how override the `GIT_STRATEGY` variable without affecting
the use of the `SAMPLE_VARIABLE` variable:
```yaml
@@ -4606,7 +4598,7 @@ job_no_git_strategy:
### Hide jobs
-If you want to temporarily 'disable' a job, rather than commenting out all the
+If you want to temporarily disable a job, rather than commenting out all the
lines where the job is defined:
```yaml
@@ -4625,7 +4617,7 @@ GitLab CI/CD. In the following example, `.hidden_job` is ignored:
```
Use this feature to ignore jobs, or use the
-[special YAML features](#special-yaml-features) and transform the hidden jobs
+[YAML-specific features](#yaml-specific-features) and transform the hidden jobs
into templates.
### `!reference` tags
@@ -4637,7 +4629,7 @@ sections and reuse it in the current section. Unlike [YAML anchors](#anchors), y
use `!reference` tags to reuse configuration from [included](#include) configuration
files as well.
-In this example, a `script` and an `after_script` from two different locations are
+In the following example, a `script` and an `after_script` from two different locations are
reused in the `test` job:
- `setup.yml`:
@@ -4666,7 +4658,7 @@ reused in the `test` job:
- !reference [.teardown, after_script]
```
-In this example, `test-vars-1` reuses the all the variables in `.vars`, while `test-vars-2`
+In the following example, `test-vars-1` reuses the all the variables in `.vars`, while `test-vars-2`
selects a specific variable and reuses it as a new `MY_VAR` variable.
```yaml
diff --git a/doc/integration/elasticsearch.md b/doc/integration/elasticsearch.md
index 164ec2d32bc..01477c33011 100644
--- a/doc/integration/elasticsearch.md
+++ b/doc/integration/elasticsearch.md
@@ -38,7 +38,7 @@ each node should have:
- [Memory](https://www.elastic.co/guide/en/elasticsearch/guide/current/hardware.html#_memory): 8 GiB (minimum).
- [CPU](https://www.elastic.co/guide/en/elasticsearch/guide/current/hardware.html#_cpus): Modern processor with multiple cores.
-- [Storage](https://www.elastic.co/guide/en/elasticsearch/guide/current/hardware.html#_disks): Use SSD storage. You will need enough storage for 50% of the total size of your Git repositories.
+- [Storage](https://www.elastic.co/guide/en/elasticsearch/guide/current/hardware.html#_disks): Use SSD storage. The total storage size of all Elasticsearch nodes is about 50% of the total size of your Git repositories. It includes one primary and one replica.
A few notes on CPU and storage:
diff --git a/doc/raketasks/backup_restore.md b/doc/raketasks/backup_restore.md
index 20131a795c5..00cf23d8ffd 100644
--- a/doc/raketasks/backup_restore.md
+++ b/doc/raketasks/backup_restore.md
@@ -1,6 +1,6 @@
---
stage: Enablement
-group: Distribution
+group: Geo
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---
diff --git a/doc/user/admin_area/settings/continuous_integration.md b/doc/user/admin_area/settings/continuous_integration.md
index f14241bb663..0756086af73 100644
--- a/doc/user/admin_area/settings/continuous_integration.md
+++ b/doc/user/admin_area/settings/continuous_integration.md
@@ -99,6 +99,13 @@ are allowed to expire.
This setting takes precedence over the [project level setting](../../../ci/pipelines/job_artifacts.md#keep-artifacts-from-most-recent-successful-jobs).
If disabled at the instance level, you cannot enable this per-project.
+To disable the setting:
+
+1. Go to **Admin Area > Settings > CI/CD**.
+1. Expand **Continuous Integration and Deployment**.
+1. Clear the **Keep the latest artifacts for all jobs in the latest successful pipelines** checkbox.
+1. Click **Save changes**
+
When you disable the feature, the latest artifacts do not immediately expire.
A new pipeline must run before the latest artifacts can expire and be deleted.