gitlab.com/gitlab-org/gitlab-foss.git

Diffstat (limited to 'doc/administration/reference_architectures')
-rw-r--r--  doc/administration/reference_architectures/10k_users.md  | 105
-rw-r--r--  doc/administration/reference_architectures/1k_users.md   |   2
-rw-r--r--  doc/administration/reference_architectures/25k_users.md  | 111
-rw-r--r--  doc/administration/reference_architectures/2k_users.md   |  44
-rw-r--r--  doc/administration/reference_architectures/3k_users.md   | 104
-rw-r--r--  doc/administration/reference_architectures/50k_users.md  | 107
-rw-r--r--  doc/administration/reference_architectures/5k_users.md   | 103
-rw-r--r--  doc/administration/reference_architectures/index.md      | 115
8 files changed, 480 insertions(+), 211 deletions(-)
diff --git a/doc/administration/reference_architectures/10k_users.md b/doc/administration/reference_architectures/10k_users.md
index fcce44f62b2..2ca79bbeae4 100644
--- a/doc/administration/reference_architectures/10k_users.md
+++ b/doc/administration/reference_architectures/10k_users.md
@@ -14,7 +14,7 @@ full list of reference architectures, see
> - **High Availability:** Yes ([Praefect](#configure-praefect-postgresql) needs a third-party PostgreSQL solution for HA)
> - **Estimated Costs:** [See cost table](index.md#cost-to-run)
> - **Cloud Native Hybrid Alternative:** [Yes](#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative)
-> - **Performance tested daily with the [GitLab Performance Tool](https://gitlab.com/gitlab-org/quality/performance)**:
+> - **Validation and test results:** The Quality Engineering team does [regular smoke and performance tests](index.md#validation-and-test-results) to ensure the reference architectures remain compliant
> - **Test requests per second (RPS) rates:** API: 200 RPS, Web: 20 RPS, Git (Pull): 20 RPS, Git (Push): 4 RPS
> - **[Latest Results](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Benchmarks/Latest/10k)**
@@ -42,7 +42,7 @@ full list of reference architectures, see
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
-5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Please [review the existing technical limitations and considerations prior to deploying Gitaly Cluster](../gitaly/index.md#guidance-regarding-gitaly-cluster). If Gitaly Sharded is desired, the same specs listed above for `Gitaly` should be used.
+5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster). If you want sharded Gitaly, use the same specs listed above for `Gitaly`.
<!-- markdownlint-enable MD029 -->
NOTE:
@@ -1157,8 +1157,8 @@ are supported and can be added if needed.
In this configuration, every Git repository is stored on every Gitaly node in the cluster, with one being designated the primary, and failover occurs automatically if the primary node goes down.
NOTE:
-Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Please [review the existing technical limitations and considerations prior to deploying Gitaly Cluster](../gitaly/index.md#guidance-regarding-gitaly-cluster).
-For implementations with Gitaly Sharded, the same Gitaly specs should be used. Follow the [separate Gitaly documentation](../gitaly/configure_gitaly.md) instead of this section.
+Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster).
+For implementations with sharded Gitaly, use the same Gitaly specs. Follow the [separate Gitaly documentation](../gitaly/configure_gitaly.md) instead of this section.
The recommended cluster setup includes the following components:
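As a rough orientation for how Praefect fronts the Gitaly nodes described above, here is a minimal `/etc/gitlab/gitlab.rb` sketch of a virtual storage mapping. The node names, addresses, and token are hypothetical placeholders, not values taken from this diff, and the key shape follows the GitLab 14.x Omnibus convention as best understood:

```ruby
# Minimal sketch, assuming three hypothetical Gitaly nodes and a placeholder token.
# Praefect groups the physical nodes under one virtual storage name ('default'),
# which the Rails nodes then treat as a single repository storage.
praefect['virtual_storages'] = {
  'default' => {
    'nodes' => {
      'gitaly-1' => { address: 'tcp://10.6.0.91:8075', token: '<praefect_internal_token>' },
      'gitaly-2' => { address: 'tcp://10.6.0.92:8075', token: '<praefect_internal_token>' },
      'gitaly-3' => { address: 'tcp://10.6.0.93:8075', token: '<praefect_internal_token>' },
    },
  },
}
```

Because every repository exists on every node in the mapping, the automatic failover described above is transparent to the application nodes.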
@@ -1455,10 +1455,12 @@ To configure the Praefect nodes, on each one:
### Configure Gitaly
The [Gitaly](../gitaly/index.md) server nodes that make up the cluster have
-requirements that are dependent on data, specifically the number of projects
-and those projects' sizes. It's recommended that a Gitaly Cluster stores
-no more than 5 TB of data on each node. Depending on your
-repository storage requirements, you may require additional Gitaly Clusters.
+requirements that are dependent on data and load.
+
+NOTE:
+The Reference Architecture specs have been designed with good headroom in mind,
+but for Gitaly, increased specs or additional
+Gitaly Cluster arrays may be required for notably large data sets or load.
Due to Gitaly having notable input and output requirements, we strongly
recommend that all Gitaly nodes use solid-state drives (SSDs). These SSDs
@@ -1531,6 +1533,26 @@ On each node:
# Recommended to be enabled for improved performance but can notably increase disk I/O
# Refer to https://docs.gitlab.com/ee/administration/gitaly/configure_gitaly.html#pack-objects-cache for more info
gitaly['pack_objects_cache_enabled'] = true
+
+ # Configure the Consul agent
+ consul['enable'] = true
+ ## Enable service discovery for Prometheus
+ consul['monitoring_service_discovery'] = true
+
+ # START user configuration
+ # Please set the real values as explained in the Required Information section
+ #
+ ## The IPs of the Consul server nodes
+ ## You can also use FQDNs and intermix them with IPs
+ consul['configuration'] = {
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
+ }
+
+ # Set the network addresses that the exporters will listen on for monitoring
+ node_exporter['listen_address'] = '0.0.0.0:9100'
+ gitaly['prometheus_listen_addr'] = '0.0.0.0:9236'
+ #
+ # END user configuration
```
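After `sudo gitlab-ctl reconfigure`, it can be worth confirming that the exporters configured above actually answer on their ports. A hedged Ruby one-off using only the standard library, assuming a hypothetical Gitaly node at `10.6.0.91`:

```ruby
# Sketch: fetch the first metrics line from node_exporter (9100) and Gitaly (9236).
# 10.6.0.91 is a placeholder address; substitute your own Gitaly node IP.
require 'net/http'

%w[9100 9236].each do |port|
  body = Net::HTTP.get(URI("http://10.6.0.91:#{port}/metrics"))
  puts "port #{port}: #{body.lines.first&.strip}"  # a HELP line indicates the exporter is up
end
```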
1. Append the following to `/etc/gitlab/gitlab.rb` for each respective server:
@@ -1768,6 +1790,7 @@ To configure the Sidekiq nodes, on each one:
# Object Storage
## This is an example for configuring Object Storage on GCP
## Replace this config with your chosen Object Storage provider as desired
+ gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
'provider' => 'Google',
'google_project' => '<gcp-project-name>',
@@ -1914,6 +1937,7 @@ On each node perform the following:
# This is an example for configuring Object Storage on GCP
# Replace this config with your chosen Object Storage provider as desired
+ gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
'provider' => 'Google',
'google_project' => '<gcp-project-name>',
@@ -2093,6 +2117,30 @@ To configure the Monitoring node:
retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13)
}
+ # Configure Prometheus to scrape services not covered by discovery
+ prometheus['scrape_configs'] = [
+ {
+ 'job_name': 'pgbouncer',
+ 'static_configs' => [
+ 'targets' => [
+ "10.6.0.31:9188",
+ "10.6.0.32:9188",
+ "10.6.0.33:9188",
+ ],
+ ],
+ },
+ {
+ 'job_name': 'praefect',
+ 'static_configs' => [
+ 'targets' => [
+ "10.6.0.131:9652",
+ "10.6.0.132:9652",
+ "10.6.0.133:9652",
+ ],
+ ],
+ },
+ ]
+
# Nginx - For Grafana access
nginx['enable'] = true
```
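One hedged way to confirm that both the Consul-discovered jobs and the static `pgbouncer`/`praefect` jobs above survive a reconfigure is to read back the rendered Prometheus configuration; `/var/opt/gitlab/prometheus/prometheus.yml` is the usual Omnibus location, though worth verifying on your install:

```ruby
# Sketch: list every scrape job Omnibus rendered, with any static targets.
require 'yaml'

config = YAML.load_file('/var/opt/gitlab/prometheus/prometheus.yml')
config['scrape_configs'].each do |job|
  targets = job.fetch('static_configs', []).flat_map { |sc| sc['targets'] || [] }
  puts format('%-20s %s', job['job_name'], targets.join(', '))
end
```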
@@ -2119,7 +2167,7 @@ GitLab has been tested on a number of object storage providers:
- [Amazon S3](https://aws.amazon.com/s3/)
- [Google Cloud Storage](https://cloud.google.com/storage)
- [Digital Ocean Spaces](https://www.digitalocean.com/products/spaces/)
- [Oracle Cloud Infrastructure](https://docs.cloud.oracle.com/en-us/iaas/Content/Object/Tasks/s3compatibleapi.htm)
- [OpenStack Swift](https://docs.openstack.org/swift/latest/s3_compat.html)
- [Azure Blob storage](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction)
@@ -2136,7 +2184,8 @@ There are two ways of specifying object storage configuration in GitLab:
Starting with GitLab 13.2, consolidated object storage configuration is available. It simplifies your GitLab configuration since the connection details are shared across object types. Refer to the [Consolidated object storage configuration](../object_storage.md#consolidated-object-storage-configuration) guide for instructions on how to set it up.
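As a minimal sketch of the consolidated form in `/etc/gitlab/gitlab.rb`, assuming GCP as in the examples above and hypothetical bucket names:

```ruby
# Sketch: one shared connection block replaces per-object-type connection settings.
# Bucket names are hypothetical; the GCP placeholders mirror the examples above.
gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
  'provider' => 'Google',
  'google_project' => '<gcp-project-name>',
  'google_json_key_location' => '<path-to-gcp-service-account-key>'
}
gitlab_rails['object_store']['objects']['artifacts']['bucket'] = 'gitlab-artifacts'
gitlab_rails['object_store']['objects']['lfs']['bucket'] = 'gitlab-lfs'
gitlab_rails['object_store']['objects']['uploads']['bucket'] = 'gitlab-uploads'
```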
GitLab Runner returns job logs in chunks, which Omnibus GitLab caches temporarily on disk in `/var/opt/gitlab/gitlab-ci/builds` by default, even when using consolidated object storage. With the default configuration, this directory must be shared via NFS across all GitLab Rails and Sidekiq nodes.
-In GitLab 13.6 and later, it's recommended to switch to [Incremental logging](../job_logs.md#incremental-logging-architecture), which uses Redis instead of disk space for temporary caching of job logs.
+
+In GitLab 13.6 and later, it's also recommended to switch to [Incremental logging](../job_logs.md#incremental-logging-architecture), which uses Redis instead of disk space for temporary caching of job logs. This is required when no NFS node has been deployed.
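Incremental logging is toggled through a feature flag from the Rails console (`sudo gitlab-rails console`); a brief sketch, on the understanding that the flag name matches the one documented for this era:

```ruby
# Sketch: switch job log caching from disk to Redis via the feature flag.
Feature.enable(:ci_enable_live_trace)
Feature.enabled?(:ci_enable_live_trace)  # => true once the flag is active
```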
For configuring object storage in GitLab 13.1 and earlier, or for storage types not
supported by consolidated configuration form, refer to the following guides based
@@ -2236,11 +2285,11 @@ use Google Cloud's Kubernetes Engine (GKE) and associated machine types, but the
and CPU requirements should translate to most other providers. We hope to update this in the
future with further specific cloud provider details.
-| Service | Nodes | Configuration | GCP | Allocatable CPUs and Memory |
-|-------------------------------------------------------|-------|-------------------------|------------------|-----------------------------|
-| Webservice | 4 | 32 vCPU, 28.8 GB memory | `n1-highcpu-32` | 127.5 vCPU, 118 GB memory |
-| Sidekiq | 4 | 4 vCPU, 15 GB memory | `n1-standard-4` | 15.5 vCPU, 50 GB memory |
-| Supporting services such as NGINX, Prometheus | 2 | 4 vCPU, 15 GB memory | `n1-standard-4` | 7.75 vCPU, 25 GB memory |
+| Service | Nodes | Configuration | GCP | AWS | Min Allocatable CPUs and Memory |
+|-----------------------------------------------|-------|-------------------------|-----------------|--------------|---------------------------------|
+| Webservice | 4 | 32 vCPU, 28.8 GB memory | `n1-highcpu-32` | `c5.9xlarge` | 127.5 vCPU, 118 GB memory |
+| Sidekiq | 4 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` | 15.5 vCPU, 50 GB memory |
+| Supporting services such as NGINX, Prometheus | 2 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` | 7.75 vCPU, 25 GB memory |
- For this setup, we **recommend** and regularly [test](index.md#validation-and-test-results)
[Google Kubernetes Engine (GKE)](https://cloud.google.com/kubernetes-engine) and [Amazon Elastic Kubernetes Service (EKS)](https://aws.amazon.com/eks/). Other Kubernetes services may also work, but your mileage may vary.
@@ -2250,18 +2299,18 @@ future with further specific cloud provider details.
Next are the backend components that run on static compute VMs via Omnibus (or External PaaS
services where applicable):
-| Service | Nodes | Configuration | GCP |
-|-----------------------------------------------------|-------|-------------------------|------------------|
-| Consul<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| PostgreSQL<sup>1</sup> | 3 | 8 vCPU, 30 GB memory | `n1-standard-8` |
-| PgBouncer<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Internal load balancing node<sup>3</sup> | 1 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Redis/Sentinel - Cache<sup>2</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` |
-| Redis/Sentinel - Persistent<sup>2</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` |
-| Gitaly<sup>5</sup> | 3 | 16 vCPU, 60 GB memory | `n1-standard-16` |
-| Praefect<sup>5</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Praefect PostgreSQL<sup>1</sup> | 1+ | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Object storage<sup>4</sup> | n/a | n/a | n/a |
+| Service | Nodes | Configuration | GCP | AWS |
+|------------------------------------------|-------|-----------------------|------------------|--------------|
+| Consul<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| PostgreSQL<sup>1</sup> | 3 | 8 vCPU, 30 GB memory | `n1-standard-8` | `m5.2xlarge` |
+| PgBouncer<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Internal load balancing node<sup>3</sup> | 1 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Redis/Sentinel - Cache<sup>2</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` |
+| Redis/Sentinel - Persistent<sup>2</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` |
+| Gitaly<sup>5</sup> | 3 | 16 vCPU, 60 GB memory | `n1-standard-16` | `m5.4xlarge` |
+| Praefect<sup>5</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Praefect PostgreSQL<sup>1</sup> | 1+ | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Object storage<sup>4</sup> | n/a | n/a | n/a | n/a |
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
@@ -2269,7 +2318,7 @@ services where applicable):
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
-5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Please [review the existing technical limitations and considerations prior to deploying Gitaly Cluster](../gitaly/index.md#guidance-regarding-gitaly-cluster). If Gitaly Sharded is desired, the same specs listed above for `Gitaly` should be used.
+5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster). If you want sharded Gitaly, use the same specs listed above for `Gitaly`.
<!-- markdownlint-enable MD029 -->
NOTE:
diff --git a/doc/administration/reference_architectures/1k_users.md b/doc/administration/reference_architectures/1k_users.md
index 0d0e7681ffd..7213a1eb92b 100644
--- a/doc/administration/reference_architectures/1k_users.md
+++ b/doc/administration/reference_architectures/1k_users.md
@@ -21,7 +21,7 @@ many organizations.
> - **Estimated Costs:** [See cost table](index.md#cost-to-run)
> - **Cloud Native Hybrid:** No. For a cloud native hybrid environment, you
> can follow a [modified hybrid reference architecture](#cloud-native-hybrid-reference-architecture-with-helm-charts).
-> - **Performance tested daily with the [GitLab Performance Tool (GPT)](https://gitlab.com/gitlab-org/quality/performance)**:
+> - **Validation and test results:** The Quality Engineering team does [regular smoke and performance tests](index.md#validation-and-test-results) to ensure the reference architectures remain compliant
> - **Test requests per second (RPS) rates:** API: 20 RPS, Web: 2 RPS, Git (Pull): 2 RPS, Git (Push): 1 RPS
> - **[Latest Results](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Benchmarks/Latest/1k)**
diff --git a/doc/administration/reference_architectures/25k_users.md b/doc/administration/reference_architectures/25k_users.md
index c08fe985b40..2a1d344508e 100644
--- a/doc/administration/reference_architectures/25k_users.md
+++ b/doc/administration/reference_architectures/25k_users.md
@@ -14,7 +14,7 @@ full list of reference architectures, see
> - **High Availability:** Yes ([Praefect](#configure-praefect-postgresql) needs a third-party PostgreSQL solution for HA)
> - **Estimated Costs:** [See cost table](index.md#cost-to-run)
> - **Cloud Native Hybrid Alternative:** [Yes](#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative)
-> - **Performance tested weekly with the [GitLab Performance Tool (GPT)](https://gitlab.com/gitlab-org/quality/performance)**:
+> - **Validation and test results:** The Quality Engineering team does [regular smoke and performance tests](index.md#validation-and-test-results) to ensure the reference architectures remain compliant
> - **Test requests per second (RPS) rates:** API: 500 RPS, Web: 50 RPS, Git (Pull): 50 RPS, Git (Push): 10 RPS
> - **[Latest Results](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Benchmarks/Latest/25k)**
@@ -24,7 +24,7 @@ full list of reference architectures, see
| Consul<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` | `F2s v2` |
| PostgreSQL<sup>1</sup> | 3 | 16 vCPU, 60 GB memory | `n1-standard-16` | `m5.4xlarge` | `D16s v3` |
| PgBouncer<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` | `F2s v2` |
-| Internal load balancing node<sup>3</sup> | 1 | 4 vCPU, 3.6GB memory | `n1-highcpu-4` | `c5.large` | `F2s v2` |
+| Internal load balancing node<sup>3</sup> | 1 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` | `c5.xlarge` | `F4s v2` |
| Redis/Sentinel - Cache<sup>2</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` | `D4s v3` |
| Redis/Sentinel - Persistent<sup>2</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` | `D4s v3` |
| Gitaly<sup>5</sup> | 3 | 32 vCPU, 120 GB memory | `n1-standard-32` | `m5.8xlarge` | `D32s v3` |
@@ -42,7 +42,7 @@ full list of reference architectures, see
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
-5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Please [review the existing technical limitations and considerations prior to deploying Gitaly Cluster](../gitaly/index.md#guidance-regarding-gitaly-cluster). If Gitaly Sharded is desired, the same specs listed above for `Gitaly` should be used.
+5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster). If you want sharded Gitaly, use the same specs listed above for `Gitaly`.
<!-- markdownlint-enable MD029 -->
NOTE:
@@ -86,7 +86,7 @@ card "Database" as database {
card "redis" as redis {
collections "**Redis Persistent** x3" as redis_persistent #FF6347
collections "**Redis Cache** x3" as redis_cache #FF6347
-
+
redis_cache -[hidden]-> redis_persistent
}
@@ -1163,8 +1163,8 @@ fault tolerant solution for storing Git repositories.
In this configuration, every Git repository is stored on every Gitaly node in the cluster, with one being designated the primary, and failover occurs automatically if the primary node goes down.
NOTE:
-Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Please [review the existing technical limitations and considerations prior to deploying Gitaly Cluster](../gitaly/index.md#guidance-regarding-gitaly-cluster).
-For implementations with Gitaly Sharded, the same Gitaly specs should be used. Follow the [separate Gitaly documentation](../gitaly/configure_gitaly.md) instead of this section.
+Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster).
+For implementations with sharded Gitaly, use the same Gitaly specs. Follow the [separate Gitaly documentation](../gitaly/configure_gitaly.md) instead of this section.
The recommended cluster setup includes the following components:
@@ -1459,10 +1459,12 @@ the file of the same name on this server. If this is the first Omnibus node you
### Configure Gitaly
The [Gitaly](../gitaly/index.md) server nodes that make up the cluster have
-requirements that are dependent on data, specifically the number of projects
-and those projects' sizes. It's recommended that a Gitaly Cluster stores
-no more than 5 TB of data on each node. Depending on your
-repository storage requirements, you may require additional Gitaly Clusters.
+requirements that are dependent on data and load.
+
+NOTE:
+The Reference Architecture specs have been designed with good headroom in mind,
+but for Gitaly, increased specs or additional
+Gitaly Cluster arrays may be required for notably large data sets or load.
Due to Gitaly having notable input and output requirements, we strongly
recommend that all Gitaly nodes use solid-state drives (SSDs). These SSDs
@@ -1535,6 +1537,26 @@ On each node:
# Recommended to be enabled for improved performance but can notably increase disk I/O
# Refer to https://docs.gitlab.com/ee/administration/gitaly/configure_gitaly.html#pack-objects-cache for more info
gitaly['pack_objects_cache_enabled'] = true
+
+ # Configure the Consul agent
+ consul['enable'] = true
+ ## Enable service discovery for Prometheus
+ consul['monitoring_service_discovery'] = true
+
+ # START user configuration
+ # Please set the real values as explained in the Required Information section
+ #
+ ## The IPs of the Consul server nodes
+ ## You can also use FQDNs and intermix them with IPs
+ consul['configuration'] = {
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
+ }
+
+ # Set the network addresses that the exporters will listen on for monitoring
+ node_exporter['listen_address'] = '0.0.0.0:9100'
+ gitaly['prometheus_listen_addr'] = '0.0.0.0:9236'
+ #
+ # END user configuration
```
1. Append the following to `/etc/gitlab/gitlab.rb` for each respective server:
@@ -1772,6 +1794,7 @@ To configure the Sidekiq nodes, on each one:
# Object Storage
# This is an example for configuring Object Storage on GCP
# Replace this config with your chosen Object Storage provider as desired
+ gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
'provider' => 'Google',
'google_project' => '<gcp-project-name>',
@@ -1920,6 +1943,7 @@ On each node perform the following:
# This is an example for configuring Object Storage on GCP
# Replace this config with your chosen Object Storage provider as desired
+ gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
'provider' => 'Google',
'google_project' => '<gcp-project-name>',
@@ -2098,6 +2122,30 @@ To configure the Monitoring node:
retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13)
}
+ # Configure Prometheus to scrape services not covered by discovery
+ prometheus['scrape_configs'] = [
+ {
+ 'job_name': 'pgbouncer',
+ 'static_configs' => [
+ 'targets' => [
+ "10.6.0.31:9188",
+ "10.6.0.32:9188",
+ "10.6.0.33:9188",
+ ],
+ ],
+ },
+ {
+ 'job_name': 'praefect',
+ 'static_configs' => [
+ 'targets' => [
+ "10.6.0.131:9652",
+ "10.6.0.132:9652",
+ "10.6.0.133:9652",
+ ],
+ ],
+ },
+ ]
+
# Nginx - For Grafana access
nginx['enable'] = true
```
@@ -2123,7 +2171,7 @@ GitLab has been tested on a number of object storage providers:
- [Amazon S3](https://aws.amazon.com/s3/)
- [Google Cloud Storage](https://cloud.google.com/storage)
- [Digital Ocean Spaces](https://www.digitalocean.com/products/spaces/)
- [Oracle Cloud Infrastructure](https://docs.cloud.oracle.com/en-us/iaas/Content/Object/Tasks/s3compatibleapi.htm)
- [OpenStack Swift](https://docs.openstack.org/swift/latest/s3_compat.html)
- [Azure Blob storage](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction)
@@ -2140,7 +2188,8 @@ There are two ways of specifying object storage configuration in GitLab:
Starting with GitLab 13.2, consolidated object storage configuration is available. It simplifies your GitLab configuration since the connection details are shared across object types. Refer to the [Consolidated object storage configuration](../object_storage.md#consolidated-object-storage-configuration) guide for instructions on how to set it up.
GitLab Runner returns job logs in chunks, which Omnibus GitLab caches temporarily on disk in `/var/opt/gitlab/gitlab-ci/builds` by default, even when using consolidated object storage. With the default configuration, this directory must be shared via NFS across all GitLab Rails and Sidekiq nodes.
-In GitLab 13.6 and later, it's recommended to switch to [Incremental logging](../job_logs.md#incremental-logging-architecture), which uses Redis instead of disk space for temporary caching of job logs.
+
+In GitLab 13.6 and later, it's also recommended to switch to [Incremental logging](../job_logs.md#incremental-logging-architecture), which uses Redis instead of disk space for temporary caching of job logs. This is required when no NFS node has been deployed.
For configuring object storage in GitLab 13.1 and earlier, or for storage types not
supported by consolidated configuration form, refer to the following guides based
@@ -2234,11 +2283,11 @@ use Google Cloud's Kubernetes Engine (GKE) and associated machine types, but the
and CPU requirements should translate to most other providers. We hope to update this in the
future with further specific cloud provider details.
-| Service | Nodes | Configuration | GCP | Allocatable CPUs and Memory |
-|-------------------------------------------------------|-------|-------------------------|------------------|-----------------------------|
-| Webservice | 7 | 32 vCPU, 28.8 GB memory | `n1-highcpu-32` | 223 vCPU, 206.5 GB memory |
-| Sidekiq | 4 | 4 vCPU, 15 GB memory | `n1-standard-4` | 15.5 vCPU, 50 GB memory |
-| Supporting services such as NGINX, Prometheus | 2 | 4 vCPU, 15 GB memory | `n1-standard-4` | 7.75 vCPU, 25 GB memory |
+| Service | Nodes | Configuration | GCP | AWS | Min Allocatable CPUs and Memory |
+|-----------------------------------------------|-------|-------------------------|-----------------|-----------------|---------------------------------|
+| Webservice | 7 | 32 vCPU, 28.8 GB memory | `n1-highcpu-32` | `c5.9xlarge` | 223 vCPU, 206.5 GB memory |
+| Sidekiq | 4 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` | 15.5 vCPU, 50 GB memory |
+| Supporting services such as NGINX, Prometheus | 2 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` | 7.75 vCPU, 25 GB memory |
- For this setup, we **recommend** and regularly [test](index.md#validation-and-test-results)
[Google Kubernetes Engine (GKE)](https://cloud.google.com/kubernetes-engine) and [Amazon Elastic Kubernetes Service (EKS)](https://aws.amazon.com/eks/). Other Kubernetes services may also work, but your mileage may vary.
@@ -2248,18 +2297,18 @@ future with further specific cloud provider details.
Next are the backend components that run on static compute VMs via Omnibus (or External PaaS
services where applicable):
-| Service | Nodes | Configuration | GCP |
-|-----------------------------------------------------|-------|-------------------------|------------------|
-| Consul<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| PostgreSQL<sup>1</sup> | 3 | 16 vCPU, 60 GB memory | `n1-standard-16` |
-| PgBouncer<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Internal load balancing node<sup>3</sup> | 1 | 4 vCPU, 3.6GB memory | `n1-highcpu-4` |
-| Redis/Sentinel - Cache<sup>2</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` |
-| Redis/Sentinel - Persistent<sup>2</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` |
-| Gitaly<sup>5</sup> | 3 | 32 vCPU, 120 GB memory | `n1-standard-32` |
-| Praefect<sup>5</sup> | 3 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` |
-| Praefect PostgreSQL<sup>1</sup> | 1+ | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Object storage<sup>4</sup> | n/a | n/a | n/a |
+| Service | Nodes | Configuration | GCP | AWS |
+|------------------------------------------|-------|------------------------|------------------|--------------|
+| Consul<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| PostgreSQL<sup>1</sup> | 3 | 16 vCPU, 60 GB memory | `n1-standard-16` | `m5.4xlarge` |
+| PgBouncer<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Internal load balancing node<sup>3</sup> | 1 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` | `c5.xlarge` |
+| Redis/Sentinel - Cache<sup>2</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` |
+| Redis/Sentinel - Persistent<sup>2</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` |
+| Gitaly<sup>5</sup> | 3 | 32 vCPU, 120 GB memory | `n1-standard-32` | `m5.8xlarge` |
+| Praefect<sup>5</sup> | 3 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` | `c5.xlarge` |
+| Praefect PostgreSQL<sup>1</sup> | 1+ | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Object storage<sup>4</sup> | n/a | n/a | n/a | n/a |
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
@@ -2267,7 +2316,7 @@ services where applicable):
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
-5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Please [review the existing technical limitations and considerations prior to deploying Gitaly Cluster](../gitaly/index.md#guidance-regarding-gitaly-cluster). If Gitaly Sharded is desired, the same specs listed above for `Gitaly` should be used.
+5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster). If you want sharded Gitaly, use the same specs listed above for `Gitaly`.
<!-- markdownlint-enable MD029 -->
NOTE:
@@ -2312,7 +2361,7 @@ card "Database" as database {
card "redis" as redis {
collections "**Redis Persistent** x3" as redis_persistent #FF6347
collections "**Redis Cache** x3" as redis_cache #FF6347
-
+
redis_cache -[hidden]-> redis_persistent
}
diff --git a/doc/administration/reference_architectures/2k_users.md b/doc/administration/reference_architectures/2k_users.md
index 6f6c02c309a..f417d1e2ab5 100644
--- a/doc/administration/reference_architectures/2k_users.md
+++ b/doc/administration/reference_architectures/2k_users.md
@@ -13,9 +13,9 @@ For a full list of reference architectures, see
> - **Supported users (approximate):** 2,000
> - **High Availability:** No. For a highly-available environment, you can
> follow a modified [3K reference architecture](3k_users.md#supported-modifications-for-lower-user-counts-ha).
-> - **Estimated Costs:** [See cost table](index.md#cost-to-run)
+> - **Estimated Costs:** [See cost table](index.md#cost-to-run)
> - **Cloud Native Hybrid:** [Yes](#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative)
-> - **Performance tested daily with the [GitLab Performance Tool (GPT)](https://gitlab.com/gitlab-org/quality/performance)**:
+> - **Validation and test results:** The Quality Engineering team does [regular smoke and performance tests](index.md#validation-and-test-results) to ensure the reference architectures remain compliant
> - **Test requests per second (RPS) rates:** API: 40 RPS, Web: 4 RPS, Git (Pull): 4 RPS, Git (Push): 1 RPS
> - **[Latest Results](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Benchmarks/Latest/2k)**
@@ -397,11 +397,13 @@ are supported and can be added if needed.
## Configure Gitaly
-[Gitaly](../gitaly/index.md) server node requirements are dependent on data,
-specifically the number of projects and those projects' sizes. It's recommended
-that a Gitaly server node stores no more than 5TB of data. Although this
-reference architecture includes a single Gitaly server node, you may require
-additional nodes depending on your repository storage requirements.
+[Gitaly](../gitaly/index.md) server node requirements are dependent on data size,
+specifically the number of projects and those projects' sizes.
+
+NOTE:
+The Reference Architecture specs have been designed with good headroom in mind,
+but for Gitaly, increased specs or switching to Gitaly Cluster
+may be required for notably large data sets or load.
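For orientation, the application nodes reach a single external Gitaly like this one through `git_data_dirs`; a minimal hedged sketch in which the address and token are placeholders:

```ruby
# Sketch: point Rails at one external Gitaly node (placeholder address and token).
git_data_dirs({
  'default' => {
    'gitaly_address' => 'tcp://10.6.0.51:8075',
    'gitaly_token' => '<gitaly_token>',
  },
})
```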
Due to Gitaly having notable input and output requirements, we strongly
recommend that all Gitaly nodes use solid-state drives (SSDs). These SSDs
@@ -665,6 +667,7 @@ On each node perform the following:
# Object Storage
# This is an example for configuring Object Storage on GCP
# Replace this config with your chosen Object Storage provider as desired
+ gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
'provider' => 'Google',
'google_project' => '<gcp-project-name>',
@@ -886,7 +889,7 @@ GitLab has been tested on a number of object storage providers:
- [Amazon S3](https://aws.amazon.com/s3/)
- [Google Cloud Storage](https://cloud.google.com/storage)
- [Digital Ocean Spaces](https://www.digitalocean.com/products/spaces/)
- [Oracle Cloud Infrastructure](https://docs.cloud.oracle.com/en-us/iaas/Content/Object/Tasks/s3compatibleapi.htm)
- [OpenStack Swift](https://docs.openstack.org/swift/latest/s3_compat.html)
- [Azure Blob storage](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction)
@@ -903,7 +906,8 @@ There are two ways of specifying object storage configuration in GitLab:
Starting with GitLab 13.2, consolidated object storage configuration is available. It simplifies your GitLab configuration since the connection details are shared across object types. Refer to the [Consolidated object storage configuration](../object_storage.md#consolidated-object-storage-configuration) guide for instructions on how to set it up.
GitLab Runner returns job logs in chunks, which Omnibus GitLab caches temporarily on disk in `/var/opt/gitlab/gitlab-ci/builds` by default, even when using consolidated object storage. With the default configuration, this directory must be shared via NFS across all GitLab Rails and Sidekiq nodes.
-In GitLab 13.6 and later, it's recommended to switch to [Incremental logging](../job_logs.md#incremental-logging-architecture), which uses Redis instead of disk space for temporary caching of job logs.
+
+In GitLab 13.6 and later, it's also recommended to switch to [Incremental logging](../job_logs.md#incremental-logging-architecture), which uses Redis instead of disk space for temporary caching of job logs. This is required when no NFS node has been deployed.
For configuring object storage in GitLab 13.1 and earlier, or for storage types not
supported by consolidated configuration form, refer to the following guides based
@@ -1000,11 +1004,11 @@ use Google Cloud's Kubernetes Engine (GKE) and associated machine types, but the
and CPU requirements should translate to most other providers. We hope to update this in the
future with further specific cloud provider details.
-| Service | Nodes | Configuration | GCP | Allocatable CPUs and Memory |
-|-------------------------------------------------------|-------|------------------------|-----------------|-----------------------------|
-| Webservice | 3 | 8 vCPU, 7.2 GB memory | `n1-highcpu-8` | 23.7 vCPU, 16.9 GB memory |
-| Sidekiq | 2 | 2 vCPU, 7.5 GB memory | `n1-standard-2` | 3.9 vCPU, 11.8 GB memory |
-| Supporting services such as NGINX, Prometheus | 2 | 1 vCPU, 3.75 GB memory | `n1-standard-1` | 1.9 vCPU, 5.5 GB memory |
+| Service | Nodes | Configuration | GCP | AWS | Min Allocatable CPUs and Memory |
+|-----------------------------------------------|-------|------------------------|-----------------|--------------|---------------------------------|
+| Webservice | 3 | 8 vCPU, 7.2 GB memory | `n1-highcpu-8` | `c5.2xlarge` | 23.7 vCPU, 16.9 GB memory |
+| Sidekiq | 2 | 2 vCPU, 7.5 GB memory | `n1-standard-2` | `m5.large` | 3.9 vCPU, 11.8 GB memory |
+| Supporting services such as NGINX, Prometheus | 2 | 1 vCPU, 3.75 GB memory | `n1-standard-1` | `m5.large` | 1.9 vCPU, 5.5 GB memory |
- For this setup, we **recommend** and regularly [test](index.md#validation-and-test-results)
[Google Kubernetes Engine (GKE)](https://cloud.google.com/kubernetes-engine) and [Amazon Elastic Kubernetes Service (EKS)](https://aws.amazon.com/eks/). Other Kubernetes services may also work, but your mileage may vary.
@@ -1014,12 +1018,12 @@ future with further specific cloud provider details.
Next are the backend components that run on static compute VMs via Omnibus (or External PaaS
services where applicable):
-| Service | Nodes | Configuration | GCP |
-|--------------------------------------------|-------|-------------------------|------------------|
-| PostgreSQL<sup>1</sup> | 1 | 2 vCPU, 7.5 GB memory | `n1-standard-2` |
-| Redis<sup>2</sup> | 1 | 1 vCPU, 3.75 GB memory | `n1-standard-1` |
-| Gitaly | 1 | 4 vCPU, 15 GB memory | `n1-standard-4` |
-| Object storage<sup>3</sup> | n/a | n/a | n/a |
+| Service | Nodes | Configuration | GCP | AWS |
+|----------------------------|-------|------------------------|-----------------|-------------|
+| PostgreSQL<sup>1</sup> | 1 | 2 vCPU, 7.5 GB memory | `n1-standard-2` | `m5.large` |
+| Redis<sup>2</sup> | 1 | 1 vCPU, 3.75 GB memory | `n1-standard-1` | `m5.large` |
+| Gitaly | 1 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` |
+| Object storage<sup>3</sup> | n/a | n/a | n/a | n/a |
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
diff --git a/doc/administration/reference_architectures/3k_users.md b/doc/administration/reference_architectures/3k_users.md
index 76f81e65580..9d8be3e90b6 100644
--- a/doc/administration/reference_architectures/3k_users.md
+++ b/doc/administration/reference_architectures/3k_users.md
@@ -24,7 +24,7 @@ For a full list of reference architectures, see
> - **High Availability:** Yes, although [Praefect](#configure-praefect-postgresql) needs a third-party PostgreSQL solution
> - **Estimated Costs:** [See cost table](index.md#cost-to-run)
> - **Cloud Native Hybrid Alternative:** [Yes](#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative)
-> - **Performance tested weekly with the [GitLab Performance Tool (GPT)](https://gitlab.com/gitlab-org/quality/performance)**:
+> - **Validation and test results:** The Quality Engineering team does [regular smoke and performance tests](index.md#validation-and-test-results) to ensure the reference architectures remain compliant
> - **Test requests per second (RPS) rates:** API: 60 RPS, Web: 6 RPS, Git (Pull): 6 RPS, Git (Push): 1 RPS
> - **[Latest Results](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Benchmarks/Latest/3k)**
@@ -51,7 +51,7 @@ For a full list of reference architectures, see
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
-5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Please [review the existing technical limitations and considerations prior to deploying Gitaly Cluster](../gitaly/index.md#guidance-regarding-gitaly-cluster). If Gitaly Sharded is desired, the same specs listed above for `Gitaly` should be used.
+5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster). If you want sharded Gitaly, use the same specs listed above for `Gitaly`.
<!-- markdownlint-enable MD029 -->
NOTE:
@@ -1104,8 +1104,8 @@ The following IPs will be used as an example:
In this configuration, every Git repository is stored on every Gitaly node in the cluster, with one being designated the primary, and failover occurs automatically if the primary node goes down.
NOTE:
-Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Please [review the existing technical limitations and considerations prior to deploying Gitaly Cluster](../gitaly/index.md#guidance-regarding-gitaly-cluster).
-For implementations with Gitaly Sharded, the same Gitaly specs should be used. Follow the [separate Gitaly documentation](../gitaly/configure_gitaly.md) instead of this section.
+Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster).
+For implementations with sharded Gitaly, use the same Gitaly specs. Follow the [separate Gitaly documentation](../gitaly/configure_gitaly.md) instead of this section.
The recommended cluster setup includes the following components:
@@ -1399,10 +1399,12 @@ the file of the same name on this server. If this is the first Omnibus node you
### Configure Gitaly
The [Gitaly](../gitaly/index.md) server nodes that make up the cluster have
-requirements that are dependent on data, specifically the number of projects
-and those projects' sizes. It's recommended that a Gitaly Cluster stores
-no more than 5 TB of data on each node. Depending on your
-repository storage requirements, you may require additional Gitaly Clusters.
+requirements that are dependent on data and load.
+
+NOTE:
+The Reference Architecture specs have been designed with good headroom in mind,
+but for Gitaly, increased specs or additional
+Gitaly Cluster arrays may be required for notably large data sets or load.
Due to Gitaly having notable input and output requirements, we strongly
recommend that all Gitaly nodes use solid-state drives (SSDs). These SSDs
@@ -1475,6 +1477,26 @@ On each node:
# Recommended to be enabled for improved performance but can notably increase disk I/O
# Refer to https://docs.gitlab.com/ee/administration/gitaly/configure_gitaly.html#pack-objects-cache for more info
gitaly['pack_objects_cache_enabled'] = true
+
+ # Configure the Consul agent
+ consul['enable'] = true
+ ## Enable service discovery for Prometheus
+ consul['monitoring_service_discovery'] = true
+
+ # START user configuration
+ # Please set the real values as explained in the Required Information section
+ #
+ ## The IPs of the Consul server nodes
+ ## You can also use FQDNs and intermix them with IPs
+ consul['configuration'] = {
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
+ }
+
+ # Set the network addresses that the exporters will listen on for monitoring
+ node_exporter['listen_address'] = '0.0.0.0:9100'
+ gitaly['prometheus_listen_addr'] = '0.0.0.0:9236'
+ #
+ # END user configuration
```
1. Append the following to `/etc/gitlab/gitlab.rb` for each respective server:
@@ -1696,6 +1718,7 @@ To configure the Sidekiq nodes, one each one:
# Object Storage
## This is an example for configuring Object Storage on GCP
## Replace this config with your chosen Object Storage provider as desired
+ gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
'provider' => 'Google',
'google_project' => '<gcp-project-name>',
@@ -1882,6 +1905,7 @@ On each node perform the following:
# Object storage
# This is an example for configuring Object Storage on GCP
# Replace this config with your chosen Object Storage provider as desired
+ gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
'provider' => 'Google',
'google_project' => '<gcp-project-name>',
@@ -2018,6 +2042,30 @@ running [Prometheus](../monitoring/prometheus/index.md) and
retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13)
}
+ # Configure Prometheus to scrape services not covered by discovery
+ prometheus['scrape_configs'] = [
+ {
+ 'job_name': 'pgbouncer',
+ 'static_configs' => [
+ 'targets' => [
+ "10.6.0.21:9188",
+ "10.6.0.22:9188",
+ "10.6.0.23:9188",
+ ],
+ ],
+ },
+ {
+ 'job_name': 'praefect',
+ 'static_configs' => [
+ 'targets' => [
+ "10.6.0.131:9652",
+ "10.6.0.132:9652",
+ "10.6.0.133:9652",
+ ],
+ ],
+ },
+ ]
+
# Nginx - For Grafana access
nginx['enable'] = true
```
@@ -2058,7 +2106,7 @@ GitLab has been tested on a number of object storage providers:
- [Amazon S3](https://aws.amazon.com/s3/)
- [Google Cloud Storage](https://cloud.google.com/storage)
- [Digital Ocean Spaces](https://www.digitalocean.com/products/spaces/)
- [Oracle Cloud Infrastructure](https://docs.cloud.oracle.com/en-us/iaas/Content/Object/Tasks/s3compatibleapi.htm)
- [OpenStack Swift](https://docs.openstack.org/swift/latest/s3_compat.html)
- [Azure Blob storage](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction)
@@ -2075,7 +2123,8 @@ There are two ways of specifying object storage configuration in GitLab:
Starting with GitLab 13.2, consolidated object storage configuration is available. It simplifies your GitLab configuration since the connection details are shared across object types. Refer to the [Consolidated object storage configuration](../object_storage.md#consolidated-object-storage-configuration) guide for instructions on how to set it up.
GitLab Runner returns job logs in chunks, which Omnibus GitLab caches temporarily on disk in `/var/opt/gitlab/gitlab-ci/builds` by default, even when using consolidated object storage. With the default configuration, this directory must be shared via NFS across all GitLab Rails and Sidekiq nodes.
-In GitLab 13.6 and later, it's recommended to switch to [Incremental logging](../job_logs.md#incremental-logging-architecture), which uses Redis instead of disk space for temporary caching of job logs.
+
+In GitLab 13.6 and later, it's also recommended to switch to [Incremental logging](../job_logs.md#incremental-logging-architecture), which uses Redis instead of disk space for temporary caching of job logs. This is required when no NFS node has been deployed.
For configuring object storage in GitLab 13.1 and earlier, or for storage types not
supported by consolidated configuration form, refer to the following guides based
@@ -2157,6 +2206,7 @@ but with smaller performance requirements, several modifications can be consider
- Reducing the node counts: Some node types do not need consensus and can run with fewer nodes (but more than one for redundancy). This will also lead to reduced performance.
- GitLab Rails and Sidekiq: Stateless services don't have a minimum node count. Two are enough for redundancy.
- Gitaly and Praefect: A quorum is not strictly necessary. Two Gitaly nodes and two Praefect nodes are enough for redundancy.
+ - PostgreSQL and PgBouncer: A quorum is not strictly necessary. Two PostgreSQL nodes and two PgBouncer nodes are enough for redundancy.
- Running select components in reputable Cloud PaaS solutions: Select components of the GitLab setup can instead be run on Cloud Provider PaaS solutions. By doing this, additional dependent components can also be removed:
- PostgreSQL: Can be run on reputable Cloud PaaS solutions such as Google Cloud SQL or Amazon RDS. In this setup, the PgBouncer and Consul nodes are no longer required:
- Consul may still be desired if [Prometheus](../monitoring/prometheus/index.md) auto discovery is a requirement, otherwise you would need to [manually add scrape configurations](../monitoring/prometheus/index.md#adding-custom-scrape-configurations) for all nodes.
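As a hedged sketch of what such a manual scrape configuration might look like once Consul is removed, using the same `gitlab.rb` idiom as the Monitoring section above (the job name and node-exporter targets are hypothetical):

```ruby
# Sketch: a static scrape job standing in for Consul-based discovery.
# Targets are hypothetical node_exporter addresses on the default port 9100.
prometheus['scrape_configs'] = [
  {
    'job_name': 'node-exporters',
    'static_configs' => [
      'targets' => %w[10.6.0.141:9100 10.6.0.142:9100 10.6.0.143:9100],
    ],
  },
]
```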
@@ -2193,11 +2243,11 @@ use Google Cloud's Kubernetes Engine (GKE) and associated machine types, but the
and CPU requirements should translate to most other providers. We hope to update this in the
future with further specific cloud provider details.
-| Service | Nodes | Configuration | GCP | Allocatable CPUs and Memory |
-|-------------------------------------------------------|-------|-------------------------|------------------|-----------------------------|
-| Webservice | 2 | 16 vCPU, 14.4 GB memory | `n1-highcpu-16` | 31.8 vCPU, 24.8 GB memory |
-| Sidekiq | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | 11.8 vCPU, 38.9 GB memory |
-| Supporting services such as NGINX, Prometheus | 2 | 2 vCPU, 7.5 GB memory | `n1-standard-2` | 3.9 vCPU, 11.8 GB memory |
+| Service | Nodes | Configuration | GCP | AWS | Min Allocatable CPUs and Memory |
+|-----------------------------------------------|-------|-------------------------|-----------------|--------------|---------------------------------|
+| Webservice | 2 | 16 vCPU, 14.4 GB memory | `n1-highcpu-16` | `c5.4xlarge` | 31.8 vCPU, 24.8 GB memory |
+| Sidekiq | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` | 11.8 vCPU, 38.9 GB memory |
+| Supporting services such as NGINX, Prometheus | 2 | 2 vCPU, 7.5 GB memory | `n1-standard-2` | `m5.large` | 3.9 vCPU, 11.8 GB memory |
- For this setup, we **recommend** and regularly [test](index.md#validation-and-test-results)
[Google Kubernetes Engine (GKE)](https://cloud.google.com/kubernetes-engine) and [Amazon Elastic Kubernetes Service (EKS)](https://aws.amazon.com/eks/). Other Kubernetes services may also work, but your mileage may vary.
@@ -2207,17 +2257,17 @@ future with further specific cloud provider details.
Next are the backend components that run on static compute VMs via Omnibus (or External PaaS
services where applicable):
-| Service | Nodes | Configuration | GCP |
-|--------------------------------------------|-------|-------------------------|------------------|
-| Redis<sup>2</sup> | 3 | 2 vCPU, 7.5 GB memory | `n1-standard-2` |
-| Consul<sup>1</sup> + Sentinel<sup>2</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| PostgreSQL<sup>1</sup> | 3 | 2 vCPU, 7.5 GB memory | `n1-standard-2` |
-| PgBouncer<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Internal load balancing node<sup>3</sup> | 1 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Gitaly<sup>5</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` |
-| Praefect<sup>5</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Praefect PostgreSQL<sup>1</sup> | 1+ | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Object storage<sup>4</sup> | n/a | n/a | n/a |
+| Service | Nodes | Configuration | GCP | AWS |
+|-------------------------------------------|-------|-----------------------|-----------------|-------------|
+| Redis<sup>2</sup> | 3 | 2 vCPU, 7.5 GB memory | `n1-standard-2` | `m5.large` |
+| Consul<sup>1</sup> + Sentinel<sup>2</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| PostgreSQL<sup>1</sup> | 3 | 2 vCPU, 7.5 GB memory | `n1-standard-2` | `m5.large` |
+| PgBouncer<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Internal load balancing node<sup>3</sup> | 1 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Gitaly<sup>5</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` |
+| Praefect<sup>5</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Praefect PostgreSQL<sup>1</sup> | 1+ | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Object storage<sup>4</sup> | n/a | n/a | n/a | n/a |
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
@@ -2225,7 +2275,7 @@ services where applicable):
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
-5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Please [review the existing technical limitations and considerations prior to deploying Gitaly Cluster](../gitaly/index.md#guidance-regarding-gitaly-cluster). If Gitaly Sharded is desired, the same specs listed above for `Gitaly` should be used.
+5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster). If you want sharded Gitaly, use the same specs listed above for `Gitaly`.
<!-- markdownlint-enable MD029 -->
NOTE:
diff --git a/doc/administration/reference_architectures/50k_users.md b/doc/administration/reference_architectures/50k_users.md
index dfa963d1ad0..6c4e2227b18 100644
--- a/doc/administration/reference_architectures/50k_users.md
+++ b/doc/administration/reference_architectures/50k_users.md
@@ -14,7 +14,7 @@ full list of reference architectures, see
> - **High Availability:** Yes ([Praefect](#configure-praefect-postgresql) needs a third-party PostgreSQL solution for HA)
> - **Estimated Costs:** [See cost table](index.md#cost-to-run)
> - **Cloud Native Hybrid Alternative:** [Yes](#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative)
-> - **Performance tested weekly with the [GitLab Performance Tool (GPT)](https://gitlab.com/gitlab-org/quality/performance)**:
+> - **Validation and test results:** The Quality Engineering team does [regular smoke and performance tests](index.md#validation-and-test-results) to ensure the reference architectures remain compliant
> - **Test requests per second (RPS) rates:** API: 1000 RPS, Web: 100 RPS, Git (Pull): 100 RPS, Git (Push): 20 RPS
> - **[Latest Results](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Benchmarks/Latest/50k)**
@@ -42,7 +42,7 @@ full list of reference architectures, see
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
-5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Please [review the existing technical limitations and considerations prior to deploying Gitaly Cluster](../gitaly/index.md#guidance-regarding-gitaly-cluster). If Gitaly Sharded is desired, the same specs listed above for `Gitaly` should be used.
+5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster). If you want sharded Gitaly, use the same specs listed above for `Gitaly`.
<!-- markdownlint-enable MD029 -->
NOTE:
@@ -1170,8 +1170,8 @@ Advanced [configuration options](https://docs.gitlab.com/omnibus/settings/redis.
In this configuration, every Git repository is stored on every Gitaly node in the cluster, with one being designated the primary, and failover occurs automatically if the primary node goes down.
NOTE:
-Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Please [review the existing technical limitations and considerations prior to deploying Gitaly Cluster](../gitaly/index.md#guidance-regarding-gitaly-cluster).
-For implementations with Gitaly Sharded, the same Gitaly specs should be used. Follow the [separate Gitaly documentation](../gitaly/configure_gitaly.md) instead of this section.
+Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster).
+For implementations with sharded Gitaly, use the same Gitaly specs. Follow the [separate Gitaly documentation](../gitaly/configure_gitaly.md) instead of this section.
The recommended cluster setup includes the following components:
@@ -1468,10 +1468,12 @@ the file of the same name on this server. If this is the first Omnibus node you
### Configure Gitaly
The [Gitaly](../gitaly/index.md) server nodes that make up the cluster have
-requirements that are dependent on data, specifically the number of projects
-and those projects' sizes. It's recommended that a Gitaly Cluster stores
-no more than 5 TB of data on each node. Depending on your
-repository storage requirements, you may require additional Gitaly Clusters.
+requirements that are dependent on data and load.
+
+NOTE:
+The Reference Architecture specs have been designed with good headroom in mind,
+but for Gitaly, increased specs or additional Gitaly Cluster arrays
+may be required for notably large data sets or loads.
Due to Gitaly having notable input and output requirements, we strongly
recommend that all Gitaly nodes use solid-state drives (SSDs). These SSDs
@@ -1544,6 +1546,26 @@ On each node:
# Recommended to be enabled for improved performance but can notably increase disk I/O
# Refer to https://docs.gitlab.com/ee/administration/gitaly/configure_gitaly.html#pack-objects-cache for more info
gitaly['pack_objects_cache_enabled'] = true
+
+ # Configure the Consul agent
+ consul['enable'] = true
+ ## Enable service discovery for Prometheus
+ consul['monitoring_service_discovery'] = true
+
+ # START user configuration
+ # Set the real values as explained in the Required Information section
+ #
+ ## The IPs of the Consul server nodes
+ ## You can also use FQDNs and intermix them with IPs
+ consul['configuration'] = {
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
+ }
+
+ # Set the network addresses that the exporters will listen on for monitoring
+ node_exporter['listen_address'] = '0.0.0.0:9100'
+ gitaly['prometheus_listen_addr'] = '0.0.0.0:9236'
+ #
+ # END user configuration
```
1. Append the following to `/etc/gitlab/gitlab.rb` for each respective server:
@@ -1781,6 +1803,7 @@ To configure the Sidekiq nodes, on each one:
# Object storage
## This is an example for configuring Object Storage on GCP
## Replace this config with your chosen Object Storage provider as desired
+ gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
'provider' => 'Google',
'google_project' => '<gcp-project-name>',
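   # Sketch (not part of this diff): with the consolidated form, each object
   # type then points at its own bucket via keys such as
   # gitlab_rails['object_store']['objects']['artifacts']['bucket'],
   # as shown in the fuller sketch in the object storage section below.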
@@ -1936,6 +1959,7 @@ On each node perform the following:
# This is an example for configuring Object Storage on GCP
# Replace this config with your chosen Object Storage provider as desired
+ gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
'provider' => 'Google',
'google_project' => '<gcp-project-name>',
@@ -2114,6 +2138,30 @@ To configure the Monitoring node:
retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13)
}
+ # Configure Prometheus to scrape services not covered by discovery
+ prometheus['scrape_configs'] = [
+ {
+ 'job_name': 'pgbouncer',
+ 'static_configs' => [
+ 'targets' => [
+ "10.6.0.31:9188",
+ "10.6.0.32:9188",
+ "10.6.0.33:9188",
+ ],
+ ],
+ },
+ {
+ 'job_name': 'praefect',
+ 'static_configs' => [
+ 'targets' => [
+ "10.6.0.131:9652",
+ "10.6.0.132:9652",
+ "10.6.0.133:9652",
+ ],
+ ],
+ },
+ ]
+
# Nginx - For Grafana access
nginx['enable'] = true
```
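The static entries above cover the PgBouncer exporters (port 9188) and the Praefect exporters (port 9652), which are not covered by Consul-based discovery; exporters on nodes where `consul['monitoring_service_discovery'] = true` is set, such as the Gitaly nodes configured earlier, are discovered automatically and need no static entries.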
@@ -2139,7 +2187,7 @@ GitLab has been tested on a number of object storage providers:
- [Amazon S3](https://aws.amazon.com/s3/)
- [Google Cloud Storage](https://cloud.google.com/storage)
-- [Digital Ocean Spaces](https://www.digitalocean.com/products/spaces/)
+- [Digital Ocean Spaces](https://www.digitalocean.com/products/spaces)
- [Oracle Cloud Infrastructure](https://docs.cloud.oracle.com/en-us/iaas/Content/Object/Tasks/s3compatibleapi.htm)
- [OpenStack Swift](https://docs.openstack.org/swift/latest/s3_compat.html)
- [Azure Blob storage](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction)
@@ -2156,7 +2204,8 @@ There are two ways of specifying object storage configuration in GitLab:
Starting with GitLab 13.2, consolidated object storage configuration is available. It simplifies your GitLab configuration because the connection details are shared across object types. Refer to the [Consolidated object storage configuration](../object_storage.md#consolidated-object-storage-configuration) guide for instructions on how to set it up.
GitLab Runner returns job logs in chunks, which Omnibus GitLab caches temporarily on disk in `/var/opt/gitlab/gitlab-ci/builds` by default, even when using consolidated object storage. With the default configuration, this directory needs to be shared via NFS on all GitLab Rails and Sidekiq nodes.
-In GitLab 13.6 and later, it's recommended to switch to [Incremental logging](../job_logs.md#incremental-logging-architecture), which uses Redis instead of disk space for temporary caching of job logs.
+
+In GitLab 13.6 and later, switching to [Incremental logging](../job_logs.md#incremental-logging-architecture) is also recommended. It uses Redis instead of disk space for temporary caching of job logs, and is required when no NFS node has been deployed.
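+
+As a hedged sketch of the consolidated form (all angle-bracket values are placeholders; GCP is assumed as the provider):
+
+```ruby
+# Sketch only: consolidated object storage in /etc/gitlab/gitlab.rb.
+# One shared connection is paired with per-object-type buckets.
+gitlab_rails['object_store']['enabled'] = true
+gitlab_rails['object_store']['connection'] = {
+  'provider' => 'Google',
+  'google_project' => '<gcp-project-name>',
+  'google_json_key_location' => '<path-to-service-account-key>'
+}
+# Each object type points at its own bucket under the shared connection.
+gitlab_rails['object_store']['objects']['artifacts']['bucket'] = '<artifacts-bucket>'
+gitlab_rails['object_store']['objects']['lfs']['bucket'] = '<lfs-bucket>'
+gitlab_rails['object_store']['objects']['uploads']['bucket'] = '<uploads-bucket>'
+```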
For configuring object storage in GitLab 13.1 and earlier, or for storage types not
supported by consolidated configuration form, refer to the following guides based
@@ -2250,12 +2299,12 @@ use Google Cloud's Kubernetes Engine (GKE) and associated machine types, but the
and CPU requirements should translate to most other providers. We hope to update this in the
future with further specific cloud provider details.
-| Service | Nodes | Configuration | GCP | Allocatable CPUs and Memory |
-|-------------------------------------------------------|-------|-------------------------|------------------|-----------------------------|
-| Webservice | 16 | 32 vCPU, 28.8 GB memory | `n1-highcpu-32` | 510 vCPU, 472 GB memory |
-| Sidekiq | 4 | 4 vCPU, 15 GB memory | `n1-standard-4` | 15.5 vCPU, 50 GB memory |
-| Supporting services such as NGINX, Prometheus | 2 | 4 vCPU, 15 GB memory | `n1-standard-4` | 7.75 vCPU, 25 GB memory |
-
+| Service | Nodes | Configuration | GCP | AWS | Min Allocatable CPUs and Memory |
+|-----------------------------------------------|-------|-------------------------|-----------------|--------------|---------------------------------|
+| Webservice | 16 | 32 vCPU, 28.8 GB memory | `n1-highcpu-32` | `m5.8xlarge` | 510 vCPU, 472 GB memory |
+| Sidekiq | 4 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` | 15.5 vCPU, 50 GB memory |
+| Supporting services such as NGINX, Prometheus | 2 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` | 7.75 vCPU, 25 GB memory |
+
- For this setup, we **recommend** and regularly [test](index.md#validation-and-test-results)
[Google Kubernetes Engine (GKE)](https://cloud.google.com/kubernetes-engine) and [Amazon Elastic Kubernetes Service (EKS)](https://aws.amazon.com/eks/). Other Kubernetes services may also work, but your mileage may vary.
- The node configuration shown is enforced to ensure pod vCPU / memory ratios and avoid scaling during **performance testing**.
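- As a worked example of the allocatable column: the Webservice pool above provides 16 × 32 = 512 raw vCPUs, which lines up with the 510 vCPU listed as allocatable once system overhead is accounted for.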
@@ -2264,18 +2313,18 @@ future with further specific cloud provider details.
Next are the backend components that run on static compute VMs via Omnibus (or External PaaS
services where applicable):
-| Service | Nodes | Configuration | GCP |
-|-----------------------------------------------------|-------|-------------------------|------------------|
-| Consul<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| PostgreSQL<sup>1</sup> | 3 | 32 vCPU, 120 GB memory | `n1-standard-32` |
-| PgBouncer<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Internal load balancing node<sup>3</sup> | 1 | 8 vCPU, 7.2 GB memory | `n1-highcpu-8` |
-| Redis/Sentinel - Cache<sup>2</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` |
-| Redis/Sentinel - Persistent<sup>2</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` |
-| Gitaly<sup>5</sup> | 3 | 64 vCPU, 240 GB memory | `n1-standard-64` |
-| Praefect<sup>5</sup> | 3 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` |
-| Praefect PostgreSQL<sup>1</sup> | 1+ | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Object storage<sup>4</sup> | n/a | n/a | n/a |
+| Service | Nodes | Configuration | GCP | AWS |
+|------------------------------------------|-------|------------------------|------------------|---------------|
+| Consul<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| PostgreSQL<sup>1</sup> | 3 | 32 vCPU, 120 GB memory | `n1-standard-32` | `m5.8xlarge` |
+| PgBouncer<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Internal load balancing node<sup>3</sup> | 1 | 8 vCPU, 7.2 GB memory | `n1-highcpu-8` | `c5.2xlarge` |
+| Redis/Sentinel - Cache<sup>2</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` |
+| Redis/Sentinel - Persistent<sup>2</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` |
+| Gitaly<sup>5</sup> | 3 | 64 vCPU, 240 GB memory | `n1-standard-64` | `m5.16xlarge` |
+| Praefect<sup>5</sup> | 3 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` | `c5.xlarge` |
+| Praefect PostgreSQL<sup>1</sup> | 1+ | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Object storage<sup>4</sup> | n/a | n/a | n/a | n/a |
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
@@ -2283,7 +2332,7 @@ services where applicable):
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
-5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Please [review the existing technical limitations and considerations prior to deploying Gitaly Cluster](../gitaly/index.md#guidance-regarding-gitaly-cluster). If Gitaly Sharded is desired, the same specs listed above for `Gitaly` should be used.
+5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster). If you want sharded Gitaly, use the same specs listed above for `Gitaly`.
<!-- markdownlint-enable MD029 -->
NOTE:
diff --git a/doc/administration/reference_architectures/5k_users.md b/doc/administration/reference_architectures/5k_users.md
index f2463afbf3b..dd7209a3a3a 100644
--- a/doc/administration/reference_architectures/5k_users.md
+++ b/doc/administration/reference_architectures/5k_users.md
@@ -21,7 +21,7 @@ costly-to-operate environment by using the
> - **High Availability:** Yes ([Praefect](#configure-praefect-postgresql) needs a third-party PostgreSQL solution for HA)
> - **Estimated Costs:** [See cost table](index.md#cost-to-run)
> - **Cloud Native Hybrid Alternative:** [Yes](#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative)
-> - **Performance tested weekly with the [GitLab Performance Tool (GPT)](https://gitlab.com/gitlab-org/quality/performance)**:
+> - **Validation and test results:** The Quality Engineering team does [regular smoke and performance tests](index.md#validation-and-test-results) to ensure the reference architectures remain compliant
> - **Test requests per second (RPS) rates:** API: 100 RPS, Web: 10 RPS, Git (Pull): 10 RPS, Git (Push): 2 RPS
> - **[Latest Results](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Benchmarks/Latest/5k)**
@@ -48,7 +48,7 @@ costly-to-operate environment by using the
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
-5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Please [review the existing technical limitations and considerations prior to deploying Gitaly Cluster](../gitaly/index.md#guidance-regarding-gitaly-cluster). If Gitaly Sharded is desired, the same specs listed above for `Gitaly` should be used.
+5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster). If you want sharded Gitaly, use the same specs listed above for `Gitaly`.
<!-- markdownlint-enable MD029 -->
NOTE:
@@ -1101,8 +1101,8 @@ The following IPs will be used as an example:
In this configuration, every Git repository is stored on every Gitaly node in the cluster, with one being designated the primary, and failover occurs automatically if the primary node goes down.
NOTE:
-Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Please [review the existing technical limitations and considerations prior to deploying Gitaly Cluster](../gitaly/index.md#guidance-regarding-gitaly-cluster).
-For implementations with Gitaly Sharded, the same Gitaly specs should be used. Follow the [separate Gitaly documentation](../gitaly/configure_gitaly.md) instead of this section.
+Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster).
+For implementations with sharded Gitaly, use the same Gitaly specs. Follow the [separate Gitaly documentation](../gitaly/configure_gitaly.md) instead of this section.
The recommended cluster setup includes the following components:
@@ -1397,10 +1397,12 @@ the file of the same name on this server. If this is the first Omnibus node you
### Configure Gitaly
The [Gitaly](../gitaly/index.md) server nodes that make up the cluster have
-requirements that are dependent on data, specifically the number of projects
-and those projects' sizes. It's recommended that a Gitaly Cluster stores
-no more than 5 TB of data on each node. Depending on your
-repository storage requirements, you may require additional Gitaly Clusters.
+requirements that are dependent on data and load.
+
+NOTE:
+The Reference Architecture specs have been designed with good headroom in mind,
+but for Gitaly, increased specs or additional Gitaly Cluster arrays
+may be required for notably large data sets or loads.
Due to Gitaly having notable input and output requirements, we strongly
recommend that all Gitaly nodes use solid-state drives (SSDs). These SSDs
@@ -1473,6 +1475,26 @@ On each node:
# Recommended to be enabled for improved performance but can notably increase disk I/O
# Refer to https://docs.gitlab.com/ee/administration/gitaly/configure_gitaly.html#pack-objects-cache for more info
gitaly['pack_objects_cache_enabled'] = true
+
+ # Configure the Consul agent
+ consul['enable'] = true
+ ## Enable service discovery for Prometheus
+ consul['monitoring_service_discovery'] = true
+
+ # START user configuration
+ # Set the real values as explained in the Required Information section
+ #
+ ## The IPs of the Consul server nodes
+ ## You can also use FQDNs and intermix them with IPs
+ consul['configuration'] = {
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
+ }
+
+ # Set the network addresses that the exporters will listen on for monitoring
+ node_exporter['listen_address'] = '0.0.0.0:9100'
+ gitaly['prometheus_listen_addr'] = '0.0.0.0:9236'
+ #
+ # END user configuration
```
1. Append the following to `/etc/gitlab/gitlab.rb` for each respective server:
@@ -1693,6 +1715,7 @@ To configure the Sidekiq nodes, on each one:
# Object Storage
## This is an example for configuring Object Storage on GCP
## Replace this config with your chosen Object Storage provider as desired
+ gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
'provider' => 'Google',
'google_project' => '<gcp-project-name>',
@@ -1867,6 +1890,7 @@ On each node perform the following:
# This is an example for configuring Object Storage on GCP
# Replace this config with your chosen Object Storage provider as desired
+ gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
'provider' => 'Google',
'google_project' => '<gcp-project-name>',
@@ -2018,6 +2042,30 @@ running [Prometheus](../monitoring/prometheus/index.md) and
retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13)
}
+ # Configure Prometheus to scrape services not covered by discovery
+ prometheus['scrape_configs'] = [
+ {
+ 'job_name': 'pgbouncer',
+ 'static_configs' => [
+ 'targets' => [
+ "10.6.0.31:9188",
+ "10.6.0.32:9188",
+ "10.6.0.33:9188",
+ ],
+ ],
+ },
+ {
+ 'job_name': 'praefect',
+ 'static_configs' => [
+ 'targets' => [
+ "10.6.0.131:9652",
+ "10.6.0.132:9652",
+ "10.6.0.133:9652",
+ ],
+ ],
+ },
+ ]
+
# Nginx - For Grafana access
nginx['enable'] = true
```
@@ -2058,7 +2106,7 @@ GitLab has been tested on a number of object storage providers:
- [Amazon S3](https://aws.amazon.com/s3/)
- [Google Cloud Storage](https://cloud.google.com/storage)
-- [Digital Ocean Spaces](https://www.digitalocean.com/products/spaces/)
+- [Digital Ocean Spaces](https://www.digitalocean.com/products/spaces)
- [Oracle Cloud Infrastructure](https://docs.cloud.oracle.com/en-us/iaas/Content/Object/Tasks/s3compatibleapi.htm)
- [OpenStack Swift](https://docs.openstack.org/swift/latest/s3_compat.html)
- [Azure Blob storage](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction)
@@ -2075,7 +2123,8 @@ There are two ways of specifying object storage configuration in GitLab:
Starting with GitLab 13.2, consolidated object storage configuration is available. It simplifies your GitLab configuration because the connection details are shared across object types. Refer to the [Consolidated object storage configuration](../object_storage.md#consolidated-object-storage-configuration) guide for instructions on how to set it up.
GitLab Runner returns job logs in chunks, which Omnibus GitLab caches temporarily on disk in `/var/opt/gitlab/gitlab-ci/builds` by default, even when using consolidated object storage. With the default configuration, this directory needs to be shared via NFS on all GitLab Rails and Sidekiq nodes.
-In GitLab 13.6 and later, it's recommended to switch to [Incremental logging](../job_logs.md#incremental-logging-architecture), which uses Redis instead of disk space for temporary caching of job logs.
+
+In GitLab 13.6 and later, switching to [Incremental logging](../job_logs.md#incremental-logging-architecture) is also recommended. It uses Redis instead of disk space for temporary caching of job logs, and is required when no NFS node has been deployed.
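+
+A hedged sketch of how incremental logging has been enabled (confirm the feature flag name against the linked job logs documentation before relying on it):
+
+```ruby
+# Run inside a Rails console session (`gitlab-rails console`) on a Rails node.
+# Sketch only: toggles the incremental logging feature flag.
+Feature.enable(:ci_enable_live_trace)
+```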
For configuring object storage in GitLab 13.1 and earlier, or for storage types not
supported by consolidated configuration form, refer to the following guides based
@@ -2169,11 +2218,11 @@ use Google Cloud's Kubernetes Engine (GKE) and associated machine types, but the
and CPU requirements should translate to most other providers. We hope to update this in the
future with further specific cloud provider details.
-| Service | Nodes | Configuration | GCP | Allocatable CPUs and Memory |
-|-------------------------------------------------------|-------|-------------------------|------------------|-----------------------------|
-| Webservice | 5 | 16 vCPU, 14.4 GB memory | `n1-highcpu-16` | 79.5 vCPU, 62 GB memory |
-| Sidekiq | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | 11.8 vCPU, 38.9 GB memory |
-| Supporting services such as NGINX, Prometheus | 2 | 2 vCPU, 7.5 GB memory | `n1-standard-2` | 3.9 vCPU, 11.8 GB memory |
+| Service | Nodes | Configuration | GCP | AWS | Min Allocatable CPUs and Memory |
+|-----------------------------------------------|-------|-------------------------|-----------------|--------------|---------------------------------|
+| Webservice | 5 | 16 vCPU, 14.4 GB memory | `n1-highcpu-16` | `c5.4xlarge` | 79.5 vCPU, 62 GB memory |
+| Sidekiq | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` | 11.8 vCPU, 38.9 GB memory |
+| Supporting services such as NGINX, Prometheus | 2 | 2 vCPU, 7.5 GB memory | `n1-standard-2` | `m5.large` | 3.9 vCPU, 11.8 GB memory |
- For this setup, we **recommend** and regularly [test](index.md#validation-and-test-results)
[Google Kubernetes Engine (GKE)](https://cloud.google.com/kubernetes-engine) and [Amazon Elastic Kubernetes Service (EKS)](https://aws.amazon.com/eks/). Other Kubernetes services may also work, but your mileage may vary.
@@ -2183,17 +2232,17 @@ future with further specific cloud provider details.
Next are the backend components that run on static compute VMs via Omnibus (or External PaaS
services where applicable):
-| Service | Nodes | Configuration | GCP |
-|--------------------------------------------|-------|-------------------------|------------------|
-| Redis<sup>2</sup> | 3 | 2 vCPU, 7.5 GB memory | `n1-standard-2` |
-| Consul<sup>1</sup> + Sentinel<sup>2</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| PostgreSQL<sup>1</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` |
-| PgBouncer<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Internal load balancing node<sup>3</sup> | 1 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Gitaly<sup>5</sup> | 3 | 8 vCPU, 30 GB memory | `n1-standard-8` |
-| Praefect<sup>5</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Praefect PostgreSQL<sup>1</sup> | 1+ | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` |
-| Object storage<sup>4</sup> | n/a | n/a | n/a |
+| Service | Nodes | Configuration | GCP | AWS |
+|-------------------------------------------|-------|-----------------------|-----------------|--------------|
+| Redis<sup>2</sup> | 3 | 2 vCPU, 7.5 GB memory | `n1-standard-2` | `m5.large` |
+| Consul<sup>1</sup> + Sentinel<sup>2</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| PostgreSQL<sup>1</sup> | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` |
+| PgBouncer<sup>1</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Internal load balancing node<sup>3</sup> | 1 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Gitaly<sup>5</sup> | 3 | 8 vCPU, 30 GB memory | `n1-standard-8` | `m5.2xlarge` |
+| Praefect<sup>5</sup> | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Praefect PostgreSQL<sup>1</sup> | 1+ | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` |
+| Object storage<sup>4</sup> | n/a | n/a | n/a | n/a |
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
@@ -2201,7 +2250,7 @@ services where applicable):
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
-5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Please [review the existing technical limitations and considerations prior to deploying Gitaly Cluster](../gitaly/index.md#guidance-regarding-gitaly-cluster). If Gitaly Sharded is desired, the same specs listed above for `Gitaly` should be used.
+5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster). If you want sharded Gitaly, use the same specs listed above for `Gitaly`.
<!-- markdownlint-enable MD029 -->
NOTE:
diff --git a/doc/administration/reference_architectures/index.md b/doc/administration/reference_architectures/index.md
index 3fcd6d7ae4e..bb741c39c08 100644
--- a/doc/administration/reference_architectures/index.md
+++ b/doc/administration/reference_architectures/index.md
@@ -18,20 +18,6 @@ you scale GitLab accordingly.
![Reference Architectures](img/reference-architectures.png)
<!-- Internal link: https://docs.google.com/spreadsheets/d/1obYP4fLKkVVDOljaI3-ozhmCiPtEeMblbBKkf2OADKs/edit#gid=1403207183 -->
-Testing on these reference architectures was performed with the
-[GitLab Performance Tool](https://gitlab.com/gitlab-org/quality/performance)
-at specific coded workloads, and the throughputs used for testing were
-calculated based on sample customer data. Select the
-[reference architecture](#available-reference-architectures) that matches your scale.
-
-Each endpoint type is tested with the following number of requests per second (RPS)
-per 1,000 users:
-
-- API: 20 RPS
-- Web: 2 RPS
-- Git (Pull): 2 RPS
-- Git (Push): 0.4 RPS (rounded to nearest integer)
-
For GitLab instances with fewer than 2,000 users, it's recommended that you use
the [default setup](#automated-backups) by
[installing GitLab](../../install/index.md) on a single machine to minimize
@@ -48,7 +34,8 @@ When scaling GitLab, there are several factors to consider:
- A load balancer is added in front to distribute traffic across the application nodes.
- The application nodes connects to a shared file server and PostgreSQL and Redis services on the backend.
-NOTE:
+## Available reference architectures
+
Depending on your workflow, the following recommended reference architectures
may need to be adapted accordingly. Your workload is influenced by factors
including how active your users are, how much automation you use, mirroring,
@@ -57,12 +44,10 @@ provided by [GCP machine types](https://cloud.google.com/compute/docs/machine-ty
For different cloud vendors, attempt to select options that best match the
provided architecture.
-## Available reference architectures
-
-The following reference architectures are available.
-
### GitLab package (Omnibus)
+The following reference architectures, which use the GitLab package, are available:
+
- [Up to 1,000 users](1k_users.md)
- [Up to 2,000 users](2k_users.md)
- [Up to 3,000 users](3k_users.md)
@@ -87,17 +72,53 @@ to get assistance from Support with troubleshooting the [2,000 users](2k_users.m
and higher reference architectures.
[Read more about our definition of scaled architectures](https://about.gitlab.com/support/#definition-of-scaled-architecture).
-### Validation and test results
+## Validation and test results
+
+The [Quality Engineering team](https://about.gitlab.com/handbook/engineering/quality/quality-engineering/)
+does regular smoke and performance tests for the reference architectures to ensure they
+remain compliant.
+
+### Why we perform the tests
+
+The Quality Department focuses on measuring and improving the performance
+of GitLab, and on creating and validating reference architectures that
+self-managed customers can rely on as performant configurations.
+
+For more information, see our [handbook page](https://about.gitlab.com/handbook/engineering/quality/performance-and-scalability/).
+
+### How we perform the tests
+
+Testing occurs against all reference architectures and cloud providers in an automated and ad-hoc fashion. This is done with two tools:
+
+- The [GitLab Environment Toolkit](https://gitlab.com/gitlab-org/gitlab-environment-toolkit) for building the environments.
+- The [GitLab Performance Tool](https://gitlab.com/gitlab-org/quality/performance) for performance testing.
+
+Network latency on the test environments between components on all Cloud Providers was measured at <5ms. This is shared as an observation, not as an implicit recommendation.
+
+We aim for a "test smart" approach where the architectures tested cover a range that also applies to the others. Testing focuses on the 10k Omnibus environment on GCP, as testing has shown it to be a good bellwether for the other architectures and cloud providers, as well as for Cloud Native Hybrids.
+
+The Standard Reference Architectures are designed to be platform-agnostic, with everything being run on VMs via [Omnibus GitLab](https://docs.gitlab.com/omnibus/). While testing occurs primarily on GCP, ad-hoc testing has shown that they perform similarly on equivalently specced hardware on other Cloud Providers, or when run on-premises (bare-metal).
+
+Testing on these reference architectures is performed with the
+[GitLab Performance Tool](https://gitlab.com/gitlab-org/quality/performance)
+at specific coded workloads, and the throughputs used for testing are
+calculated based on sample customer data. Select the
+[reference architecture](#available-reference-architectures) that matches your scale.
-The [Quality Engineering - Enablement team](https://about.gitlab.com/handbook/engineering/quality/quality-engineering/) does regular smoke and performance tests for the reference architectures to ensure they remain compliant.
+Each endpoint type is tested with the following number of requests per second (RPS)
+per 1,000 users:
-- Testing occurs against all reference architectures and cloud providers in an automated and ad-hoc fashion. This is done by two tools:
- - The [GitLab Environment Toolkit](https://gitlab.com/gitlab-org/gitlab-environment-toolkit) for building the environments.
- - The [GitLab Performance Tool](https://gitlab.com/gitlab-org/quality/performance) for performance testing.
-- Network latency on the test environments between components on all Cloud Providers were measured at <5ms. Note that this is shared as an observation and not as an implicit recommendation.
-- We aim to have a "test smart" approach where architectures tested have a good range that can also apply to others. Testing focuses on 10k Omnibus on GCP as the testing has shown this is a good bellwether for the other architectures and cloud providers as well as Cloud Native Hybrids.
-- Testing is done publicly and all results are shared.
-- For more information about performance testing at GitLab, read [how our QA team leverages GitLab’s performance testing tool (and you can too)](https://about.gitlab.com/blog/2020/02/18/how-were-building-up-performance-testing-of-gitlab/).
+- API: 20 RPS
+- Web: 2 RPS
+- Git (Pull): 2 RPS
+- Git (Push): 0.4 RPS (rounded to nearest integer)
+
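+For example, at 50,000 users these rates translate to 20 × 50 = 1,000 RPS for API, 2 × 50 = 100 RPS for Web and Git (Pull), and 0.4 × 50 = 20 RPS for Git (Push), matching the test rates listed at the top of the 50k page.
+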
+### How to interpret the results
+
+NOTE:
+Read our blog post on [how our QA team leverages GitLab’s performance testing tool](https://about.gitlab.com/blog/2020/02/18/how-were-building-up-performance-testing-of-gitlab/).
+
+Testing is done publicly and all results are shared.
The following table details the testing done against the reference architectures along with the frequency and results. Additional testing is continuously evaluated, and the table is updated accordingly.
@@ -192,9 +213,7 @@ table.test-coverage th {
</tr>
</table>
-The Standard Reference Architectures are designed to be platform agnostic, with everything being run on VMs via [Omnibus GitLab](https://docs.gitlab.com/omnibus/). While testing occurs primarily on GCP, ad-hoc testing has shown that they perform similarly on equivalently specced hardware on other Cloud Providers or if run on premises (bare-metal).
-
-### Cost to run
+## Cost to run
<table class="test-coverage">
<col>
@@ -217,61 +236,61 @@ The Standard Reference Architectures are designed to be platform agnostic, with
<th scope="row">1k</th>
<td><a href="https://cloud.google.com/products/calculator#id=a6d6a94a-c7dc-4c22-85c4-7c5747f272ed">Calculated cost</a></td>
<td></td>
+ <td><a href="https://calculator.aws/#/estimate?id=b51f178f4403b69a63f6eb33ea425f82de3bf249">Calculated cost</a></td>
<td></td>
- <td></td>
- <td></td>
+ <td><a href="https://azure.com/e/1adf30bef7e34ceba9efa97c4470417b">Calculated cost</a></td>
</tr>
<tr>
<th scope="row">2k</th>
<td><a href="https://cloud.google.com/products/calculator#id=84d11491-d72a-493c-a16e-650931faa658">Calculated cost</a></td>
<td></td>
+ <td><a href="https://calculator.aws/#/estimate?id=dce36b5cb6ab25211f74e47233d77f58fefb54e2">Calculated cost</a></td>
<td></td>
- <td></td>
- <td></td>
+ <td><a href="https://azure.com/e/72764902f3854f798407fb03c3de4b6f">Calculated cost</a></td>
</tr>
<tr>
<th scope="row">3k</th>
<td><a href="https://cloud.google.com/products/calculator/#id=ac4838e6-9c40-4a36-ac43-6d1bc1843e08">Calculated cost</a></td>
<td></td>
+ <td><a href="https://calculator.aws/#/estimate?id=b1c5b4e32e990eaeb035a148255132bd28988760">Calculated cost</a></td>
<td></td>
- <td></td>
- <td></td>
+ <td><a href="https://azure.com/e/0dbfc575051943b9970e5d8ace03680d">Calculated cost</a></td>
</tr>
<tr>
<th scope="row">5k</th>
<td><a href="https://cloud.google.com/products/calculator/#id=8742e8ea-c08f-4e0a-b058-02f3a1c38a2f">Calculated cost</a></td>
<td></td>
+ <td><a href="https://calculator.aws/#/estimate?id=2bf1af883096e6f4c6efddb4f3c35febead7fec2">Calculated cost</a></td>
<td></td>
- <td></td>
- <td></td>
+ <td><a href="https://azure.com/e/8f618711ffec4b039f1581871ca6a7c9">Calculated cost</a></td>
</tr>
<tr>
<th scope="row">10k</th>
<td><a href="https://cloud.google.com/products/calculator#id=e77713f6-dc0b-4bb3-bcef-cea904ac8efd">Calculated cost</a></td>
<td></td>
+ <td><a href="https://calculator.aws/#/estimate?id=1d374df13c0f2088d332ab0134f5b1d0f717259e">Calculated cost</a></td>
<td></td>
- <td></td>
- <td></td>
+ <td><a href="https://azure.com/e/de3da8286dda4d4db1362932bc75410b">Calculated cost</a></td>
</tr>
<tr>
<th scope="row">25k</th>
<td><a href="https://cloud.google.com/products/calculator#id=925386e1-c01c-4c0a-8d7d-ebde1824b7b0">Calculated cost</a></td>
<td></td>
+ <td><a href="https://calculator.aws/#/estimate?id=46fe6a6e9256d9b7779fae59fbbfa7e836942b7d">Calculated cost</a></td>
<td></td>
- <td></td>
- <td></td>
+ <td><a href="https://azure.com/e/69724ebd82914a60857da6a3ace05a64">Calculate cost</a></td>
</tr>
<tr>
<th scope="row">50k</th>
<td><a href="https://cloud.google.com/products/calculator/#id=8006396b-88ee-40cd-a1c8-77cdefa4d3c8">Calculated cost</a></td>
<td></td>
+ <td><a href="https://calculator.aws/#/estimate?id=e15926b1a3c7139e4faf390a3875ff807d2ab91c">Calculated cost</a></td>
<td></td>
- <td></td>
- <td></td>
+ <td><a href="https://azure.com/e/3f973040ebc14023933d35f576c89846">Calculated cost</a></td>
</tr>
</table>
-### Recommended cloud providers and services
+## Recommended cloud providers and services
NOTE:
The following lists are non-exhaustive. Generally, other cloud providers not listed
@@ -347,7 +366,7 @@ The following specific cloud provider services have been found to have issues in
- [Azure Blob Storage](https://azure.microsoft.com/en-gb/services/storage/blobs/) has been found to have performance limits that can impact production use at certain times. For larger Reference Architectures, the service may not be sufficient for production use and an alternative is recommended instead.
- [Azure Database for PostgreSQL Server](https://azure.microsoft.com/en-gb/services/postgresql/#overview) (Single / Flexible) is not recommended for use due to notable performance issues or missing functionality.
-- [AWS Aurora Database](https://aws.amazon.com/rds/aurora) is not recommended due to compatibility issues.
+- [AWS Aurora Database](https://aws.amazon.com/rds/aurora/) is not recommended due to compatibility issues.
NOTE:
As a general rule, we unfortunately don't recommend Azure Services at this time.
@@ -411,7 +430,7 @@ to any of the [available reference architectures](#available-reference-architect
> - Required domain knowledge: PostgreSQL, HAProxy, shared storage, distributed systems
GitLab supports [zero-downtime upgrades](../../update/zero_downtime.md).
-Single GitLab nodes can be updated with only a [few minutes of downtime](../../update/zero_downtime.md#single-node-deployment).
+Single GitLab nodes can be updated with only a [few minutes of downtime](../../update/index.md#upgrade-based-on-installation-method).
To avoid this, we recommend separating GitLab into several application nodes.
As long as at least one of each component is online and capable of handling the instance's usage load, your team's productivity will not be interrupted during the update.