gitlab.com/gitlab-org/gitlab-foss.git
author: GitLab Bot <gitlab-bot@gitlab.com> 2021-12-20 16:37:47 +0300
committer: GitLab Bot <gitlab-bot@gitlab.com> 2021-12-20 16:37:47 +0300
commit: aee0a117a889461ce8ced6fcf73207fe017f1d99
tree: 891d9ef189227a8445d83f35c1b0fc99573f4380 /doc/administration
parent: 8d46af3258650d305f53b819eabf7ab18d22f59e

Add latest changes from gitlab-org/gitlab@14-6-stable-ee (tag: v14.6.0-rc42)
Diffstat (limited to 'doc/administration')
-rw-r--r-- doc/administration/audit_events.md | 17
-rw-r--r-- doc/administration/auth/atlassian.md | 13
-rw-r--r-- doc/administration/auth/authentiq.md | 14
-rw-r--r-- doc/administration/auth/cognito.md | 28
-rw-r--r-- doc/administration/auth/crowd.md | 12
-rw-r--r-- doc/administration/auth/jwt.md | 22
-rw-r--r-- doc/administration/auth/ldap/index.md | 102
-rw-r--r-- doc/administration/auth/ldap/ldap-troubleshooting.md | 25
-rw-r--r-- doc/administration/auth/ldap/ldap_synchronization.md | 4
-rw-r--r-- doc/administration/auth/oidc.md | 179
-rw-r--r-- doc/administration/cicd.md | 4
-rw-r--r-- doc/administration/clusters/kas.md | 12
-rw-r--r-- doc/administration/compliance.md | 2
-rw-r--r-- doc/administration/database_load_balancing.md | 273
-rw-r--r-- doc/administration/environment_variables.md | 1
-rw-r--r-- doc/administration/geo/disaster_recovery/runbooks/planned_failover_multi_node.md | 18
-rw-r--r-- doc/administration/geo/disaster_recovery/runbooks/planned_failover_single_node.md | 6
-rw-r--r-- doc/administration/geo/index.md | 33
-rw-r--r-- doc/administration/geo/replication/configuration.md | 2
-rw-r--r-- doc/administration/geo/replication/datatypes.md | 100
-rw-r--r-- doc/administration/geo/replication/disable_geo.md | 2
-rw-r--r-- doc/administration/geo/replication/faq.md | 2
-rw-r--r-- doc/administration/geo/replication/troubleshooting.md | 2
-rw-r--r-- doc/administration/geo/replication/updating_the_geo_nodes.md | 9
-rw-r--r-- doc/administration/geo/replication/version_specific_updates.md | 22
-rw-r--r-- doc/administration/geo/secondary_proxy/index.md | 59
-rw-r--r-- doc/administration/geo/setup/index.md | 1
-rw-r--r-- doc/administration/gitaly/configure_gitaly.md | 20
-rw-r--r-- doc/administration/gitaly/index.md | 38
-rw-r--r-- doc/administration/gitaly/praefect.md | 458
-rw-r--r-- doc/administration/gitaly/recovery.md | 418
-rw-r--r-- doc/administration/gitaly/troubleshooting.md | 214
-rw-r--r-- doc/administration/img/db_load_balancing_postgres_stats.png | bin 21543 -> 0 bytes
-rw-r--r-- doc/administration/incoming_email.md | 20
-rw-r--r-- doc/administration/index.md | 6
-rw-r--r-- doc/administration/instance_limits.md | 44
-rw-r--r-- doc/administration/instance_review.md | 2
-rw-r--r-- doc/administration/integration/terminal.md | 2
-rw-r--r-- doc/administration/job_artifacts.md | 164
-rw-r--r-- doc/administration/lfs/index.md | 129
-rw-r--r-- doc/administration/logs.md | 18
-rw-r--r-- doc/administration/monitoring/gitlab_self_monitoring_project/index.md | 5
-rw-r--r-- doc/administration/monitoring/performance/performance_bar.md | 9
-rw-r--r-- doc/administration/monitoring/prometheus/gitlab_metrics.md | 35
-rw-r--r-- doc/administration/monitoring/prometheus/index.md | 8
-rw-r--r-- doc/administration/nfs.md | 18
-rw-r--r-- doc/administration/object_storage.md | 7
-rw-r--r-- doc/administration/operations/extra_sidekiq_processes.md | 6
-rw-r--r-- doc/administration/operations/moving_repositories.md | 7
-rw-r--r-- doc/administration/operations/puma.md | 2
-rw-r--r-- doc/administration/package_information/deprecated_os.md | 84
-rw-r--r-- doc/administration/package_information/deprecation_policy.md | 26
-rw-r--r-- doc/administration/package_information/index.md | 10
-rw-r--r-- doc/administration/package_information/supported_os.md | 90
-rw-r--r-- doc/administration/packages/container_registry.md | 69
-rw-r--r-- doc/administration/packages/index.md | 20
-rw-r--r-- doc/administration/pages/index.md | 29
-rw-r--r-- doc/administration/pages/source.md | 4
-rw-r--r-- doc/administration/postgresql/database_load_balancing.md | 234
-rw-r--r-- doc/administration/postgresql/img/pg_ha_architecture.png | bin 18308 -> 0 bytes
-rw-r--r-- doc/administration/postgresql/pgbouncer.md | 2
-rw-r--r-- doc/administration/postgresql/replication_and_failover.md | 216
-rw-r--r-- doc/administration/raketasks/maintenance.md | 2
-rw-r--r-- doc/administration/raketasks/storage.md | 12
-rw-r--r-- doc/administration/raketasks/uploads/migrate.md | 22
-rw-r--r-- doc/administration/read_only_gitlab.md | 8
-rw-r--r-- doc/administration/redis/troubleshooting.md | 12
-rw-r--r-- doc/administration/reference_architectures/10k_users.md | 158
-rw-r--r-- doc/administration/reference_architectures/1k_users.md | 53
-rw-r--r-- doc/administration/reference_architectures/25k_users.md | 160
-rw-r--r-- doc/administration/reference_architectures/2k_users.md | 56
-rw-r--r-- doc/administration/reference_architectures/3k_users.md | 195
-rw-r--r-- doc/administration/reference_architectures/50k_users.md | 162
-rw-r--r-- doc/administration/reference_architectures/5k_users.md | 183
-rw-r--r-- doc/administration/reference_architectures/index.md | 35
-rw-r--r-- doc/administration/reference_architectures/troubleshooting.md | 13
-rw-r--r-- doc/administration/repository_storage_types.md | 4
-rw-r--r-- doc/administration/terraform_state.md | 34
-rw-r--r-- doc/administration/troubleshooting/elasticsearch.md | 2
-rw-r--r-- doc/administration/troubleshooting/gitlab_rails_cheat_sheet.md | 68
-rw-r--r-- doc/administration/troubleshooting/group_saml_scim.md | 4
-rw-r--r-- doc/administration/troubleshooting/img/okta_setting_username.png | bin 0 -> 69815 bytes
-rw-r--r-- doc/administration/troubleshooting/img/sidekiq_flamegraph.png | bin 0 -> 54473 bytes
-rw-r--r-- doc/administration/troubleshooting/navigating_gitlab_via_rails_console.md | 22
-rw-r--r-- doc/administration/troubleshooting/sidekiq.md | 35
-rw-r--r-- doc/administration/troubleshooting/tracing_correlation_id.md | 7
-rw-r--r-- doc/administration/uploads.md | 74
87 files changed, 2688 insertions, 2011 deletions
diff --git a/doc/administration/audit_events.md b/doc/administration/audit_events.md
index 2062016ef03..06ad16bbcba 100644
--- a/doc/administration/audit_events.md
+++ b/doc/administration/audit_events.md
@@ -9,8 +9,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
GitLab offers a way to view the changes made within the GitLab server for owners and administrators
on a [paid plan](https://about.gitlab.com/pricing/).
-GitLab system administrators can also take advantage of the logs located on the
-file system. See [the logs system documentation](logs.md#audit_jsonlog) for more details.
+GitLab system administrators can also view all audit events by accessing the [`audit_json.log` file](logs.md#audit_jsonlog).
You can:
@@ -31,6 +30,11 @@ permission level, who added a new user, or who removed a user.
- Track which users have access to a certain group of projects
in GitLab, and who gave them that permission level.
+## Retention policy
+
+There is no retention policy in place for audit events.
+See the [Specify a retention period for audit events](https://gitlab.com/gitlab-org/gitlab/-/issues/8137) for more information.
+
## List of events
There are two kinds of events logged:
@@ -97,7 +101,8 @@ From there, you can see the following actions:
- 2FA enforcement or grace period changed.
- Roles allowed to create project changed.
- Group CI/CD variable added, removed, or protected status changed. [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/30857) in GitLab 13.3.
-- Compliance framework created, updated, or deleted. [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/340649) in GitLab 14.6.
+- Compliance framework created, updated, or deleted. [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/340649) in GitLab 14.5.
+- Event streaming destination created, updated, or deleted. [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/344664) in GitLab 14.6.
Group events can also be accessed via the [Group Audit Events API](../api/audit_events.md#group-audit-events)
@@ -128,6 +133,10 @@ From there, you can see the following actions:
- Release was updated
- Release milestone associations changed
- Permission to approve merge requests by committers was updated ([introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/7531) in GitLab 12.9)
+- Permission to approve merge requests by committers was updated.
+ - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/7531) in GitLab 12.9.
+ - Message for event [changed](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/72623/diffs) in GitLab 14.6.
+
- Permission to approve merge requests by authors was updated ([introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/7531) in GitLab 12.9)
- Number of required approvals was updated ([introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/7531) in GitLab 12.9)
- Added or removed users and groups from project approval groups ([introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/213603) in GitLab 13.2)
@@ -203,7 +212,7 @@ Events visible in Audit Events views until more events are logged.
### "Deleted User" events
-Audit events can be created for a user after the user is deleted. The user name associated with the event is set to
+Audit events can be created for a user after the user is deleted. The user name associated with the event is set to
"Deleted User" because the actual user name is unknowable. For example, if a deleted user's access to a project is
removed automatically due to expiration, the audit event is created for "Deleted User". We are [investigating](https://gitlab.com/gitlab-org/gitlab/-/issues/343933)
whether this is avoidable.
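
The `audit_events.md` change above points administrators at the `audit_json.log` file. As a rough illustration of reading such a file, the Ruby sketch below parses JSON lines and keeps the project-level events. The sample entries and the field names (`time`, `author_name`, `entity_type`, `change`, `from`, `to`) are assumptions made for this example, not a schema taken from this commit, so compare them with a real entry on your instance before relying on them.

```ruby
require "json"

# Sample input standing in for the contents of audit_json.log.
# The field names are assumptions for illustration only.
SAMPLE_LOG = <<~LOG
  {"time":"2021-12-20T13:37:47.000Z","author_name":"Alice","entity_type":"Project","change":"visibility","from":"Private","to":"Public"}
  {"time":"2021-12-20T13:40:02.000Z","author_name":"Bob","entity_type":"Group","change":"name","from":"old","to":"new"}
  not-json
LOG

# Parse one JSON document per line and silently skip lines that are not valid JSON.
def parse_audit_lines(text)
  text.each_line.map do |line|
    begin
      JSON.parse(line)
    rescue JSON::ParserError
      nil
    end
  end.compact
end

events = parse_audit_lines(SAMPLE_LOG)
project_changes = events.select { |event| event["entity_type"] == "Project" }

project_changes.each do |event|
  puts "#{event['time']} #{event['author_name']} changed #{event['change']}: #{event['from']} -> #{event['to']}"
end
```

To try this against a real file, pass `File.read(path)` to `parse_audit_lines` instead of `SAMPLE_LOG`.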
diff --git a/doc/administration/auth/atlassian.md b/doc/administration/auth/atlassian.md
index 14c48231a3d..5fa10c4c119 100644
--- a/doc/administration/auth/atlassian.md
+++ b/doc/administration/auth/atlassian.md
@@ -50,9 +50,10 @@ To enable the Atlassian OmniAuth provider for passwordless authentication you mu
gitlab_rails['omniauth_providers'] = [
{
name: "atlassian_oauth2",
+ # label: "Provider name", # optional label for login button, defaults to "Atlassian"
app_id: "YOUR_CLIENT_ID",
app_secret: "YOUR_CLIENT_SECRET",
- args: { scope: 'offline_access read:jira-user read:jira-work', prompt: 'consent' }
+ args: { scope: "offline_access read:jira-user read:jira-work", prompt: "consent" }
}
]
```
@@ -60,10 +61,12 @@ To enable the Atlassian OmniAuth provider for passwordless authentication you mu
For installations from source:
```yaml
- - name: "atlassian_oauth2",
- app_id: "YOUR_CLIENT_ID",
- app_secret: "YOUR_CLIENT_SECRET",
- args: { scope: 'offline_access read:jira-user read:jira-work', prompt: 'consent' }
+ - { name: "atlassian_oauth2",
+ # label: "Provider name", # optional label for login button, defaults to "Atlassian"
+ app_id: "YOUR_CLIENT_ID",
+ app_secret: "YOUR_CLIENT_SECRET",
+ args: { scope: "offline_access read:jira-user read:jira-work", prompt: "consent" }
+ }
```
1. Change `YOUR_CLIENT_ID` and `YOUR_CLIENT_SECRET` to the Client credentials you received in [application registration](#atlassian-application-registration) steps.
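
The step above asks you to replace `YOUR_CLIENT_ID` and `YOUR_CLIENT_SECRET`. A small pre-flight check can catch a placeholder that was left in by mistake. The helper below, `leftover_placeholders`, is hypothetical and not part of GitLab; it is a minimal sketch that assumes placeholders are strings beginning with `YOUR_`, as in the example configuration.

```ruby
# Hypothetical helper (not a GitLab API): walk a provider hash and return the
# dotted paths of string values that still start with "YOUR_".
def leftover_placeholders(value, path = [])
  case value
  when Hash
    value.flat_map { |key, child| leftover_placeholders(child, path + [key]) }
  when Array
    value.each_with_index.flat_map { |child, index| leftover_placeholders(child, path + [index]) }
  when String
    value.start_with?("YOUR_") ? [path.join(".")] : []
  else
    []
  end
end

# Example provider entry in which only the secret has been filled in.
provider = {
  name: "atlassian_oauth2",
  app_id: "YOUR_CLIENT_ID",
  app_secret: "example-secret-value",
  args: { scope: "offline_access read:jira-user read:jira-work", prompt: "consent" }
}

puts leftover_placeholders(provider).inspect
```

An empty result means no `YOUR_` placeholder remains in the entry.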
diff --git a/doc/administration/auth/authentiq.md b/doc/administration/auth/authentiq.md
index 19ee143a72a..4220e552196 100644
--- a/doc/administration/auth/authentiq.md
+++ b/doc/administration/auth/authentiq.md
@@ -36,12 +36,13 @@ Authentiq generates a Client ID and the accompanying Client Secret for you to us
```ruby
gitlab_rails['omniauth_providers'] = [
{
- "name" => "authentiq",
- "app_id" => "YOUR_CLIENT_ID",
- "app_secret" => "YOUR_CLIENT_SECRET",
- "args" => {
- "scope": 'aq:name email~rs address aq:push'
- }
+ name: "authentiq",
+ # label: "Provider name", # optional label for login button, defaults to "Authentiq"
+ app_id: "YOUR_CLIENT_ID",
+ app_secret: "YOUR_CLIENT_SECRET",
+ args: {
+ "scope": 'aq:name email~rs address aq:push'
+ }
}
]
```
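
The hunk above replaces hash-rocket string keys (`"name" => ...`) with symbol keys (`name: ...`) while keeping the quoted form `"scope": ...`. In plain Ruby these forms differ in key type: the hash-rocket form with a quoted key yields a `String` key, whereas both `name:` and `"scope":` yield `Symbol` keys. The sketch below only demonstrates that language behavior. That the two styles serialize identically is shown for JSON only; how GitLab consumes `gitlab.rb` is not described in this diff, so treat any conclusion about equivalence there as an assumption.

```ruby
require "json"

# Old style: string keys at the top level, quoted-symbol key inside args.
old_style = { "name" => "authentiq", "args" => { "scope": "aq:name email~rs address aq:push" } }

# New style: symbol keys throughout.
new_style = { name: "authentiq", args: { "scope": "aq:name email~rs address aq:push" } }

puts old_style.keys.map(&:class).inspect   # key classes of the old style
puts new_style.keys.map(&:class).inspect   # key classes of the new style
puts new_style[:args].keys.inspect         # "scope": is a Symbol key

# Both hashes serialize to the same JSON text.
same_json = JSON.generate(old_style) == JSON.generate(new_style)
puts same_json
```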
@@ -50,6 +51,7 @@ Authentiq generates a Client ID and the accompanying Client Secret for you to us
```yaml
- { name: 'authentiq',
+ # label: 'Provider name', # optional label for login button, defaults to "Authentiq"
app_id: 'YOUR_CLIENT_ID',
app_secret: 'YOUR_CLIENT_SECRET',
args: {
diff --git a/doc/administration/auth/cognito.md b/doc/administration/auth/cognito.md
index d137489a838..718a2919ed0 100644
--- a/doc/administration/auth/cognito.md
+++ b/doc/administration/auth/cognito.md
@@ -56,25 +56,25 @@ Include the code block in the `/etc/gitlab/gitlab.rb` file:
gitlab_rails['omniauth_allow_single_sign_on'] = ['cognito']
gitlab_rails['omniauth_providers'] = [
{
- "name" => "cognito",
- # "label" => "Cognito",
- # "icon" => nil, # Optional icon URL
- "app_id" => "CLIENT ID",
- "app_secret" => "CLIENT SECRET",
- "args" => {
- "scope" => "openid profile email",
+ name: "cognito",
+ label: "Provider name", # optional label for login button, defaults to "Cognito"
+ icon: nil, # Optional icon URL
+ app_id: "CLIENT ID",
+ app_secret: "CLIENT SECRET",
+ args: {
+ scope: "openid profile email",
client_options: {
- 'site' => 'https://your_domain.auth.your_region.amazoncognito.com',
- 'authorize_url' => '/oauth2/authorize',
- 'token_url' => '/oauth2/token',
- 'user_info_url' => '/oauth2/userInfo'
+ site: "https://your_domain.auth.your_region.amazoncognito.com",
+ authorize_url: "/oauth2/authorize",
+ token_url: "/oauth2/token",
+ user_info_url: "/oauth2/userInfo"
},
user_response_structure: {
root_path: [],
- id_path: ['sub'],
- attributes: { nickname: 'email', name: 'email', email: 'email' }
+ id_path: ["sub"],
+ attributes: { nickname: "email", name: "email", email: "email" }
},
- name: 'cognito',
+ name: "cognito",
strategy_class: "OmniAuth::Strategies::OAuth2Generic"
}
}
diff --git a/doc/administration/auth/crowd.md b/doc/administration/auth/crowd.md
index 466e208a52e..265bba8a9b1 100644
--- a/doc/administration/auth/crowd.md
+++ b/doc/administration/auth/crowd.md
@@ -46,11 +46,12 @@ this provider also allows Crowd authentication for Git-over-https requests.
```ruby
gitlab_rails['omniauth_providers'] = [
{
- "name" => "crowd",
- "args" => {
- "crowd_server_url" => "CROWD_SERVER_URL",
- "application_name" => "YOUR_APP_NAME",
- "application_password" => "YOUR_APP_PASSWORD"
+ name: "crowd",
+ # label: "Provider name", # optional label for login button, defaults to "Crowd"
+ args: {
+ crowd_server_url: "CROWD_SERVER_URL",
+ application_name: "YOUR_APP_NAME",
+ application_password: "YOUR_APP_PASSWORD"
}
}
]
@@ -60,6 +61,7 @@ this provider also allows Crowd authentication for Git-over-https requests.
```yaml
- { name: 'crowd',
+ # label: 'Provider name', # optional label for login button, defaults to "Crowd"
args: {
crowd_server_url: 'CROWD_SERVER_URL',
application_name: 'YOUR_APP_NAME',
diff --git a/doc/administration/auth/jwt.md b/doc/administration/auth/jwt.md
index 26e523cb802..9298b04cbc1 100644
--- a/doc/administration/auth/jwt.md
+++ b/doc/administration/auth/jwt.md
@@ -8,7 +8,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
# JWT OmniAuth provider **(FREE SELF)**
To enable the JWT OmniAuth provider, you must register your application with JWT.
-JWT will provide you with a secret key for you to use.
+JWT provides you with a secret key for you to use.
1. On your GitLab server, open the configuration file.
@@ -32,14 +32,15 @@ JWT will provide you with a secret key for you to use.
```ruby
gitlab_rails['omniauth_providers'] = [
- { name: 'jwt',
+ { name: "jwt",
+ label: "Provider name", # optional label for login button, defaults to "Jwt"
args: {
- secret: 'YOUR_APP_SECRET',
- algorithm: 'HS256', # Supported algorithms: 'RS256', 'RS384', 'RS512', 'ES256', 'ES384', 'ES512', 'HS256', 'HS384', 'HS512'
- uid_claim: 'email',
- required_claims: ['name', 'email'],
- info_map: { name: 'name', email: 'email' },
- auth_url: 'https://example.com/',
+ secret: "YOUR_APP_SECRET",
+ algorithm: "HS256", # Supported algorithms: "RS256", "RS384", "RS512", "ES256", "ES384", "ES512", "HS256", "HS384", "HS512"
+ uid_claim: "email",
+ required_claims: ["name", "email"],
+ info_map: { name: "name", email: "email" },
+ auth_url: "https://example.com/",
valid_within: 3600 # 1 hour
}
}
@@ -50,6 +51,7 @@ JWT will provide you with a secret key for you to use.
```yaml
- { name: 'jwt',
+ label: 'Provider name', # optional label for login button, defaults to "Jwt"
args: {
secret: 'YOUR_APP_SECRET',
algorithm: 'HS256', # Supported algorithms: 'RS256', 'RS384', 'RS512', 'ES256', 'ES384', 'ES512', 'HS256', 'HS384', 'HS512'
@@ -72,9 +74,9 @@ JWT will provide you with a secret key for you to use.
installed GitLab via Omnibus or from source respectively.
On the sign in page there should now be a JWT icon below the regular sign in form.
-Click the icon to begin the authentication process. JWT will ask the user to
+Click the icon to begin the authentication process. JWT asks the user to
sign in and authorize the GitLab application. If everything goes well, the user
-will be redirected to GitLab and will be signed in.
+is redirected to GitLab and signed in.
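
To make the `secret`, `algorithm`, `required_claims`, and `valid_within` settings in the hunks above more concrete, here is a minimal sketch of issuing and checking an HS256 token with only the Ruby standard library. It is not the code GitLab or the OmniAuth JWT strategy runs. The helper names are hypothetical, and the reading of `valid_within` as a maximum age measured from an `iat` claim is an assumption.

```ruby
require "base64"
require "json"
require "openssl"

# Base64url-encode without padding, as JWT segments require.
def b64url(data)
  Base64.urlsafe_encode64(data, padding: false)
end

def hs256_signature(signing_input, secret)
  OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA256"), secret, signing_input)
end

# Compare two strings without returning early on the first differing byte.
def secure_equal?(left, right)
  return false unless left.bytesize == right.bytesize

  left.bytes.zip(right.bytes).reduce(0) { |acc, (x, y)| acc | (x ^ y) }.zero?
end

# Build a compact HS256 token from a claims hash.
def encode_hs256(claims, secret)
  header = { alg: "HS256", typ: "JWT" }
  signing_input = "#{b64url(JSON.generate(header))}.#{b64url(JSON.generate(claims))}"
  "#{signing_input}.#{b64url(hs256_signature(signing_input, secret))}"
end

# Check signature, required claims, and token age, then return the claims.
# The header's "alg" field is not inspected; this sketch handles HS256 only.
def decode_hs256(token, secret, required_claims:, valid_within:, now: Time.now.to_i)
  header_b64, claims_b64, signature_b64 = token.split(".")
  expected = b64url(hs256_signature("#{header_b64}.#{claims_b64}", secret))
  raise ArgumentError, "signature mismatch" unless secure_equal?(expected, signature_b64.to_s)

  claims = JSON.parse(Base64.urlsafe_decode64(claims_b64))
  missing = required_claims - claims.keys
  raise ArgumentError, "missing claims: #{missing.join(', ')}" unless missing.empty?

  issued_at = claims["iat"]
  fresh = issued_at.is_a?(Integer) && now - issued_at <= valid_within
  raise ArgumentError, "token too old or iat missing" unless fresh

  claims
end

token = encode_hs256({ "name" => "Example User", "email" => "user@example.com", "iat" => Time.now.to_i }, "YOUR_APP_SECRET")
claims = decode_hs256(token, "YOUR_APP_SECRET", required_claims: %w[name email], valid_within: 3600)
puts claims["email"]
```

A token signed with a different secret, missing `name` or `email`, or older than `valid_within` seconds raises `ArgumentError` in this sketch.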
<!-- ## Troubleshooting
diff --git a/doc/administration/auth/ldap/index.md b/doc/administration/auth/ldap/index.md
index 9047cfae1e9..f551c362784 100644
--- a/doc/administration/auth/ldap/index.md
+++ b/doc/administration/auth/ldap/index.md
@@ -23,7 +23,7 @@ Users added through LDAP:
- Take a [licensed seat](../../../subscriptions/self_managed/index.md#billable-users).
- Can authenticate with Git using either their GitLab username or their email and LDAP password,
- even if password authentication for Git
+ even if password authentication for Git
[is disabled](../../../user/admin_area/settings/sign_in_restrictions.md#password-authentication-enabled).
The LDAP DN is associated with existing GitLab users when:
@@ -41,7 +41,7 @@ If an existing GitLab user wants to enable LDAP sign-in for themselves, they sho
GitLab has multiple mechanisms to verify a user is still active in LDAP. If the user is no longer active in
LDAP, they are placed in an `ldap_blocked` status and are signed out. They are unable to sign in using any authentication provider until they are
-reactivated in LDAP.
+reactivated in LDAP.
Users are considered inactive in LDAP when they:
@@ -52,7 +52,8 @@ Users are considered inactive in LDAP when they:
Status is checked for all LDAP users:
-- When signing in using any authentication provider.
+- When signing in using any authentication provider. [In GitLab 14.4 and earlier](https://gitlab.com/gitlab-org/gitlab/-/issues/343298), status was
+ checked only when signing in using LDAP directly.
- Once per hour for active web sessions or Git requests using tokens or SSH keys.
- When performing Git over HTTP requests using LDAP username and password.
- Once per day during [User Sync](ldap_synchronization.md#user-sync).
@@ -221,6 +222,51 @@ These LDAP sync configuration settings are available:
| `external_groups` | An array of CNs of groups containing users that should be considered external. Not `cn=interns` or the full DN. | **{dotted-circle}** No | `['interns', 'contractors']` |
| `sync_ssh_keys` | The LDAP attribute containing a user's public SSH key. | **{dotted-circle}** No | `'sshPublicKey'` or false if not set |
+### Use multiple LDAP servers **(PREMIUM SELF)**
+
+If you have users on multiple LDAP servers, you can configure GitLab to use them. To add additional LDAP servers:
+
+1. Duplicate the [`main` LDAP configuration](#configure-ldap).
+1. Edit each duplicate configuration with the details of the additional servers.
+ - For each additional server, choose a different provider ID, like `main`, `secondary`, or `tertiary`. Use lowercase
+ alphanumeric characters. GitLab uses the provider ID to associate each user with a specific LDAP server.
+ - For each entry, use a unique `label` value. These values are used for the tab names on the sign-in page.
+
+#### Example of multiple LDAP servers
+
+The following example shows how to configure three LDAP servers in `gitlab.rb`:
+
+```ruby
+gitlab_rails['ldap_enabled'] = true
+gitlab_rails['ldap_servers'] = {
+'main' => {
+ 'label' => 'GitLab AD',
+ 'host' => 'ad.example.org',
+ 'port' => 636,
+ ...
+ },
+
+'secondary' => {
+ 'label' => 'GitLab Secondary AD',
+ 'host' => 'ad-secondary.example.net',
+ 'port' => 636,
+ ...
+ },
+
+'tertiary' => {
+ 'label' => 'GitLab Tertiary AD',
+ 'host' => 'ad-tertiary.example.net',
+ 'port' => 636,
+ ...
+ }
+
+}
+```
+
+This example results in the following sign-in page:
+
+![Multiple LDAP servers sign in](img/multi_login.gif)
+
### Set up LDAP user filter
To limit all GitLab access to a subset of the LDAP users on your LDAP server, first narrow the
@@ -451,56 +497,6 @@ If initially your LDAP configuration looked like:
1. [Restart GitLab](../../restart_gitlab.md#installations-from-source) for the changes to take effect.
-## Multiple LDAP servers **(PREMIUM SELF)**
-
-With GitLab, you can configure multiple LDAP servers that your GitLab instance
-connects to.
-
-To add another LDAP server:
-
-1. Duplicate the settings under [the main configuration](#configure-ldap).
-1. Edit them to match the additional LDAP server.
-
-Be sure to choose a different provider ID made of letters a-z and numbers 0-9.
-This ID is stored in the database so that GitLab can remember which LDAP
-server a user belongs to.
-
-![Multiple LDAP Servers Sign in](img/multi_login.gif)
-
-Based on the example illustrated on the image above,
-our `gitlab.rb` configuration would look like:
-
-```ruby
-gitlab_rails['ldap_enabled'] = true
-gitlab_rails['ldap_servers'] = {
-'main' => {
- 'label' => 'GitLab AD',
- 'host' => 'ad.example.org',
- 'port' => 636,
- ...
- },
-
-'secondary' => {
- 'label' => 'GitLab Secondary AD',
- 'host' => 'ad-secondary.example.net',
- 'port' => 636,
- ...
- },
-
-'tertiary' => {
- 'label' => 'GitLab Tertiary AD',
- 'host' => 'ad-tertiary.example.net',
- 'port' => 636,
- ...
- }
-
-}
-```
-
-If you configure multiple LDAP servers, use a unique naming convention for the
-`label` section of each entry. That label is used as the display name of the tab
-shown on the sign-in page.
-
## Disable anonymous LDAP authentication
GitLab doesn't support TLS client authentication. Complete these steps on your LDAP server.
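
The "Use multiple LDAP servers" section added earlier in this file's diff gives two rules: each provider ID uses lowercase alphanumeric characters, and each entry has a unique `label`. The following Ruby sketch checks a `ldap_servers`-style hash against those two rules. The helper `ldap_server_problems` is hypothetical and not part of GitLab, and the second server entry is deliberately invalid sample data.

```ruby
# Hypothetical pre-flight check (not a GitLab API) for a ldap_servers-style hash.
def ldap_server_problems(servers)
  problems = []

  # Rule 1: provider IDs consist of lowercase letters and digits only.
  servers.each_key do |provider_id|
    next if provider_id.match?(/\A[a-z0-9]+\z/)

    problems << "provider ID #{provider_id.inspect} must use only lowercase letters and digits"
  end

  # Rule 2: labels are unique, because they become tab names on the sign-in page.
  labels = servers.values.map { |server| server["label"] }
  labels.select { |label| labels.count(label) > 1 }.uniq.each do |label|
    problems << "label #{label.inspect} is used more than once"
  end

  problems
end

servers = {
  "main" => { "label" => "GitLab AD", "host" => "ad.example.org", "port" => 636 },
  "Secondary-1" => { "label" => "GitLab AD", "host" => "ad-secondary.example.net", "port" => 636 }
}

ldap_server_problems(servers).each { |problem| puts problem }
```

With the sample data, the check reports the malformed provider ID `Secondary-1` and the duplicated label `GitLab AD`.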
diff --git a/doc/administration/auth/ldap/ldap-troubleshooting.md b/doc/administration/auth/ldap/ldap-troubleshooting.md
index aa40060c4c1..63e4490e332 100644
--- a/doc/administration/auth/ldap/ldap-troubleshooting.md
+++ b/doc/administration/auth/ldap/ldap-troubleshooting.md
@@ -106,7 +106,7 @@ here are some questions to ask yourself:
- Does the user pass through the [configured `user_filter`](index.md#set-up-ldap-user-filter)?
If one is not configured, this question can be ignored. If it is, then the
user must also pass through this filter to be allowed to sign in.
- - Refer to our docs on [debugging the `user_filter`](#debug-ldap-user-filter).
+ - Refer to our documentation on [debugging the `user_filter`](#debug-ldap-user-filter).
If the above are both okay, the next place to look for the problem is
the logs themselves while reproducing the issue.
@@ -316,7 +316,7 @@ LDAP search error: No Such Object
User Update (0.4ms) UPDATE "users" SET "state" = $1, "updated_at" = $2 WHERE "users"."id" = $3 [["state", "ldap_blocked"], ["updated_at", "2019-10-18 15:46:22.902177"], ["id", 20]]
```
-Once the user is found in LDAP, the rest of the output updates the GitLab
+After the user is found in LDAP, the rest of the output updates the GitLab
database with any changes.
#### Query a user in LDAP
@@ -337,8 +337,8 @@ Gitlab::Auth::Ldap::Person.find_by_uid('<uid>', adapter)
#### Membership(s) not granted
Sometimes you may think a particular user should be added to a GitLab group via
-LDAP group sync, but for some reason it's not happening. There are several
-things to check to debug the situation.
+LDAP group sync, but for some reason it's not happening. You can check several
+things to debug the situation.
- Ensure LDAP configuration has a `group_base` specified.
[This configuration](ldap_synchronization.md#group-sync) is required for group sync to work properly.
@@ -421,7 +421,7 @@ Started syncing 'ldapmain' provider for 'my_group' group
```
The following entry shows an array of all user DNs GitLab sees in the LDAP server.
-These are the users for a single LDAP group, not a GitLab group. If
+These DNs are the users for a single LDAP group, not a GitLab group. If
you have multiple LDAP groups linked to this GitLab group, you see multiple
log entries like this - one for each LDAP group. If you don't see an LDAP user
DN in this log entry, LDAP is not returning the user when we do the lookup.
@@ -545,7 +545,7 @@ updates the stored DN to the new value so both values now match what's in
LDAP.
If the email has changed and the DN has not, GitLab finds the user with
-the DN and update its own record of the user's email to match the one in LDAP.
+the DN and updates its own record of the user's email to match the one in LDAP.
However, if the primary email _and_ the DN change in LDAP, then GitLab
has no way of identifying the correct LDAP record of the user and, as a
@@ -563,7 +563,7 @@ email address are removed first. This is because emails have to be unique in Git
Go to the [rails console](#rails-console) and then run:
```ruby
-# Each entry will have to include the old username and the new email
+# Each entry must include the old username and the new email
emails = {
'ORIGINAL_USERNAME' => 'NEW_EMAIL_ADDRESS',
...
@@ -582,8 +582,8 @@ for each of these users.
## Expired license causes errors with multiple LDAP servers
-Using [multiple LDAP servers](index.md#multiple-ldap-servers) requires a valid license. An expired
-license can cause:
+Using [multiple LDAP servers](index.md#use-multiple-ldap-servers) requires a valid license. An expired license can
+cause:
- `502` errors in the web interface.
- The following error in logs (the actual strategy name depends on the name configured in `/etc/gitlab/gitlab.rb`):
@@ -686,7 +686,7 @@ For more information, see the [official `ldapsearch` documentation](https://linu
### Using **AdFind** (Windows)
-You can use the [`AdFind`](https://social.technet.microsoft.com/wiki/contents/articles/7535.adfind-command-examples.aspx) utility (on Windows based systems) to test that your LDAP server is accessible and authentication is working correctly. This is a freeware utility built by [Joe Richards](http://www.joeware.net/freetools/tools/adfind/index.htm).
+You can use the [`AdFind`](https://social.technet.microsoft.com/wiki/contents/articles/7535.adfind-command-examples.aspx) utility (on Windows based systems) to test that your LDAP server is accessible and authentication is working correctly. AdFind is a freeware utility built by [Joe Richards](http://www.joeware.net/freetools/tools/adfind/index.htm).
**Return all objects**
@@ -719,9 +719,8 @@ For instructions about how to use the rails console, refer to this
#### Enable debug output
-This provides debug output that is useful to see
-what GitLab is doing and with what. This value is not persisted, and is only
-enabled for this session in the rails console.
+This provides debug output that shows what GitLab is doing and with what.
+This value is not persisted, and is only enabled for this session in the Rails console.
To enable debug output in the rails console, [enter the rails
console](#rails-console) and run:
diff --git a/doc/administration/auth/ldap/ldap_synchronization.md b/doc/administration/auth/ldap/ldap_synchronization.md
index 2673a8374ec..8ccd8fecbcf 100644
--- a/doc/administration/auth/ldap/ldap_synchronization.md
+++ b/doc/administration/auth/ldap/ldap_synchronization.md
@@ -127,8 +127,8 @@ following.
1. [Restart GitLab](../../restart_gitlab.md#installations-from-source) for the changes to take effect.
-To take advantage of group sync, group owners or maintainers must [create one
-or more LDAP group links](#add-group-links).
+To take advantage of group sync, group Owners or users with the [Maintainer role](../../../user/permissions.md) must
+[create one or more LDAP group links](#add-group-links).
### Add group links
diff --git a/doc/administration/auth/oidc.md b/doc/administration/auth/oidc.md
index b8c443ae4d4..7ab1f2f5feb 100644
--- a/doc/administration/auth/oidc.md
+++ b/doc/administration/auth/oidc.md
@@ -35,22 +35,23 @@ The OpenID Connect provides you with a client's details and secret for you to us
```ruby
gitlab_rails['omniauth_providers'] = [
- { 'name' => 'openid_connect',
- 'label' => '<your_oidc_label>',
- 'icon' => '<custom_provider_icon>',
- 'args' => {
- 'name' => 'openid_connect',
- 'scope' => ['openid','profile','email'],
- 'response_type' => 'code',
- 'issuer' => '<your_oidc_url>',
- 'discovery' => true,
- 'client_auth_method' => 'query',
- 'uid_field' => '<uid_field>',
- 'send_scope_to_token_endpoint' => 'false',
- 'client_options' => {
- 'identifier' => '<your_oidc_client_id>',
- 'secret' => '<your_oidc_client_secret>',
- 'redirect_uri' => '<your_gitlab_url>/users/auth/openid_connect/callback'
+ {
+ name: "openid_connect",
+ label: "Provider name", # optional label for login button, defaults to "Openid Connect"
+ icon: "<custom_provider_icon>",
+ args: {
+ name: "openid_connect",
+ scope: ["openid","profile","email"],
+ response_type: "code",
+ issuer: "<your_oidc_url>",
+ discovery: true,
+ client_auth_method: "query",
+ uid_field: "<uid_field>",
+ send_scope_to_token_endpoint: "false",
+ client_options: {
+ identifier: "<your_oidc_client_id>",
+ secret: "<your_oidc_client_secret>",
+ redirect_uri: "<your_gitlab_url>/users/auth/openid_connect/callback"
}
}
}
@@ -61,7 +62,7 @@ The OpenID Connect provides you with a client's details and secret for you to us
```yaml
- { name: 'openid_connect',
- label: '<your_oidc_label>',
+ label: 'Provider name', # optional label for login button, defaults to "Openid Connect"
icon: '<custom_provider_icon>',
args: {
name: 'openid_connect',
@@ -136,20 +137,20 @@ for more details:
```ruby
gitlab_rails['omniauth_providers'] = [
{
- 'name' => 'openid_connect',
- 'label' => 'Google OpenID',
- 'args' => {
- 'name' => 'openid_connect',
- 'scope' => ['openid', 'profile', 'email'],
- 'response_type' => 'code',
- 'issuer' => 'https://accounts.google.com',
- 'client_auth_method' => 'query',
- 'discovery' => true,
- 'uid_field' => 'preferred_username',
- 'client_options' => {
- 'identifier' => '<YOUR PROJECT CLIENT ID>',
- 'secret' => '<YOUR PROJECT CLIENT SECRET>',
- 'redirect_uri' => 'https://example.com/users/auth/openid_connect/callback',
+ name: "openid_connect",
+ label: "Google OpenID", # optional label for login button, defaults to "Openid Connect"
+ args: {
+ name: "openid_connect",
+ scope: ["openid", "profile", "email"],
+ response_type: "code",
+ issuer: "https://accounts.google.com",
+ client_auth_method: "query",
+ discovery: true,
+ uid_field: "preferred_username",
+ client_options: {
+ identifier: "<YOUR PROJECT CLIENT ID>",
+ secret: "<YOUR PROJECT CLIENT SECRET>",
+ redirect_uri: "https://example.com/users/auth/openid_connect/callback",
}
}
}
@@ -173,20 +174,20 @@ Example Omnibus configuration block:
```ruby
gitlab_rails['omniauth_providers'] = [
{
- 'name' => 'openid_connect',
- 'label' => 'Azure OIDC',
- 'args' => {
- 'name' => 'openid_connect',
- 'scope' => ['openid', 'profile', 'email'],
- 'response_type' => 'code',
- 'issuer' => 'https://login.microsoftonline.com/<YOUR-TENANT-ID>/v2.0',
- 'client_auth_method' => 'query',
- 'discovery' => true,
- 'uid_field' => 'preferred_username',
- 'client_options' => {
- 'identifier' => '<YOUR APP CLIENT ID>',
- 'secret' => '<YOUR APP CLIENT SECRET>',
- 'redirect_uri' => 'https://gitlab.example.com/users/auth/openid_connect/callback'
+ name: "openid_connect",
+ label: "Azure OIDC", # optional label for login button, defaults to "Openid Connect"
+ args: {
+ name: "openid_connect",
+ scope: ["openid", "profile", "email"],
+ response_type: "code",
+ issuer: "https://login.microsoftonline.com/<YOUR-TENANT-ID>/v2.0",
+ client_auth_method: "query",
+ discovery: true,
+ uid_field: "preferred_username",
+ client_options: {
+ identifier: "<YOUR APP CLIENT ID>",
+ secret: "<YOUR APP CLIENT SECRET>",
+ redirect_uri: "https://gitlab.example.com/users/auth/openid_connect/callback"
}
}
}
@@ -302,21 +303,21 @@ The trailing forward slash is required.
```ruby
gitlab_rails['omniauth_providers'] = [
{
- 'name' => 'openid_connect',
- 'label' => 'Azure B2C OIDC',
- 'args' => {
- 'name' => 'openid_connect',
- 'scope' => ['openid'],
- 'response_mode' => 'query',
- 'response_type' => 'id_token',
- 'issuer' => 'https://<YOUR-DOMAIN>/tfp/<YOUR-TENANT-ID>/b2c_1a_signup_signin/v2.0/',
- 'client_auth_method' => 'query',
- 'discovery' => true,
- 'send_scope_to_token_endpoint' => true,
- 'client_options' => {
- 'identifier' => '<YOUR APP CLIENT ID>',
- 'secret' => '<YOUR APP CLIENT SECRET>',
- 'redirect_uri' => 'https://gitlab.example.com/users/auth/openid_connect/callback'
+ name: "openid_connect",
+ label: "Azure B2C OIDC", # optional label for login button, defaults to "Openid Connect"
+ args: {
+ name: "openid_connect",
+ scope: ["openid"],
+ response_mode: "query",
+ response_type: "id_token",
+ issuer: "https://<YOUR-DOMAIN>/tfp/<YOUR-TENANT-ID>/b2c_1a_signup_signin/v2.0/",
+ client_auth_method: "query",
+ discovery: true,
+ send_scope_to_token_endpoint: true,
+ client_options: {
+ identifier: "<YOUR APP CLIENT ID>",
+ secret: "<YOUR APP CLIENT SECRET>",
+ redirect_uri: "https://gitlab.example.com/users/auth/openid_connect/callback"
}
}
}]
@@ -359,20 +360,20 @@ Example Omnibus configuration block:
```ruby
gitlab_rails['omniauth_providers'] = [
{
- 'name' => 'openid_connect',
- 'label' => 'Keycloak',
- 'args' => {
- 'name' => 'openid_connect',
- 'scope' => ['openid', 'profile', 'email'],
- 'response_type' => 'code',
- 'issuer' => 'https://keycloak.example.com/auth/realms/myrealm',
- 'client_auth_method' => 'query',
- 'discovery' => true,
- 'uid_field' => 'preferred_username',
- 'client_options' => {
- 'identifier' => '<YOUR CLIENT ID>',
- 'secret' => '<YOUR CLIENT SECRET>',
- 'redirect_uri' => 'https://gitlab.example.com/users/auth/openid_connect/callback'
+ name: "openid_connect",
+ label: "Keycloak", # optional label for login button, defaults to "Openid Connect"
+ args: {
+ name: "openid_connect",
+ scope: ["openid", "profile", "email"],
+ response_type: "code",
+ issuer: "https://keycloak.example.com/auth/realms/myrealm",
+ client_auth_method: "query",
+ discovery: true,
+ uid_field: "preferred_username",
+ client_options: {
+ identifier: "<YOUR CLIENT ID>",
+ secret: "<YOUR CLIENT SECRET>",
+ redirect_uri: "https://gitlab.example.com/users/auth/openid_connect/callback"
}
}
}
@@ -436,21 +437,21 @@ To use symmetric key encryption:
```ruby
gitlab_rails['omniauth_providers'] = [
{
- 'name' => 'openid_connect',
- 'label' => 'Keycloak',
- 'args' => {
- 'name' => 'openid_connect',
- 'scope' => ['openid', 'profile', 'email'],
- 'response_type' => 'code',
- 'issuer' => 'https://keycloak.example.com/auth/realms/myrealm',
- 'client_auth_method' => 'query',
- 'discovery' => true,
- 'uid_field' => 'preferred_username',
- 'jwt_secret_base64' => '<YOUR BASE64-ENCODED SECRET>',
- 'client_options' => {
- 'identifier' => '<YOUR CLIENT ID>',
- 'secret' => '<YOUR CLIENT SECRET>',
- 'redirect_uri' => 'https://gitlab.example.com/users/auth/openid_connect/callback'
+ name: "openid_connect",
+ label: "Keycloak", # optional label for login button, defaults to "Openid Connect"
+ args: {
+ name: "openid_connect",
+ scope: ["openid", "profile", "email"],
+ response_type: "code",
+ issuer: "https://keycloak.example.com/auth/realms/myrealm",
+ client_auth_method: "query",
+ discovery: true,
+ uid_field: "preferred_username",
+ jwt_secret_base64: "<YOUR BASE64-ENCODED SECRET>",
+ client_options: {
+ identifier: "<YOUR CLIENT ID>",
+ secret: "<YOUR CLIENT SECRET>",
+ redirect_uri: "https://gitlab.example.com/users/auth/openid_connect/callback"
}
}
}
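Reviewer note on the `oidc.md` hunks above: each one makes the same change, replacing string keys written with `=>` by symbol keys written with the `key:` shorthand. The following is a minimal plain-Ruby sketch of what that difference means. The variable names are illustrative only, and this diff does not show how Omnibus GitLab processes these hashes, so the sketch makes no claim about that.

```ruby
# Illustration only: plain Ruby, no GitLab code involved.
old_style = { 'name' => 'openid_connect', 'label' => 'Keycloak' } # String keys
new_style = { name: "openid_connect", label: "Keycloak" }         # Symbol keys

# A plain Hash keeps the two key types apart.
string_lookup = new_style['name'] # nil, because the key is the Symbol :name
symbol_lookup = new_style[:name]  # "openid_connect"

# Converting the keys shows that both styles carry the same data.
same_data = old_style.transform_keys(&:to_sym) == new_style
```

The values are unchanged by the commit. Only the key type and the quoting style differ.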
diff --git a/doc/administration/cicd.md b/doc/administration/cicd.md
index d53290f1d5d..a7bd07d5d38 100644
--- a/doc/administration/cicd.md
+++ b/doc/administration/cicd.md
@@ -58,9 +58,9 @@ For Omnibus GitLab installations:
sudo gitlab-ctl reconfigure
```
-## Set the `needs:` job limit **(FREE SELF)**
+## Set the `needs` job limit **(FREE SELF)**
-The maximum number of jobs that can be defined in `needs:` defaults to 50.
+The maximum number of jobs that can be defined in `needs` defaults to 50.
A GitLab administrator with [access to the GitLab Rails console](operations/rails_console.md#starting-a-rails-console-session)
can choose a custom limit. For example, to set the limit to `100`:
diff --git a/doc/administration/clusters/kas.md b/doc/administration/clusters/kas.md
index 93b24007de8..b5c0a6ee76a 100644
--- a/doc/administration/clusters/kas.md
+++ b/doc/administration/clusters/kas.md
@@ -4,13 +4,15 @@ group: Configure
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---
-# Install the Kubernetes Agent Server (KAS) **(PREMIUM SELF)**
+# Install the GitLab Agent Server (KAS) **(FREE SELF)**
-The Kubernetes Agent Server (KAS) is a GitLab backend service dedicated to
-managing [Kubernetes Agents](../../user/clusters/agent/index.md).
+> [Moved](https://gitlab.com/groups/gitlab-org/-/epics/6290) from GitLab Premium to GitLab Free in 14.5.
+
+The GitLab Agent Server (KAS) is a GitLab backend service dedicated to
+managing the [GitLab Agent](../../user/clusters/agent/index.md).
The KAS is already installed and available in GitLab.com under `wss://kas.gitlab.com`.
-See [how to use GitLab.com's KAS](../../user/clusters/agent/install/index.md#set-up-the-kubernetes-agent-server).
+See [how to use GitLab.com's KAS](../../user/clusters/agent/install/index.md#set-up-the-agent-server).
This document describes how to install a KAS for GitLab self-managed instances.
## Installation options
@@ -27,7 +29,7 @@ You can also opt to use an [external KAS](#use-an-external-kas-installation).
For [Omnibus](https://docs.gitlab.com/omnibus/) package installations:
-1. Edit `/etc/gitlab/gitlab.rb` to enable the Kubernetes Agent Server:
+1. Edit `/etc/gitlab/gitlab.rb` to enable the Agent Server:
```ruby
gitlab_kas['enable'] = true
diff --git a/doc/administration/compliance.md b/doc/administration/compliance.md
index a05495c024e..7cecc0c30fd 100644
--- a/doc/administration/compliance.md
+++ b/doc/administration/compliance.md
@@ -73,7 +73,7 @@ These features can also help with compliance requirements:
- [**Generate reports on permission levels of users**](../user/admin_area/index.md#user-permission-export) (for
instances): Administrators can generate a report listing all users' access permissions for groups and projects in the
instance.
-- [**Lock project membership to group**](../user/group/index.md#prevent-members-from-being-added-to-a-group) (for
+- [**Lock project membership to group**](../user/group/index.md#prevent-members-from-being-added-to-projects-in-a-group) (for
groups): Group owners can prevent new members from being added to projects within a group.
- [**LDAP group sync**](auth/ldap/ldap_synchronization.md#group-sync) (for instances): Gives administrators the ability
to automatically sync groups and manage SSH keys, permissions, and authentication, so you can focus on building your
diff --git a/doc/administration/database_load_balancing.md b/doc/administration/database_load_balancing.md
index 45f27a8a8f2..92b8342f251 100644
--- a/doc/administration/database_load_balancing.md
+++ b/doc/administration/database_load_balancing.md
@@ -1,272 +1,9 @@
---
-stage: Enablement
-group: Database
-info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+redirect_to: 'postgresql/database_load_balancing.md'
+remove_date: '2022-02-19'
---
-# Database Load Balancing **(FREE SELF)**
+This file was moved to [another location](postgresql/database_load_balancing.md).
-> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/1283) in [GitLab Premium](https://about.gitlab.com/pricing/) 9.0.
-> - [Moved](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/60894) from GitLab Premium to GitLab Free in 14.0.
-
-Distribute read-only queries among multiple database servers.
-
-## Overview
-
-Database load balancing improves the distribution of database workloads across
-multiple computing resources. Load balancing aims to optimize resource use,
-maximize throughput, minimize response time, and avoid overload of any single
-resource. Using multiple components with load balancing instead of a single
-component may increase reliability and availability through redundancy.
-[_Wikipedia article_](https://en.wikipedia.org/wiki/Load_balancing_(computing))
-
-When database load balancing is enabled in GitLab, the load is balanced using
-a simple round-robin algorithm, without any external dependencies such as Redis.
-
-In the following image, you can see the load is balanced rather evenly among
-all the secondaries (`db4`, `db5`, `db6`). Because `SELECT` queries are not
-sent to the primary (unless necessary), the primary (`db3`) hardly has any load.
-
-![DB load balancing graph](img/db_load_balancing_postgres_stats.png)
-
-## Requirements
-
-For load balancing to work, you need at least PostgreSQL 11 or newer,
-[**MySQL is not supported**](../install/requirements.md#database). You also need to make sure that you have
-at least 1 secondary in [hot standby](https://www.postgresql.org/docs/11/hot-standby.html) mode.
-
-Load balancing also requires that the configured hosts **always** point to the
-primary, even after a database failover. Furthermore, the additional hosts to
-balance load among must **always** point to secondary databases. This means that
-you should put a load balancer in front of every database, and have GitLab connect
-to those load balancers.
-
-For example, say you have a primary (`db1.gitlab.com`) and two secondaries,
-`db2.gitlab.com` and `db3.gitlab.com`. For this setup, you need to have 3
-load balancers, one for every host. For example:
-
-- `primary.gitlab.com` forwards to `db1.gitlab.com`
-- `secondary1.gitlab.com` forwards to `db2.gitlab.com`
-- `secondary2.gitlab.com` forwards to `db3.gitlab.com`
-
-Now let's say that a failover happens and db2 becomes the new primary. This
-means forwarding should now happen as follows:
-
-- `primary.gitlab.com` forwards to `db2.gitlab.com`
-- `secondary1.gitlab.com` forwards to `db1.gitlab.com`
-- `secondary2.gitlab.com` forwards to `db3.gitlab.com`
-
-GitLab does not take care of this for you, so you need to do so yourself.
-
-Finally, load balancing requires that GitLab can connect to all hosts using the
-same credentials and port as configured in the
-[Enabling load balancing](#enabling-load-balancing) section. Using
-different ports or credentials for different hosts is not supported.
-
-## Use cases
-
-- For GitLab instances with thousands of users and high traffic, you can use
- database load balancing to reduce the load on the primary database and
- increase responsiveness, thus resulting in faster page load inside GitLab.
-
-## Enabling load balancing
-
-For the environment in which you want to use load balancing, you'll need to add
-the following. This balances the load between `host1.example.com` and
-`host2.example.com`.
-
-**In Omnibus installations:**
-
-1. Edit `/etc/gitlab/gitlab.rb` and add the following line:
-
- ```ruby
- gitlab_rails['db_load_balancing'] = { 'hosts' => ['host1.example.com', 'host2.example.com'] }
- ```
-
-1. Save the file and [reconfigure GitLab](restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
-
----
-
-**In installations from source:**
-
-1. Edit `/home/git/gitlab/config/database.yml` and add or amend the following lines:
-
- ```yaml
- production:
- username: gitlab
- database: gitlab
- encoding: unicode
- load_balancing:
- hosts:
- - host1.example.com
- - host2.example.com
- ```
-
-1. Save the file and [restart GitLab](restart_gitlab.md#installations-from-source) for the changes to take effect.
-
-### Load balancing for Sidekiq
-
-> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/334494) in GitLab 14.1, load balancing for Sidekick is enabled by default.
-
-Sidekiq jobs mostly write to the primary database, but there are read-only jobs that can benefit
-from the use of Sidekiq load balancing.
-These jobs can use load balancing and database replicas to read the application state.
-This allows to offload the primary database.
-
-For Sidekiq, we can define
-[data consistency](../development/sidekiq_style_guide.md#job-data-consistency-strategies)
-requirements for a specific job.
-
-## Service Discovery **(PREMIUM SELF)**
-
-> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/5883) in GitLab 11.0.
-
-Service discovery allows GitLab to automatically retrieve a list of secondary
-databases to use, instead of having to manually specify these in the
-`database.yml` configuration file. Service discovery works by periodically
-checking a DNS A record, using the IPs returned by this record as the addresses
-for the secondaries. For service discovery to work, all you need is a DNS server
-and an A record containing the IP addresses of your secondaries.
-
-To use service discovery you need to change your `database.yml` configuration
-file so it looks like the following:
-
-```yaml
-production:
- username: gitlab
- database: gitlab
- encoding: unicode
- load_balancing:
- discover:
- nameserver: localhost
- record: secondary.postgresql.service.consul
- record_type: A
- port: 8600
- interval: 60
- disconnect_timeout: 120
-```
-
-Here, the `discover:` section specifies the configuration details to use for
-service discovery.
-
-### Configuration
-
-The following options can be set:
-
-| Option | Description | Default |
-|----------------------|---------------------------------------------------------------------------------------------------|-----------|
-| `nameserver` | The nameserver to use for looking up the DNS record. | localhost |
-| `record` | The record to look up. This option is required for service discovery to work. | |
-| `record_type` | Optional record type to look up, this can be either A or SRV (GitLab 12.3 and later) | A |
-| `port` | The port of the nameserver. | 8600 |
-| `interval` | The minimum time in seconds between checking the DNS record. | 60 |
-| `disconnect_timeout` | The time in seconds after which an old connection is closed, after the list of hosts was updated. | 120 |
-| `use_tcp` | Lookup DNS resources using TCP instead of UDP | false |
-
-If `record_type` is set to `SRV`, then GitLab continues to use round-robin algorithm
-and ignores the `weight` and `priority` in the record. Since SRV records usually
-return hostnames instead of IPs, GitLab needs to look for the IPs of returned hostnames
-in the additional section of the SRV response. If no IP is found for a hostname, GitLab
-needs to query the configured `nameserver` for ANY record for each such hostname looking for A or AAAA
-records, eventually dropping this hostname from rotation if it can't resolve its IP.
-
-The `interval` value specifies the _minimum_ time between checks. If the A
-record has a TTL greater than this value, then service discovery honors said
-TTL. For example, if the TTL of the A record is 90 seconds, then service
-discovery waits at least 90 seconds before checking the A record again.
-
-When the list of hosts is updated, it might take a while for the old connections
-to be terminated. The `disconnect_timeout` setting can be used to enforce an
-upper limit on the time it takes to terminate all old database connections.
-
-Some nameservers (like [Consul](https://www.consul.io/docs/discovery/dns#udp-based-dns-queries)) can return a truncated list of hosts when
-queried over UDP. To overcome this issue, you can use TCP for querying by setting
-`use_tcp` to `true`.
-
-## Balancing queries
-
-Read-only `SELECT` queries balance among all the secondary hosts.
-Everything else (including transactions) executes on the primary.
-Queries such as `SELECT ... FOR UPDATE` are also executed on the primary.
-
-## Prepared statements
-
-Prepared statements don't work well with load balancing and are disabled
-automatically when load balancing is enabled. This should have no impact on
-response timings.
-
-## Primary sticking
-
-After a write has been performed, GitLab sticks to using the primary for a
-certain period of time, scoped to the user that performed the write. GitLab
-reverts back to using secondaries when they have either caught up, or after 30
-seconds.
-
-## Failover handling
-
-In the event of a failover or an unresponsive database, the load balancer
-tries to use the next available host. If no secondaries are available the
-operation is performed on the primary instead.
-
-If a connection error occurs while writing data, the
-operation is retried up to 3 times using an exponential back-off.
-
-When using load balancing, you should be able to safely restart a database server
-without it immediately leading to errors being presented to the users.
-
-## Logging
-
-The load balancer logs various events in
-[`database_load_balancing.log`](logs.md#database_load_balancinglog), such as
-
-- When a host is marked as offline
-- When a host comes back online
-- When all secondaries are offline
-- When a read is retried on a different host due to a query conflict
-
-The log is structured with each entry a JSON object containing at least:
-
-- An `event` field useful for filtering.
-- A human-readable `message` field.
-- Some event-specific metadata. For example, `db_host`
-- Contextual information that is always logged. For example, `severity` and `time`.
-
-For example:
-
-```json
-{"severity":"INFO","time":"2019-09-02T12:12:01.728Z","correlation_id":"abcdefg","event":"host_online","message":"Host came back online","db_host":"111.222.333.444","db_port":null,"tag":"rails.database_load_balancing","environment":"production","hostname":"web-example-1","fqdn":"gitlab.example.com","path":null,"params":null}
-```
-
-## Handling Stale Reads **(PREMIUM SELF)**
-
-> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/3526) in GitLab 10.3.
-
-To prevent reading from an outdated secondary the load balancer checks if it
-is in sync with the primary. If the data is determined to be recent enough the
-secondary is used, otherwise it is ignored. To reduce the overhead of
-these checks we only perform these checks at certain intervals.
-
-There are three configuration options that influence this behavior:
-
-| Option | Description | Default |
-|------------------------------|----------------------------------------------------------------------------------------------------------------|------------|
-| `max_replication_difference` | The amount of data (in bytes) a secondary is allowed to lag behind when it hasn't replicated data for a while. | 8 MB |
-| `max_replication_lag_time` | The maximum number of seconds a secondary is allowed to lag behind before we stop using it. | 60 seconds |
-| `replica_check_interval` | The minimum number of seconds we have to wait before checking the status of a secondary. | 60 seconds |
-
-The defaults should be sufficient for most users. Should you want to change them
-you can specify them in `config/database.yml` like so:
-
-```yaml
-production:
- username: gitlab
- database: gitlab
- encoding: unicode
- load_balancing:
- hosts:
- - host1.example.com
- - host2.example.com
- max_replication_difference: 16777216 # 16 MB
- max_replication_lag_time: 30
- replica_check_interval: 30
-```
+<!-- This redirect file can be deleted after <2022-02-19>. -->
+<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
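Reviewer note on the Service Discovery text moved out of this path: it describes `interval` as a minimum that a longer DNS TTL overrides. That rule can be sketched as below. `next_check_delay` is a hypothetical helper written for this note, not a GitLab method, and the real implementation may differ.

```ruby
# Hypothetical helper, not GitLab code: the wait before the next DNS check
# is the larger of the configured interval and the record's TTL.
def next_check_delay(interval:, ttl:)
  [interval, ttl].max
end

delay_long_ttl = next_check_delay(interval: 60, ttl: 90)  # TTL is larger
delay_short_ttl = next_check_delay(interval: 60, ttl: 15) # interval is larger
```

With the documented default `interval` of 60 seconds and a 90-second TTL, the first call matches the example in the moved text, which waits at least 90 seconds.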
diff --git a/doc/administration/environment_variables.md b/doc/administration/environment_variables.md
index 21e32d145bd..22159b6e9db 100644
--- a/doc/administration/environment_variables.md
+++ b/doc/administration/environment_variables.md
@@ -31,6 +31,7 @@ You can use the following environment variables to override certain values:
| `GITLAB_EMAIL_REPLY_TO` | string | The email address used in the **Reply-To** field in emails sent by GitLab. |
| `GITLAB_EMAIL_SUBJECT_SUFFIX` | string | The email subject suffix used in emails sent by GitLab. |
| `GITLAB_HOST` | string | The full URL of the GitLab server (including `http://` or `https://`). |
+| `GITLAB_MARKUP_TIMEOUT` | string | Timeout, in seconds, for `rest2html` and `pod2html` commands executed by the [`gitlab-markup` gem](https://gitlab.com/gitlab-org/gitlab-markup/). Default is `10`. |
| `GITLAB_ROOT_PASSWORD` | string | Sets the password for the `root` user on installation. |
| `GITLAB_SHARED_RUNNERS_REGISTRATION_TOKEN` | string | Sets the initial registration token used for runners. |
| `RAILS_ENV` | string | The Rails environment; can be one of `production`, `development`, `staging`, or `test`. |
diff --git a/doc/administration/geo/disaster_recovery/runbooks/planned_failover_multi_node.md b/doc/administration/geo/disaster_recovery/runbooks/planned_failover_multi_node.md
index 3c7af309f78..b207be47aa1 100644
--- a/doc/administration/geo/disaster_recovery/runbooks/planned_failover_multi_node.md
+++ b/doc/administration/geo/disaster_recovery/runbooks/planned_failover_multi_node.md
@@ -60,11 +60,11 @@ What is not covered:
NOTE:
Before following any of those steps, make sure you have `root` access to the
-**secondary** to promote it, since there isn't provided an automated way to
+**secondary** to promote it, because no automated way is provided to
promote a Geo replica and perform a failover.
NOTE:
-GitLab 13.9 through GitLab 14.3 are affected by a bug in which the Geo secondary site statuses will appear to stop updating and become unhealthy. For more information, see [Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](../../replication/troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
+GitLab 13.9 through GitLab 14.3 are affected by a bug in which the Geo secondary site statuses appear to stop updating and become unhealthy. For more information, see [Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](../../replication/troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
On the **secondary** site:
@@ -88,7 +88,7 @@ A common cause of replication failures is the data being missing on the
**primary** site - you can resolve these failures by restoring the data from backup,
or removing references to the missing data.
-The maintenance window won't end until Geo replication and verification is
+The maintenance window doesn't end until Geo replication and verification is
completely finished. To keep the window as short as possible, you should
ensure these processes are as close to 100% as possible during active use.
@@ -122,7 +122,7 @@ follow these steps to avoid unnecessary data loss:
From this point, users are unable to view their data or make changes on the
**primary** site. They are also unable to log in to the **secondary** site.
- However, existing sessions need to work for the remainder of the maintenance period, and
+ However, existing sessions must work for the remainder of the maintenance period, and
so public data is accessible throughout.
1. Verify the **primary** site is blocked to HTTP traffic by visiting it in browser via
@@ -135,10 +135,10 @@ follow these steps to avoid unnecessary data loss:
1. On the **primary** site:
1. On the top bar, select **Menu > Admin**.
1. On the left sidebar, select **Monitoring > Background Jobs**.
- 1. On the Sidekiq dhasboard, select **Cron**.
+ 1. On the Sidekiq dashboard, select **Cron**.
1. Select `Disable All` to disable any non-Geo periodic background jobs.
1. Select `Enable` for the `geo_sidekiq_cron_config_worker` cron job.
- This job will re-enable several other cron jobs that are essential for planned
+ This job re-enables several other cron jobs that are essential for planned
failover to complete successfully.
1. Finish replicating and verifying all data:
@@ -176,7 +176,7 @@ follow these steps to avoid unnecessary data loss:
At this point, your **secondary** site contains an up-to-date copy of everything the
**primary** site has, meaning nothing is lost when you fail over.
-1. In this final step, you need to permanently disable the **primary** site.
+1. In this final step, you must permanently disable the **primary** site.
WARNING:
When the **primary** site goes offline, there may be data saved on the **primary** site
@@ -204,7 +204,7 @@ follow these steps to avoid unnecessary data loss:
```
NOTE:
- (**CentOS only**) In CentOS 6 or older, there is no easy way to prevent GitLab from being
+ (**CentOS only**) In CentOS 6 or older, it is challenging to prevent GitLab from being
started if the machine reboots (see [Omnibus GitLab issue #3058](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/3058)).
It may be safest to uninstall the GitLab package completely with `sudo yum remove gitlab-ee`.
@@ -216,7 +216,7 @@ follow these steps to avoid unnecessary data loss:
- If you do not have SSH access to the **primary** site, take the machine offline and
prevent it from rebooting. Since there are many ways you may prefer to accomplish
- this, we avoid a single recommendation. You may need to:
+ this, we avoid a single recommendation. You may have to:
- Reconfigure the load balancers.
- Change DNS records (for example, point the **primary** DNS record to the
diff --git a/doc/administration/geo/disaster_recovery/runbooks/planned_failover_single_node.md b/doc/administration/geo/disaster_recovery/runbooks/planned_failover_single_node.md
index 8a4f2ed4306..5a6f9eb8be7 100644
--- a/doc/administration/geo/disaster_recovery/runbooks/planned_failover_single_node.md
+++ b/doc/administration/geo/disaster_recovery/runbooks/planned_failover_single_node.md
@@ -52,7 +52,7 @@ Before following any of those steps, make sure you have `root` access to the
promote a Geo replica and perform a failover.
NOTE:
-GitLab 13.9 through GitLab 14.3 are affected by a bug in which the Geo secondary site statuses will appear to stop updating and become unhealthy. For more information, see [Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](../../replication/troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
+GitLab 13.9 through GitLab 14.3 are affected by a bug in which the Geo secondary site statuses appear to stop updating and become unhealthy. For more information, see [Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](../../replication/troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
On the **secondary** site, navigate to the **Admin Area > Geo** dashboard to
review its status. Replicated objects (shown in green) should be close to 100%,
@@ -73,7 +73,7 @@ A common cause of replication failures is the data being missing on the
**primary** site - you can resolve these failures by restoring the data from backup,
or removing references to the missing data.
-The maintenance window won't end until Geo replication and verification is
+The maintenance window does not end until Geo replication and verification is
completely finished. To keep the window as short as possible, you should
ensure these processes are as close to 100% as possible during active use.
@@ -123,7 +123,7 @@ follow these steps to avoid unnecessary data loss:
1. On the Sidekiq dashboard, select **Cron**.
1. Select `Disable All` to disable any non-Geo periodic background jobs.
1. Select `Enable` for the `geo_sidekiq_cron_config_worker` cron job.
- This job will re-enable several other cron jobs that are essential for planned
+ This job re-enables several other cron jobs that are essential for planned
failover to complete successfully.
1. Finish replicating and verifying all data:
diff --git a/doc/administration/geo/index.md b/doc/administration/geo/index.md
index 30d8d765dc5..2cb1a424ce8 100644
--- a/doc/administration/geo/index.md
+++ b/doc/administration/geo/index.md
@@ -2,17 +2,19 @@
stage: Enablement
group: Geo
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
-type: howto
---
# Geo **(PREMIUM SELF)**
-Geo is the solution for widely distributed development teams and for providing a warm-standby as part of a disaster recovery strategy.
+Geo is the solution for widely distributed development teams and for providing
+a warm-standby as part of a disaster recovery strategy.
## Overview
WARNING:
-Geo undergoes significant changes from release to release. Upgrades **are** supported and [documented](#updating-geo), but you should ensure that you're using the right version of the documentation for your installation.
+Geo undergoes significant changes from release to release. Upgrades are
+supported and [documented](#updating-geo), but you should ensure that you're
+using the right version of the documentation for your installation.
Fetching large repositories can take a long time for teams located far from a single GitLab instance.
@@ -69,8 +71,9 @@ Keep in mind that:
- **Secondary** sites talk to the **primary** site to:
- Get user data for logins (API).
- Replicate repositories, LFS Objects, and Attachments (HTTPS + JWT).
-- In GitLab Premium 10.0 and later, the **primary** site no longer talks to **secondary** sites to notify for changes (API).
-- Pushing directly to a **secondary** site (for both HTTP and SSH, including Git LFS) was [introduced](https://about.gitlab.com/releases/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3.
+- The **primary** site doesn't talk to **secondary** sites to notify for changes (API).
+- You can push directly to a **secondary** site (for both HTTP and SSH,
+ including Git LFS).
- There are [limitations](#limitations) when using Geo.
### Architecture
@@ -111,21 +114,18 @@ In **secondary** sites, there is an additional daemon: [Geo Log Cursor](#geo-log
The following are required to run Geo:
-- An operating system that supports OpenSSH 6.9+ (needed for
+- An operating system that supports OpenSSH 6.9 or later (needed for
[fast lookup of authorized SSH keys in the database](../operations/fast_ssh_key_lookup.md))
The following operating systems are known to ship with a current version of OpenSSH:
- - [CentOS](https://www.centos.org) 7.4+
- - [Ubuntu](https://ubuntu.com) 16.04+
-- PostgreSQL 12+ with [Streaming Replication](https://wiki.postgresql.org/wiki/Streaming_Replication)
-- Git 2.9+
-- Git-lfs 2.4.2+ on the user side when using LFS
+ - [CentOS](https://www.centos.org) 7.4 or later
+ - [Ubuntu](https://ubuntu.com) 16.04 or later
+- PostgreSQL 12 or later with [Streaming Replication](https://wiki.postgresql.org/wiki/Streaming_Replication)
+- Git 2.9 or later
+- Git-lfs 2.4.2 or later on the user side when using LFS
- All sites must run the same GitLab version.
Additionally, check the GitLab [minimum requirements](../../install/requirements.md),
-and we recommend you use:
-
-- At least GitLab Enterprise Edition 10.0 for basic Geo features.
-- The latest version for a better experience.
+and we recommend you use the latest version of GitLab for a better experience.
### Firewall rules
@@ -311,7 +311,8 @@ For answers to common questions, see the [Geo FAQ](replication/faq.md).
## Log files
-In GitLab 9.5 and later, Geo stores structured log messages in a `geo.log` file. For Omnibus installations, this file is at `/var/log/gitlab/gitlab-rails/geo.log`.
+Geo stores structured log messages in a `geo.log` file. For Omnibus GitLab
+installations, this file is at `/var/log/gitlab/gitlab-rails/geo.log`.
This file contains information about when Geo attempts to sync repositories and files. Each line in the file contains a separate JSON entry that can be ingested into, for example, Elasticsearch or Splunk.
diff --git a/doc/administration/geo/replication/configuration.md b/doc/administration/geo/replication/configuration.md
index 88f1ad5b490..3cbde77903d 100644
--- a/doc/administration/geo/replication/configuration.md
+++ b/doc/administration/geo/replication/configuration.md
@@ -247,7 +247,7 @@ You can safely skip this step if your **primary** site uses a CA-issued HTTPS ce
If your **primary** site is using a self-signed certificate for *HTTPS* support, you
need to add that certificate to the **secondary** site's trust store. Retrieve the
certificate from the **primary** site and follow
-[these instructions](https://docs.gitlab.com/omnibus/settings/ssl.html)
+[these instructions](https://docs.gitlab.com/omnibus/settings/ssl.html#install-custom-public-certificates)
on the **secondary** site.
### Step 5. Enable Git access over HTTP/HTTPS
diff --git a/doc/administration/geo/replication/datatypes.md b/doc/administration/geo/replication/datatypes.md
index c98436157fc..31bc473d74b 100644
--- a/doc/administration/geo/replication/datatypes.md
+++ b/doc/administration/geo/replication/datatypes.md
@@ -5,7 +5,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
type: howto
---
-# Geo data types support **(PREMIUM SELF)**
+# Supported Geo data types **(PREMIUM SELF)**
A Geo data type is a specific class of data that is required by one or more GitLab features to
store relevant information.
@@ -14,7 +14,7 @@ To replicate data produced by these features with Geo, we use several strategies
## Data types
-We currently distinguish between three different data types:
+We distinguish between three different data types:
- [Git repositories](#git-repositories)
- [Blobs](#blobs)
@@ -35,9 +35,9 @@ verification methods:
| Git | Project Snippets | Geo with Gitaly | Gitaly Checksum |
| Git | Personal Snippets | Geo with Gitaly | Gitaly Checksum |
| Git | Group wiki repository | Geo with Gitaly | _Not implemented_ |
-| Blobs | User uploads _(file system)_ | Geo with API | _Not implemented_ |
+| Blobs | User uploads _(file system)_ | Geo with API | SHA256 checksum |
| Blobs | User uploads _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ |
-| Blobs | LFS objects _(file system)_ | Geo with API | _Not implemented_ |
+| Blobs | LFS objects _(file system)_ | Geo with API | SHA256 checksum |
| Blobs | LFS objects _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ |
| Blobs | CI job artifacts _(file system)_ | Geo with API | _Not implemented_ |
| Blobs | CI job artifacts _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ |
@@ -51,11 +51,11 @@ verification methods:
| Blobs | Infrastructure registry _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ |
| Blobs | Versioned Terraform State _(file system)_ | Geo with API | SHA256 checksum |
| Blobs | Versioned Terraform State _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ |
-| Blobs | External Merge Request Diffs _(file system)_ | Geo with API | _Not implemented_ |
+| Blobs | External Merge Request Diffs _(file system)_ | Geo with API | SHA256 checksum |
| Blobs | External Merge Request Diffs _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ |
| Blobs | Pipeline artifacts _(file system)_ | Geo with API | SHA256 checksum |
-| Blobs | Pipeline artifacts _(object storage)_ | Geo with API/Managed (*2*) | SHA256 checksum |
-| Blobs | Pages _(file system)_ | Geo with API | _Not implemented_ |
+| Blobs | Pipeline artifacts _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ |
+| Blobs | Pages _(file system)_ | Geo with API | SHA256 checksum |
| Blobs | Pages _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ |
- (*1*): Redis replication can be used as part of HA with Redis sentinel. It's not used between Geo sites.
@@ -66,15 +66,20 @@ verification methods:
A GitLab instance can have one or more repository shards. Each shard has a Gitaly instance that
is responsible for allowing access and operations on the locally stored Git repositories. It can run
-on a machine with a single disk, multiple disks mounted as a single mount-point (like with a RAID array),
-or using LVM.
+on a machine:
-It requires no special file system and can work with NFS or a mounted Storage Appliance (there may be
-performance limitations when using a remote file system).
+- With a single disk.
+- With multiple disks mounted as a single mount-point (like with a RAID array).
+- Using LVM.
-Geo will trigger garbage collection in Gitaly to [deduplicate forked repositories](../../../development/git_object_deduplication.md#git-object-deduplication-and-gitlab-geo) on Geo secondary sites.
+GitLab does not require a special file system and can work with:
-Communication is done via Gitaly's own gRPC API. There are three possible ways of synchronization:
+- NFS.
+- A mounted Storage Appliance (there may be performance limitations when using a remote file system).
+
+Geo triggers garbage collection in Gitaly to [deduplicate forked repositories](../../../development/git_object_deduplication.md#git-object-deduplication-and-gitlab-geo) on Geo secondary sites.
+
+Communication is done via Gitaly's own gRPC API, with three possible ways of synchronization:
- Using regular Git clone/fetch from one Geo site to another (with special authentication).
- Using repository snapshots (for when the first method fails or repository is corrupt).
@@ -90,7 +95,7 @@ They all live in the same shard and share the same base name with a `-wiki` and
for Wiki and Design Repository cases.
Besides that, there are snippet repositories. They can be connected to a project or to some specific user.
-Both types will be synced to a secondary site.
+Both types are synced to a secondary site.
### Blobs
@@ -102,7 +107,7 @@ GitLab stores files and blobs such as Issue attachments or LFS objects into eith
- Hosted by you (like MinIO).
- A Storage Appliance that exposes an Object Storage-compatible API.
-When using the file system store instead of Object Storage, you need to use network mounted file systems
+When using the file system store instead of Object Storage, use network mounted file systems
to run GitLab when using more than one node.
With respect to replication and verification:
@@ -118,17 +123,17 @@ GitLab relies on data stored in multiple databases, for different use-cases.
PostgreSQL is the single point of truth for user-generated content in the Web interface, like issues content, comments
as well as permissions and credentials.
-PostgreSQL can also hold some level of cached data like HTML rendered Markdown, cached merge-requests diff (this can
-also be configured to be offloaded to object storage).
+PostgreSQL can also hold some cached data, like HTML-rendered Markdown and cached merge request diffs.
+Merge request diffs can also be configured to be offloaded to object storage.
We use PostgreSQL's own replication functionality to replicate data from the **primary** to **secondary** sites.
We use Redis both as a cache store and to hold persistent data for our background jobs system. Because both
use-cases have data that are exclusive to the same Geo site, we don't replicate it between sites.
-Elasticsearch is an optional database, that can enable advanced searching capabilities, like improved Advanced Search
-in both source-code level and user generated content in Issues / Merge-Requests and discussions. Currently it's not
-supported in Geo.
+Elasticsearch is an optional database that enables advanced searching capabilities. It can improve search
+in both source code and user-generated content in issues, merge requests, and discussions.
+Elasticsearch is not supported in Geo.
## Limitations on replication/verification
@@ -142,7 +147,6 @@ these epics/issues:
- [Geo: Improve the self-service Geo replication framework](https://gitlab.com/groups/gitlab-org/-/epics/3761)
- [Geo: Move existing blobs to framework](https://gitlab.com/groups/gitlab-org/-/epics/3588)
- [Geo: Add unreplicated data types](https://gitlab.com/groups/gitlab-org/-/epics/893)
-- [Geo: Support GitLab Pages](https://gitlab.com/groups/gitlab-org/-/epics/589)
### Replicated data types behind a feature flag
@@ -174,35 +178,35 @@ Feature.enable(:geo_package_file_replication)
WARNING:
Features not on this list, or with **No** in the **Replicated** column,
are not replicated to a **secondary** site. Failing over without manually
-replicating data from those features will cause the data to be **lost**.
-If you wish to use those features on a **secondary** site, or to execute a failover
+replicating data from those features causes the data to be **lost**.
+To use those features on a **secondary** site, or to execute a failover
successfully, you must replicate their data using some other means.
-|Feature | Replicated (added in GitLab version) | Verified (added in GitLab version) | Object Storage replication (see [Geo with Object Storage](object_storage.md)) | Notes |
-|:--------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------|:------------------------------------------------------------------------|:------------------------------------------------------------------------------|:------|
-|[Application data in PostgreSQL](../../postgresql/index.md) | **Yes** (10.2) | **Yes** (10.2) | No | |
-|[Project repository](../../../user/project/repository/) | **Yes** (10.2) | **Yes** (10.7) | No | |
-|[Project wiki repository](../../../user/project/wiki/) | **Yes** (10.2) | **Yes** (10.7) | No | |
-|[Group wiki repository](../../../user/project/wiki/group.md) | [**Yes** (13.10)](https://gitlab.com/gitlab-org/gitlab/-/issues/208147) | No | No | Behind feature flag `geo_group_wiki_repository_replication`, enabled by default. |
-|[Uploads](../../uploads.md) | **Yes** (10.2) | [No](https://gitlab.com/groups/gitlab-org/-/epics/1817) | No | Verified only on transfer or manually using [Integrity Check Rake Task](../../raketasks/check.md) on both sites and comparing the output between them. |
-|[LFS objects](../../lfs/index.md) | **Yes** (10.2) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/8922) | Via Object Storage provider if supported. Native Geo support (Beta). | Verified only on transfer or manually using [Integrity Check Rake Task](../../raketasks/check.md) on both sites and comparing the output between them. GitLab versions 11.11.x and 12.0.x are affected by [a bug that prevents any new LFS objects from replicating](https://gitlab.com/gitlab-org/gitlab/-/issues/32696).<br /><br />Behind feature flag `geo_lfs_object_replication`, enabled by default. |
-|[Personal snippets](../../../user/snippets.md) | **Yes** (10.2) | **Yes** (10.2) | No | |
-|[Project snippets](../../../user/snippets.md) | **Yes** (10.2) | **Yes** (10.2) | No | |
-|[CI job artifacts](../../../ci/pipelines/job_artifacts.md) | **Yes** (10.4) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/8923) | Via Object Storage provider if supported. Native Geo support (Beta). | Verified only manually using [Integrity Check Rake Task](../../raketasks/check.md) on both sites and comparing the output between them. Job logs also verified on transfer. |
-|[CI Pipeline Artifacts](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/ci/pipeline_artifact.rb) | [**Yes** (13.11)](https://gitlab.com/gitlab-org/gitlab/-/issues/238464) | [**Yes** (13.11)](https://gitlab.com/gitlab-org/gitlab/-/issues/238464) | Via Object Storage provider if supported. Native Geo support (Beta). | Persists additional artifacts after a pipeline completes |
-|[Container Registry](../../packages/container_registry.md) | **Yes** (12.3) | No | No | Disabled by default. See [instructions](docker_registry.md) to enable. |
-|[Content in object storage (beta)](object_storage.md) | **Yes** (12.4) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/13845) | No | |
-|[Infrastructure Registry](../../../user/packages/infrastructure_registry/index.md) | **Yes** (14.0) | [**Yes**](#limitation-of-verification-for-files-in-object-storage) (14.0) | Via Object Storage provider if supported. Native Geo support (Beta). | Behind feature flag `geo_package_file_replication`, enabled by default. |
-|[Project designs repository](../../../user/project/issues/design_management.md) | **Yes** (12.7) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/32467) | No | Designs also require replication of LFS objects and Uploads. |
-|[Package Registry](../../../user/packages/package_registry/index.md) | **Yes** (13.2) | [**Yes**](#limitation-of-verification-for-files-in-object-storage) (13.10) | Via Object Storage provider if supported. Native Geo support (Beta). | Behind feature flag `geo_package_file_replication`, enabled by default. |
-|[Versioned Terraform State](../../terraform_state.md) | **Yes** (13.5) | [**Yes**](#limitation-of-verification-for-files-in-object-storage) (13.12) | Via Object Storage provider if supported. Native Geo support (Beta). | Replication is behind the feature flag `geo_terraform_state_version_replication`, enabled by default. Verification was behind the feature flag `geo_terraform_state_version_verification`, which was removed in 14.0|
-|[External merge request diffs](../../merge_request_diffs.md) | **Yes** (13.5) | No | Via Object Storage provider if supported. Native Geo support (Beta). | Replication is behind the feature flag `geo_merge_request_diff_replication`, enabled by default. Verification is under development, behind the feature flag `geo_merge_request_diff_verification`, introduced in 14.0.|
-|[Versioned snippets](../../../user/snippets.md#versioned-snippets) | [**Yes** (13.7)](https://gitlab.com/groups/gitlab-org/-/epics/2809) | [**Yes** (14.2)](https://gitlab.com/groups/gitlab-org/-/epics/2810) | No | Verification was implemented behind the feature flag `geo_snippet_repository_verification` in 13.11, and the feature flag was removed in 14.2. |
-|[GitLab Pages](../../pages/index.md) | [**Yes** (14.3)](https://gitlab.com/groups/gitlab-org/-/epics/589) | No | Via Object Storage provider if supported. Native Geo support (Beta). | Behind feature flag `geo_pages_deployment_replication`, enabled by default. |
-|[Server-side Git hooks](../../server_hooks.md) | [Not planned](https://gitlab.com/groups/gitlab-org/-/epics/1867) | No | No | Not planned because of current implementation complexity, low customer interest, and availability of alternatives to hooks. |
-|[Elasticsearch integration](../../../integration/elasticsearch.md) | [Not planned](https://gitlab.com/gitlab-org/gitlab/-/issues/1186) | No | No | Not planned because further product discovery is required and Elasticsearch (ES) clusters can be rebuilt. Secondaries currently use the same ES cluster as the primary. |
-|[Dependency proxy images](../../../user/packages/dependency_proxy/index.md) | [Not planned](https://gitlab.com/gitlab-org/gitlab/-/issues/259694) | No | No | Blocked by [Geo: Secondary Mimicry](https://gitlab.com/groups/gitlab-org/-/epics/1528). Replication of this cache is not needed for disaster recovery purposes because it can be recreated from external sources. |
-|[Vulnerability Export](../../../user/application_security/vulnerability_report/#export-vulnerability-details) | [Not planned](https://gitlab.com/groups/gitlab-org/-/epics/3111) | No | No | Not planned because they are ephemeral and sensitive information. They can be regenerated on demand. |
+|Feature | Replicated (added in GitLab version) | Verified (added in GitLab version) | Object Storage replication (see [Geo with Object Storage](object_storage.md)) | Notes |
+|:--------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------|:---------------------------------------------------------------------------|:------------------------------------------------------------------------------|:------|
+|[Application data in PostgreSQL](../../postgresql/index.md) | **Yes** (10.2) | **Yes** (10.2) | No | |
+|[Project repository](../../../user/project/repository/) | **Yes** (10.2) | **Yes** (10.7) | No | |
+|[Project wiki repository](../../../user/project/wiki/) | **Yes** (10.2) | **Yes** (10.7) | No | |
+|[Group wiki repository](../../../user/project/wiki/group.md) | [**Yes** (13.10)](https://gitlab.com/gitlab-org/gitlab/-/issues/208147) | No | No | Behind feature flag `geo_group_wiki_repository_replication`, enabled by default. |
+|[Uploads](../../uploads.md) | **Yes** (10.2) | **Yes** (14.6) | Via Object Storage provider if supported. Native Geo support (Beta). | Replication is behind the feature flag `geo_upload_replication`, enabled by default. Verification is behind the feature flag `geo_upload_verification` introduced and enabled by default in 14.6. |
+|[LFS objects](../../lfs/index.md) | **Yes** (10.2) | **Yes** (14.6) | Via Object Storage provider if supported. Native Geo support (Beta). | GitLab versions 11.11.x and 12.0.x are affected by [a bug that prevents any new LFS objects from replicating](https://gitlab.com/gitlab-org/gitlab/-/issues/32696).<br /><br />Replication is behind the feature flag `geo_lfs_object_replication`, enabled by default. Verification is behind the feature flag `geo_lfs_object_verification` introduced and enabled by default in 14.6. |
+|[Personal snippets](../../../user/snippets.md) | **Yes** (10.2) | **Yes** (10.2) | No | |
+|[Project snippets](../../../user/snippets.md) | **Yes** (10.2) | **Yes** (10.2) | No | |
+|[CI job artifacts](../../../ci/pipelines/job_artifacts.md) | **Yes** (10.4) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/8923) | Via Object Storage provider if supported. Native Geo support (Beta). | Verified only manually using [Integrity Check Rake Task](../../raketasks/check.md) on both sites and comparing the output between them. Job logs also verified on transfer. |
+|[CI Pipeline Artifacts](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/ci/pipeline_artifact.rb) | [**Yes** (13.11)](https://gitlab.com/gitlab-org/gitlab/-/issues/238464) | [**Yes** (13.11)](https://gitlab.com/gitlab-org/gitlab/-/issues/238464) | Via Object Storage provider if supported. Native Geo support (Beta). | Persists additional artifacts after a pipeline completes. |
+|[Container Registry](../../packages/container_registry.md) | **Yes** (12.3) | No | No | Disabled by default. See [instructions](docker_registry.md) to enable. |
+|[Content in object storage (beta)](object_storage.md) | **Yes** (12.4) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/13845) | No | |
+|[Infrastructure Registry](../../../user/packages/infrastructure_registry/index.md) | **Yes** (14.0) | [**Yes**](#limitation-of-verification-for-files-in-object-storage) (14.0) | Via Object Storage provider if supported. Native Geo support (Beta). | Behind feature flag `geo_package_file_replication`, enabled by default. |
+|[Project designs repository](../../../user/project/issues/design_management.md) | **Yes** (12.7) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/32467) | No | Designs also require replication of LFS objects and Uploads. |
+|[Package Registry](../../../user/packages/package_registry/index.md) | **Yes** (13.2) | [**Yes**](#limitation-of-verification-for-files-in-object-storage) (13.10) | Via Object Storage provider if supported. Native Geo support (Beta). | Behind feature flag `geo_package_file_replication`, enabled by default. |
+|[Versioned Terraform State](../../terraform_state.md) | **Yes** (13.5) | [**Yes**](#limitation-of-verification-for-files-in-object-storage) (13.12) | Via Object Storage provider if supported. Native Geo support (Beta). | Replication is behind the feature flag `geo_terraform_state_version_replication`, enabled by default. Verification was behind the feature flag `geo_terraform_state_version_verification`, which was removed in 14.0. |
+|[External merge request diffs](../../merge_request_diffs.md) | **Yes** (13.5) | **Yes** (14.6) | Via Object Storage provider if supported. Native Geo support (Beta). | Replication is behind the feature flag `geo_merge_request_diff_replication`, enabled by default. Verification is behind the feature flag `geo_merge_request_diff_verification`, enabled by default in 14.6. |
+|[Versioned snippets](../../../user/snippets.md#versioned-snippets) | [**Yes** (13.7)](https://gitlab.com/groups/gitlab-org/-/epics/2809) | [**Yes** (14.2)](https://gitlab.com/groups/gitlab-org/-/epics/2810) | No | Verification was implemented behind the feature flag `geo_snippet_repository_verification` in 13.11, and the feature flag was removed in 14.2. |
+|[GitLab Pages](../../pages/index.md) | [**Yes** (14.3)](https://gitlab.com/groups/gitlab-org/-/epics/589) | [**Yes**](#limitation-of-verification-for-files-in-object-storage) (14.6) | Via Object Storage provider if supported. Native Geo support (Beta). | Behind feature flag `geo_pages_deployment_replication`, enabled by default. Verification is behind the feature flag `geo_pages_deployment_verification`, enabled by default in 14.6. |
+|[Server-side Git hooks](../../server_hooks.md) | [Not planned](https://gitlab.com/groups/gitlab-org/-/epics/1867) | No | No | Not planned because of current implementation complexity, low customer interest, and availability of alternatives to hooks. |
+|[Elasticsearch integration](../../../integration/elasticsearch.md) | [Not planned](https://gitlab.com/gitlab-org/gitlab/-/issues/1186) | No | No | Not planned because further product discovery is required and Elasticsearch (ES) clusters can be rebuilt. Secondaries use the same ES cluster as the primary. |
+|[Dependency proxy images](../../../user/packages/dependency_proxy/index.md) | [Not planned](https://gitlab.com/gitlab-org/gitlab/-/issues/259694) | No | No | Blocked by [Geo: Secondary Mimicry](https://gitlab.com/groups/gitlab-org/-/epics/1528). Replication of this cache is not needed for disaster recovery purposes because it can be recreated from external sources. |
+|[Vulnerability Export](../../../user/application_security/vulnerability_report/#export-vulnerability-details) | [Not planned](https://gitlab.com/groups/gitlab-org/-/epics/3111) | No | No | Not planned because they are ephemeral and sensitive information. They can be regenerated on demand. |
#### Limitation of verification for files in Object Storage
diff --git a/doc/administration/geo/replication/disable_geo.md b/doc/administration/geo/replication/disable_geo.md
index 02a65f0b8e1..0fa469e57cd 100644
--- a/doc/administration/geo/replication/disable_geo.md
+++ b/doc/administration/geo/replication/disable_geo.md
@@ -20,7 +20,7 @@ To disable Geo, follow these steps:
1. [Remove the primary site from the UI](#remove-the-primary-site-from-the-ui).
1. [Remove secondary replication slots](#remove-secondary-replication-slots).
1. [Remove Geo-related configuration](#remove-geo-related-configuration).
-1. [(Optional) Revert PostgreSQL settings to use a password and listen on an IP](#optional-revert-postgresql-settings-to-use-a-password-and-listen-on-an-ip).
+1. [Optional. Revert PostgreSQL settings to use a password and listen on an IP](#optional-revert-postgresql-settings-to-use-a-password-and-listen-on-an-ip).
## Remove all secondary Geo sites
diff --git a/doc/administration/geo/replication/faq.md b/doc/administration/geo/replication/faq.md
index 70a6e506c28..e613a9b5670 100644
--- a/doc/administration/geo/replication/faq.md
+++ b/doc/administration/geo/replication/faq.md
@@ -50,6 +50,8 @@ attachments and avatars, and the whole database. This means user accounts,
issues, merge requests, groups, project data, and so on, will be available for
query.
+For more details, see the [supported Geo data types](datatypes.md).
+
## Can I `git push` to a **secondary** site?
Yes! Pushing directly to a **secondary** site (for both HTTP and SSH, including Git LFS) was [introduced](https://about.gitlab.com/releases/2018/09/22/gitlab-11-3-released/) in GitLab 11.3.
diff --git a/doc/administration/geo/replication/troubleshooting.md b/doc/administration/geo/replication/troubleshooting.md
index 432d042608c..673d8388af1 100644
--- a/doc/administration/geo/replication/troubleshooting.md
+++ b/doc/administration/geo/replication/troubleshooting.md
@@ -559,7 +559,7 @@ to start again from scratch, there are a few steps that can help you:
To save disk space, you may want to remove `/var/opt/gitlab/git-data/repositories.old` in the future,
as soon as you have confirmed that you don't need it anymore.
-1. _(Optional)_ Rename other data folders and create new ones
+1. Optional. Rename other data folders and create new ones
WARNING:
You may still have files on the **secondary** node that have been removed from the **primary** node, but this
diff --git a/doc/administration/geo/replication/updating_the_geo_nodes.md b/doc/administration/geo/replication/updating_the_geo_nodes.md
deleted file mode 100644
index f07c8d547a4..00000000000
--- a/doc/administration/geo/replication/updating_the_geo_nodes.md
+++ /dev/null
@@ -1,9 +0,0 @@
----
-redirect_to: 'updating_the_geo_sites.md'
-remove_date: '2021-11-23'
----
-
-This file was moved to [another location](updating_the_geo_sites.md).
-
-<!-- This redirect file can be deleted after <2021-11-23>. -->
-<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
diff --git a/doc/administration/geo/replication/version_specific_updates.md b/doc/administration/geo/replication/version_specific_updates.md
index 1b22a5f0991..883e335ff94 100644
--- a/doc/administration/geo/replication/version_specific_updates.md
+++ b/doc/administration/geo/replication/version_specific_updates.md
@@ -11,6 +11,10 @@ Review this page for update instructions for your version. These steps
accompany the [general steps](updating_the_geo_sites.md#general-update-steps)
for updating Geo nodes.
+## Updating to 14.4
+
+There is [an issue in GitLab 14.4.0 through 14.4.2](../../../update/index.md#1440) that can affect Geo and other features that rely on cronjobs. We recommend upgrading to GitLab 14.4.3 or later.
+
## Updating to 14.1, 14.2, 14.3
### Multi-arch images
@@ -50,7 +54,7 @@ If you are running a version prior to 14.1 and are using Geo and multi-arch cont
### Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode
-GitLab 13.9 through GitLab 14.3 are affected by a bug in which enabling [GitLab Maintenance Mode](../../maintenance_mode/index.md) will cause Geo secondary site statuses to appear to stop updating and become unhealthy. For more information, see [Troubleshooting - Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
+GitLab 13.9 through GitLab 14.3 are affected by a bug in which enabling [GitLab Maintenance Mode](../../maintenance_mode/index.md) causes Geo secondary site statuses to appear to stop updating and become unhealthy. For more information, see [Troubleshooting - Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
## Updating to GitLab 14.0/14.1
@@ -64,7 +68,7 @@ If you are running an affected version and need to remove your Primary site, you
### Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode
-GitLab 13.9 through GitLab 14.3 are affected by a bug in which enabling [GitLab Maintenance Mode](../../maintenance_mode/index.md) will cause Geo secondary site statuses to appear to stop updating and become unhealthy. For more information, see [Troubleshooting - Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
+GitLab 13.9 through GitLab 14.3 are affected by a bug in which enabling [GitLab Maintenance Mode](../../maintenance_mode/index.md) causes Geo secondary site statuses to appear to stop updating and become unhealthy. For more information, see [Troubleshooting - Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
## Updating to GitLab 13.12
@@ -90,7 +94,7 @@ Geo::LfsObjectRegistry.where(state: 0, success: true).update_all(state: 2)
### Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode
-GitLab 13.9 through GitLab 14.3 are affected by a bug in which enabling [GitLab Maintenance Mode](../../maintenance_mode/index.md) will cause Geo secondary site statuses to appear to stop updating and become unhealthy. For more information, see [Troubleshooting - Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
+GitLab 13.9 through GitLab 14.3 are affected by a bug in which enabling [GitLab Maintenance Mode](../../maintenance_mode/index.md) causes Geo secondary site statuses to appear to stop updating and become unhealthy. For more information, see [Troubleshooting - Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
## Updating to GitLab 13.11
@@ -98,20 +102,20 @@ We found an [issue with Git clone/pull through HTTP(s)](https://gitlab.com/gitla
### Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode
-GitLab 13.9 through GitLab 14.3 are affected by a bug in which enabling [GitLab Maintenance Mode](../../maintenance_mode/index.md) will cause Geo secondary site statuses to appear to stop updating and become unhealthy. For more information, see [Troubleshooting - Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
+GitLab 13.9 through GitLab 14.3 are affected by a bug in which enabling [GitLab Maintenance Mode](../../maintenance_mode/index.md) causes Geo secondary site statuses to appear to stop updating and become unhealthy. For more information, see [Troubleshooting - Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
## Updating to GitLab 13.10
### Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode
-GitLab 13.9 through GitLab 14.3 are affected by a bug in which enabling [GitLab Maintenance Mode](../../maintenance_mode/index.md) will cause Geo secondary site statuses to appear to stop updating and become unhealthy. For more information, see [Troubleshooting - Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
+GitLab 13.9 through GitLab 14.3 are affected by a bug in which enabling [GitLab Maintenance Mode](../../maintenance_mode/index.md) causes Geo secondary site statuses to appear to stop updating and become unhealthy. For more information, see [Troubleshooting - Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
## Updating to GitLab 13.9
### Error during zero-downtime update: "cannot drop column asset_proxy_whitelist"
We've detected an issue [with a column rename](https://gitlab.com/gitlab-org/gitlab/-/issues/324160)
-that will prevent upgrades to GitLab 13.9.0, 13.9.1, 13.9.2 and 13.9.3 when following the zero-downtime steps. It is necessary
+that prevents upgrades to GitLab 13.9.0, 13.9.1, 13.9.2 and 13.9.3 when following the zero-downtime steps. It is necessary
to perform the following additional steps for the zero-downtime update:
1. Before running the final `sudo gitlab-rake db:migrate` command on the deploy node,
@@ -132,7 +136,7 @@ to perform the following additional steps for the zero-downtime update:
```
If you have already run the final `sudo gitlab-rake db:migrate` command on the deploy node and have
-encountered the [column rename issue](https://gitlab.com/gitlab-org/gitlab/-/issues/324160), you will
+encountered the [column rename issue](https://gitlab.com/gitlab-org/gitlab/-/issues/324160), you might
see the following error:
```shell
@@ -148,7 +152,7 @@ More details are available [in this issue](https://gitlab.com/gitlab-org/gitlab/
### Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode
-GitLab 13.9 through GitLab 14.3 are affected by a bug in which enabling [GitLab Maintenance Mode](../../maintenance_mode/index.md) will cause Geo secondary site statuses to appear to stop updating and become unhealthy. For more information, see [Troubleshooting - Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
+GitLab 13.9 through GitLab 14.3 are affected by a bug in which enabling [GitLab Maintenance Mode](../../maintenance_mode/index.md) causes Geo secondary site statuses to appear to stop updating and become unhealthy. For more information, see [Troubleshooting - Geo Admin Area shows 'Unhealthy' after enabling Maintenance Mode](troubleshooting.md#geo-admin-area-shows-unhealthy-after-enabling-maintenance-mode).
## Updating to GitLab 13.7
@@ -168,7 +172,7 @@ on Geo secondaries. This issue is fixed in GitLab 13.6.1 and later.
In GitLab 13.3, Geo removed the PostgreSQL [Foreign Data Wrapper](https://www.postgresql.org/docs/11/postgres-fdw.html)
dependency for the tracking database.
-The FDW server, user, and the extension will be removed during the upgrade
+The FDW server, user, and the extension are removed during the upgrade
process on each secondary node. The GitLab settings related to the FDW in the
`/etc/gitlab/gitlab.rb` have been deprecated and can be safely removed.
diff --git a/doc/administration/geo/secondary_proxy/index.md b/doc/administration/geo/secondary_proxy/index.md
index 2b8c0d1e6fa..ebd71757e91 100644
--- a/doc/administration/geo/secondary_proxy/index.md
+++ b/doc/administration/geo/secondary_proxy/index.md
@@ -7,11 +7,14 @@ type: howto
# Geo proxying for secondary sites **(PREMIUM SELF)**
-> [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/5914) in GitLab 14.4 [with a flag](../../feature_flags.md) named `geo_secondary_proxy`. Disabled by default.
+> - [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/5914) in GitLab 14.4 [with a flag](../../feature_flags.md) named `geo_secondary_proxy`. Disabled by default.
+> - [Enabled by default for unified URLs](https://gitlab.com/gitlab-org/gitlab/-/issues/325732) in GitLab 14.6.
+> - [Disabled by default for different URLs](https://gitlab.com/gitlab-org/gitlab/-/issues/325732) in GitLab 14.6 [with a flag](../../feature_flags.md) named `geo_secondary_proxy_separate_urls`.
FLAG:
-On self-managed GitLab, by default this feature is not available. See below to [Set up a unified URL for Geo sites](#set-up-a-unified-url-for-geo-sites).
-The feature is not ready for production use.
+On self-managed GitLab, this feature is only available by default for Geo sites using a unified URL. See below to
+[set up a unified URL for Geo sites](#set-up-a-unified-url-for-geo-sites).
+The feature is not ready for production use with separate URLs.
Use Geo proxying to:
@@ -66,7 +69,11 @@ a single URL used by all Geo sites, including the primary.
is using the secondary proxying and set the `URL` field to the single URL.
Make sure the primary site is also using this URL.
-### Enable secondary proxying
+In Kubernetes, you can use the same domain under `global.hosts.domain` as for the primary site.
+
+## Disable Geo proxying
+
+You can disable the secondary proxying on each Geo site, separately, by following these steps with Omnibus-based packages:
1. SSH into each application node (serving user traffic directly) on your secondary Geo site
and add the following environment variable:
@@ -77,7 +84,7 @@ a single URL used by all Geo sites, including the primary.
```ruby
gitlab_workhorse['env'] = {
- "GEO_SECONDARY_PROXY" => "1"
+ "GEO_SECONDARY_PROXY" => "0"
}
```
@@ -87,11 +94,15 @@ a single URL used by all Geo sites, including the primary.
gitlab-ctl reconfigure
```
-1. SSH into one node running Rails on your primary Geo site and enable the Geo secondary proxy feature flag:
+In Kubernetes, you can use `--set gitlab.webservice.extraEnv.GEO_SECONDARY_PROXY="0"`,
+or specify the following in your values file:
- ```shell
- sudo gitlab-rails runner "Feature.enable(:geo_secondary_proxy)"
- ```
+```yaml
+gitlab:
+ webservice:
+ extraEnv:
+ GEO_SECONDARY_PROXY: "0"
+```
## Enable Geo proxying with Separate URLs
@@ -99,6 +110,36 @@ The ability to use proxying with separate URLs is still in development. You can
["Geo secondary proxying with separate URLs" epic](https://gitlab.com/groups/gitlab-org/-/epics/6865)
for progress.
+To try out this feature, enable the `geo_secondary_proxy_separate_urls` feature flag.
+SSH into one node running Rails on your primary Geo site and run:
+
+```shell
+sudo gitlab-rails runner "Feature.enable(:geo_secondary_proxy_separate_urls)"
+```
+
+In Kubernetes, you can run the same command in the toolbox pod. Refer to the
+[Kubernetes cheat sheet](../../troubleshooting/kubernetes_cheat_sheet.md#gitlab-specific-kubernetes-information)
+for details.
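
To confirm that the flag took effect, a check along the following lines could be run on the same Rails node. This is a minimal sketch, not part of the original change: the function name `geo_separate_urls_flag_state` is hypothetical, and it assumes that `Feature.enabled?` reports the persisted state of the flag on your GitLab version.

```shell
# Hypothetical helper: prints "true" when the feature flag is enabled
# and "false" otherwise. Assumes an Omnibus node where `gitlab-rails`
# is available and that `Feature.enabled?` reflects the persisted state.
geo_separate_urls_flag_state() {
  sudo gitlab-rails runner "puts Feature.enabled?(:geo_secondary_proxy_separate_urls)"
}
```

After defining the function, call `geo_separate_urls_flag_state` on a node running Rails on the primary Geo site. Wrapping the command in a function keeps it reusable in upgrade or verification scripts.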
+
+## Limitations
+
+- When secondary proxying is used, the asynchronous Geo replication can cause unexpected issues for accelerated
+ data types that may be replicated to the Geo secondaries with a delay.
+
+ For example, we found a potential issue where
+ [replication lag introduces read-after-write inconsistencies](https://gitlab.com/gitlab-org/gitlab/-/issues/345267).
+ If the replication lag is high enough, this can result in Git reads receiving stale data when hitting a secondary.
+
+- Non-Rails requests are not proxied, so other services may need to use a separate, non-unified URL to ensure requests
+ are always sent to the primary. These services include:
+
+ - GitLab Container Registry - [can be configured to use a separate domain](../../packages/container_registry.md#configure-container-registry-under-its-own-domain).
+ - GitLab Pages - should always use a separate domain, as part of [the prerequisites for running GitLab Pages](../../pages/index.md#prerequisites).
+
+- With a unified URL, Let's Encrypt can't generate certificates unless it can reach both IPs through the same domain.
+ To use TLS certificates with Let's Encrypt, you can manually point the domain to one of the Geo sites, generate
+ the certificate, then copy it to all other sites.
+
## Features accelerated by secondary Geo sites
Most HTTP traffic sent to a secondary Geo site can be proxied to the primary Geo site. With this architecture,
diff --git a/doc/administration/geo/setup/index.md b/doc/administration/geo/setup/index.md
index 84dff69ebe7..7d365f73101 100644
--- a/doc/administration/geo/setup/index.md
+++ b/doc/administration/geo/setup/index.md
@@ -26,6 +26,7 @@ If you installed GitLab using the Omnibus packages (highly recommended):
1. [Configure GitLab](../replication/configuration.md) to set the **primary** and **secondary** site(s).
1. Optional: [Configure a secondary LDAP server](../../auth/ldap/index.md) for the **secondary** site(s). See [notes on LDAP](../index.md#ldap).
1. Follow the [Using a Geo Site](../replication/usage.md) guide.
+1. [Configure Geo secondary proxying](../secondary_proxy/index.md) to use a single, unified URL for all Geo sites. This step is recommended to accelerate most read requests while transparently proxying writes to the primary Geo site.
## Post-installation documentation
diff --git a/doc/administration/gitaly/configure_gitaly.md b/doc/administration/gitaly/configure_gitaly.md
index d0841f4e607..b31a02aae0a 100644
--- a/doc/administration/gitaly/configure_gitaly.md
+++ b/doc/administration/gitaly/configure_gitaly.md
@@ -350,6 +350,10 @@ leading to `Error creating pipeline` and `Commit not found` errors, or stale dat
As the final step, you must update Gitaly clients to switch from using local Gitaly service to use
the Gitaly servers you just configured.
+NOTE:
+GitLab requires a `default` repository storage to be configured.
+[Read more about this limitation](#gitlab-requires-a-default-repository-storage).
+
This can be risky because anything that prevents your Gitaly clients from reaching the Gitaly
servers causes all Gitaly requests to fail. For example, any sort of network, firewall, or name
resolution problems.
@@ -489,6 +493,18 @@ gitaly['key_path'] = "/etc/gitlab/ssl/key.pem"
`path` can be included only for storage shards on the local Gitaly server.
If it's excluded, default Git storage directory is used for that storage shard.
+### GitLab requires a default repository storage
+
+When adding Gitaly servers to an environment, you might want to replace the original `default` Gitaly service. However, you can't
+reconfigure the GitLab application servers to remove the `default` entry from `git_data_dirs` because GitLab requires a
+`git_data_dirs` entry called `default`. [Read more](https://gitlab.com/gitlab-org/gitlab/-/issues/36175) about this limitation.
+
+To work around the limitation:
+
+1. Define an additional storage location on the new Gitaly service and configure the additional storage to be `default`.
+1. In the [Admin Area](../repository_storage_paths.md#configure-where-new-repositories-are-stored), set `default` to a weight of zero
+ to prevent repositories being stored there.
+
### Disable Gitaly where not required (optional)
If you run Gitaly [as a remote service](#run-gitaly-on-its-own-server), consider
@@ -605,7 +621,7 @@ To configure Gitaly with TLS:
1. Save the file and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
1. Verify Gitaly traffic is being served over TLS by
[observing the types of Gitaly connections](#observe-type-of-gitaly-connections).
-1. (Optional) Improve security by:
+1. Optional. Improve security by:
1. Disabling non-TLS connections by commenting out or deleting `gitaly['listen_addr']` in
`/etc/gitlab/gitlab.rb`.
1. Saving the file.
@@ -681,7 +697,7 @@ To configure Gitaly with TLS:
1. Save the file and [restart GitLab](../restart_gitlab.md#installations-from-source).
1. Verify Gitaly traffic is being served over TLS by
[observing the types of Gitaly connections](#observe-type-of-gitaly-connections).
-1. (Optional) Improve security by:
+1. Optional. Improve security by:
1. Disabling non-TLS connections by commenting out or deleting `listen_addr` in
`/home/git/gitaly/config.toml`.
1. Saving the file.
diff --git a/doc/administration/gitaly/index.md b/doc/administration/gitaly/index.md
index c689530e12c..f99bbf21840 100644
--- a/doc/administration/gitaly/index.md
+++ b/doc/administration/gitaly/index.md
@@ -189,8 +189,7 @@ The availability objectives for Gitaly clusters assuming a single node failure a
Writes are replicated asynchronously. Any writes that have not been replicated
to the newly promoted primary are lost.
- [Strong consistency](#strong-consistency) can be used to avoid loss in some
- circumstances.
+ [Strong consistency](#strong-consistency) prevents loss in some circumstances.
- **Recovery Time Objective (RTO):** Less than 10 seconds.
Outages are detected by a health check run by each Praefect node every
@@ -284,8 +283,7 @@ Gitaly Cluster provides the following features:
- [Replication factor](#replication-factor) of repositories for increased redundancy.
- [Automatic failover](praefect.md#automatic-failover-and-primary-election-strategies) from the
primary Gitaly node to secondary Gitaly nodes.
-- Reporting of possible [data loss](praefect.md#check-for-data-loss) if replication queue is
- non-empty.
+- Reporting of possible [data loss](recovery.md#check-for-data-loss) if the replication queue isn't empty.
Follow the [Gitaly Cluster epic](https://gitlab.com/groups/gitlab-org/-/epics/1489) for improvements
including [horizontally distributing reads](https://gitlab.com/groups/gitlab-org/-/epics/2013).
@@ -323,18 +321,26 @@ You can [monitor distribution of reads](#monitor-gitaly-cluster) using Prometheu
> - In GitLab 13.3, disabled unless primary-wins voting strategy is disabled.
> - From GitLab 13.4, enabled by default.
> - From GitLab 13.5, you must use Git v2.28.0 or higher on Gitaly nodes to enable strong consistency.
-> - From GitLab 13.6, primary-wins voting strategy and `gitaly_reference_transactions_primary_wins` feature flag were removed from the source code.
+> - From GitLab 13.6, primary-wins voting strategy and the `gitaly_reference_transactions_primary_wins` feature flag were removed.
+> - From GitLab 14.0, [Gitaly Cluster only supports strong consistency](https://gitlab.com/gitlab-org/gitaly/-/merge_requests/3575), and the `gitaly_reference_transactions` feature flag was removed.
-By default, Gitaly Cluster guarantees eventual consistency by replicating all writes to secondary
-Gitaly nodes after the write to the primary Gitaly node has happened.
+Gitaly Cluster provides strong consistency by writing changes synchronously to all healthy, up-to-date replicas. If a
+replica is outdated or unhealthy at the time of the transaction, the write is asynchronously replicated to it.
-Praefect can instead provide strong consistency by creating a transaction and writing changes to all
-Gitaly nodes at once.
+If strong consistency is unavailable, Gitaly Cluster guarantees eventual consistency. In this case, Gitaly Cluster
+replicates all writes to secondary Gitaly nodes after the write to the primary Gitaly node has occurred.
-If enabled, transactions are only available for a subset of RPCs. For more information, see the
-[strong consistency epic](https://gitlab.com/groups/gitlab-org/-/epics/1189).
+Strong consistency:
-For configuration information, see [Configure strong consistency](praefect.md#configure-strong-consistency).
+- Is the primary replication method in GitLab 14.0 and later. A subset of operations still use replication jobs
+ (eventual consistency) instead of strong consistency. Refer to the
+ [strong consistency epic](https://gitlab.com/groups/gitlab-org/-/epics/1189) for more information.
+- Must be configured in GitLab versions 13.1 to 13.12. For configuration information, refer to either:
+ - Documentation on your GitLab instance at `/help`.
+ - The [13.12 documentation](https://docs.gitlab.com/13.12/ee/administration/gitaly/praefect.html#strong-consistency).
+- Is unavailable in GitLab 13.0 and earlier.
+
+For more information on monitoring strong consistency, see the Gitaly Cluster [Prometheus metrics documentation](#monitor-gitaly-cluster).
#### Replication factor
@@ -368,6 +374,10 @@ WARNING:
Some [known database inconsistency issues](#known-issues) exist in Gitaly Cluster. We recommend you
remain on your current service for now.
+NOTE:
+GitLab requires a `default` repository storage to be configured.
+[Read more about this limitation](configure_gitaly.md#gitlab-requires-a-default-repository-storage).
+
### Migrate off Gitaly Cluster
If you have repositories stored on a Gitaly Cluster, but you'd like to migrate
@@ -513,6 +523,10 @@ To monitor [strong consistency](#strong-consistency), you can use the following
You can also monitor the [Praefect logs](../logs.md#praefect-logs).
+## Recover from failure
+
+Gitaly Cluster can [recover from certain types of failure](recovery.md).
+
## Do not bypass Gitaly
GitLab doesn't advise directly accessing Gitaly repositories stored on disk with a Git client,
diff --git a/doc/administration/gitaly/praefect.md b/doc/administration/gitaly/praefect.md
index da456131a52..d3a8662080f 100644
--- a/doc/administration/gitaly/praefect.md
+++ b/doc/administration/gitaly/praefect.md
@@ -215,6 +215,38 @@ The database used by Praefect is now configured.
If you see Praefect database errors after configuring PostgreSQL, see
[troubleshooting steps](troubleshooting.md#relation-does-not-exist-errors).
+#### Reads distribution caching
+
+Praefect performance can be improved by additionally configuring the `database_direct`
+settings:
+
+```ruby
+praefect['database_direct_host'] = POSTGRESQL_HOST
+praefect['database_direct_port'] = 5432
+
+# Use the following to override parameters of direct database connection.
+# Comment out where the parameters are the same for both connections.
+
+praefect['database_direct_user'] = 'praefect'
+praefect['database_direct_password'] = PRAEFECT_SQL_PASSWORD
+praefect['database_direct_dbname'] = 'praefect_production'
+#praefect['database_direct_sslmode'] = '...'
+#praefect['database_direct_sslcert'] = '...'
+#praefect['database_direct_sslkey'] = '...'
+#praefect['database_direct_sslrootcert'] = '...'
+```
+
+Once configured, this connection is automatically used for the
+[SQL LISTEN](https://www.postgresql.org/docs/11/sql-listen.html) feature and
+allows Praefect to receive notifications from PostgreSQL for cache invalidation.
+
+Verify this feature is working by looking for the following log entry in the Praefect
+log:
+
+```plaintext
+reads distribution caching is enabled by configuration
+```
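
The log check described above could be scripted as follows. This is a minimal sketch under stated assumptions: the log path `/var/log/gitlab/praefect/current` is the usual Omnibus location but is an assumption here, and the function name `reads_caching_enabled` and the `PRAEFECT_LOG` variable are hypothetical.

```shell
# Assumed Omnibus log location for Praefect; override PRAEFECT_LOG if
# your logs are stored elsewhere.
PRAEFECT_LOG="${PRAEFECT_LOG:-/var/log/gitlab/praefect/current}"

# Succeeds (exit status 0) when the given log file contains the entry
# that Praefect writes when reads distribution caching is enabled.
# -q suppresses output, -s suppresses errors for unreadable files.
reads_caching_enabled() {
  grep -qs "reads distribution caching is enabled by configuration" "$1"
}

if reads_caching_enabled "$PRAEFECT_LOG"; then
  echo "reads distribution caching: enabled"
else
  echo "reads distribution caching: entry not found in $PRAEFECT_LOG"
fi
```

Reading the log file might require root privileges. The entry is written when Praefect starts, so it might no longer be in the current log file if the logs have rotated since the last restart.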
+
#### Use PgBouncer
To reduce PostgreSQL resource consumption, we recommend setting up and configuring
@@ -223,7 +255,7 @@ this, you must point Praefect to PgBouncer by setting Praefect database paramete
```ruby
praefect['database_host'] = PGBOUNCER_HOST
-praefect['database_port'] = 6432
+praefect['database_port'] = 5432
praefect['database_user'] = 'praefect'
praefect['database_password'] = PRAEFECT_SQL_PASSWORD
praefect['database_dbname'] = 'praefect_production'
@@ -1073,31 +1105,6 @@ To get started quickly:
Congratulations! You've configured an observable fault-tolerant Praefect
cluster.
-## Configure strong consistency
-
-To enable [strong consistency](index.md#strong-consistency):
-
-- In GitLab 13.5, you must use Git v2.28.0 or higher on Gitaly nodes to enable strong consistency.
-- In GitLab 13.4 and later, the strong consistency voting strategy has been improved and enabled by default.
- Instead of requiring all nodes to agree, only the primary and half of the secondaries need to agree.
-- In GitLab 13.3, reference transactions are enabled by default with a primary-wins strategy.
- This strategy causes all transactions to succeed for the primary and thus does not ensure strong consistency.
- To enable strong consistency, disable the `:gitaly_reference_transactions_primary_wins` feature flag.
-- In GitLab 13.2, enable the `:gitaly_reference_transactions` feature flag.
-- In GitLab 13.1, enable the `:gitaly_reference_transactions` and `:gitaly_hooks_rpc`
- feature flags.
-
-Changing feature flags requires [access to the Rails console](../feature_flags.md#start-the-gitlab-rails-console).
-In the Rails console, enable or disable the flags as required. For example:
-
-```ruby
-Feature.enable(:gitaly_reference_transactions)
-Feature.disable(:gitaly_reference_transactions_primary_wins)
-```
-
-For information on monitoring strong consistency, see the
-[relevant documentation](index.md#monitor-gitaly-cluster).
-
## Configure replication factor
WARNING:
@@ -1153,8 +1160,7 @@ Praefect regularly checks the health of each Gitaly node. This is used to automa
to a newly-elected primary Gitaly node if the current primary node is found to be unhealthy.
We recommend using [repository-specific primary nodes](#repository-specific-primary-nodes). This is
-[planned to be the only available election strategy](https://gitlab.com/gitlab-org/gitaly/-/issues/3574)
-from GitLab 14.0.
+[the only available election strategy](https://gitlab.com/gitlab-org/gitaly/-/issues/3574) from GitLab 14.0.
### Repository-specific primary nodes
@@ -1268,7 +1274,7 @@ To migrate existing clusters:
### Deprecated election strategies
WARNING:
-The below election strategies are deprecated and are scheduled for removal in GitLab 14.0.
+The below election strategies are deprecated and were removed in GitLab 14.0.
Migrate to [repository-specific primary nodes](#repository-specific-primary-nodes).
- **PostgreSQL:** Enabled by default until GitLab 14.0, and equivalent to:
@@ -1287,397 +1293,3 @@ Migrate to [repository-specific primary nodes](#repository-specific-primary-node
If a sufficient number of health checks fail for the current primary Gitaly node, a new primary is
elected. **Do not use with multiple Praefect nodes!** Using with multiple Praefect nodes is
likely to result in a split brain.
-
-## Primary Node Failure
-
-Gitaly Cluster recovers from a failing primary Gitaly node by promoting a healthy secondary as the
-new primary.
-
-In GitLab 14.1 and later, Gitaly Cluster:
-
-- Elects a healthy secondary with a fully up to date copy of the repository as the new primary.
-- Repository becomes unavailable if there are no fully up to date copies of it on healthy secondaries.
-
-To minimize data loss in GitLab 13.0 to 14.0, Gitaly Cluster:
-
-- Switches repositories that are outdated on the new primary to [read-only mode](#read-only-mode).
-- Elects the secondary with the least unreplicated writes from the primary to be the new
- primary. Because there can still be some unreplicated writes,
- [data loss can occur](#check-for-data-loss).
-
-### Read-only mode
-
-> - Introduced in GitLab 13.0 as [generally available](https://about.gitlab.com/handbook/product/gitlab-the-product/#generally-available-ga).
-> - Between GitLab 13.0 and GitLab 13.2, read-only mode applied to the whole virtual storage and occurred whenever failover occurred.
-> - [In GitLab 13.3 and later](https://gitlab.com/gitlab-org/gitaly/-/issues/2862), read-only mode applies on a per-repository basis and only occurs if a new primary is out of date.
-new primary. If the failed primary contained unreplicated writes, [data loss can occur](#check-for-data-loss).
-> - Removed in GitLab 14.1. Instead, repositories [become unavailable](#unavailable-repositories).
-
-When Gitaly Cluster switches to a new primary in GitLab 13.0 to 14.0, repositories enter
-read-only mode if they are out of date. This can happen after failing over to an outdated
-secondary. Read-only mode eases data recovery efforts by preventing writes that may conflict
-with the unreplicated writes on other nodes.
-
-To enable writes again in GitLab 13.0 to 14.0, an administrator can:
-
-1. [Check](#check-for-data-loss) for data loss.
-1. Attempt to [recover](#data-recovery) missing data.
-1. Either [enable writes](#enable-writes-or-accept-data-loss) in the virtual storage or
- [accept data loss](#enable-writes-or-accept-data-loss) if necessary, depending on the version of
- GitLab.
-
-## Unavailable repositories
-
-> - From GitLab 13.0 through 14.0, repositories became read-only if they were outdated on the primary but fully up to date on a healthy secondary. `dataloss` sub-command displays read-only repositories by default through these versions.
-> - Since GitLab 14.1, Praefect contains more responsive failover logic which immediately fails over to one of the fully up to date secondaries rather than placing the repository in read-only mode. Since GitLab 14.1, the `dataloss` sub-command displays repositories which are unavailable due to having no fully up to date copies on healthy Gitaly nodes.
-
-A repository is unavailable if all of its up to date replicas are unavailable. Unavailable repositories are
-not accessible through Praefect to prevent serving stale data that may break automated tooling.
-
-### Check for data loss
-
-The Praefect `dataloss` subcommand identifies:
-
-- Copies of repositories in GitLab 13.0 to GitLab 14.0 that at are likely to be outdated.
- This can help identify potential data loss after a failover.
-- Repositories in GitLab 14.1 and later that are unavailable. This helps identify potential
- data loss and repositories which are no longer accessible because all of their up-to-date
- replicas copies are unavailable.
-
-The following parameters are available:
-
-- `-virtual-storage` that specifies which virtual storage to check. Because they might require
- an administrator to intervene, the default behavior is to display:
- - In GitLab 13.0 to 14.0, copies of read-only repositories.
- - In GitLab 14.1 and later, unavailable repositories.
-- In GitLab 14.1 and later, [`-partially-unavailable`](#unavailable-replicas-of-available-repositories)
- that specifies whether to include in the output repositories that are available but have
- some assigned copies that are not available.
-
-NOTE:
-`dataloss` is still in beta and the output format is subject to change.
-
-To check for repositories with outdated primaries or for unavailable repositories, run:
-
-```shell
-sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dataloss [-virtual-storage <virtual-storage>]
-```
-
-Every configured virtual storage is checked if none is specified:
-
-```shell
-sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dataloss
-```
-
-Repositories are listed in the output that have either:
-
-- An outdated copy of the repository on the primary, in GitLab 13.0 to GitLab 14.0.
-- No healthy and fully up-to-date copies available, in GitLab 14.1 and later.
-
-The following information is printed for each repository:
-
-- A repository's relative path to the storage directory identifies each repository and groups the related
- information.
-- The repository's current status is printed in parentheses next to the disk path:
- - In GitLab 13.0 to 14.0, either `(read-only)` if the repository's primary node is outdated
- and can't accept writes. Otherwise, `(writable)`.
- - In GitLab 14.1 and later, `(unavailable)` is printed next to the disk path if the
- repository is unavailable.
-- The primary field lists the repository's current primary. If the repository has no primary, the field shows
- `No Primary`.
-- The In-Sync Storages lists replicas which have replicated the latest successful write and all writes
- preceding it.
-- The Outdated Storages lists replicas which contain an outdated copy of the repository. Replicas which have no copy
- of the repository but should contain it are also listed here. The maximum number of changes the replica is missing
- is listed next to replica. It's important to notice that the outdated replicas may be fully up to date or contain
- later changes but Praefect can't guarantee it.
-
-Additional information includes:
-
-- Whether a node is assigned to host the repository is listed with each node's status.
- `assigned host` is printed next to nodes that are assigned to store the repository. The
- text is omitted if the node contains a copy of the repository but is not assigned to store
- the repository. Such copies aren't kept in sync by Praefect, but may act as replication
- sources to bring assigned copies up to date.
-- In GitLab 14.1 and later, `unhealthy` is printed next to the copies that are located
- on unhealthy Gitaly nodes.
-
-Example output:
-
-```shell
-Virtual storage: default
- Outdated repositories:
- @hashed/3f/db/3fdba35f04dc8c462986c992bcf875546257113072a909c162f7e470e581e278.git (unavailable):
- Primary: gitaly-1
- In-Sync Storages:
- gitaly-2, assigned host, unhealthy
- Outdated Storages:
- gitaly-1 is behind by 3 changes or less, assigned host
- gitaly-3 is behind by 3 changes or less
-```
-
-A confirmation is printed out when every repository is available. For example:
-
-```shell
-Virtual storage: default
- All repositories are available!
-```
-
-#### Unavailable replicas of available repositories
-
-NOTE:
-In GitLab 14.0 and earlier, the flag is `-partially-replicated` and the output shows any repositories with assigned nodes with outdated
-copies.
-
-To also list information of repositories which are available but are unavailable from some of the assigned nodes,
-use the `-partially-unavailable` flag.
-
-A repository is available if there is a healthy, up to date replica available. Some of the assigned secondary
-replicas may be temporarily unavailable for access while they are waiting to replicate the latest changes.
-
-```shell
-sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dataloss [-virtual-storage <virtual-storage>] [-partially-unavailable]
-```
-
-Example output:
-
-```shell
-Virtual storage: default
- Outdated repositories:
- @hashed/3f/db/3fdba35f04dc8c462986c992bcf875546257113072a909c162f7e470e581e278.git:
- Primary: gitaly-1
- In-Sync Storages:
- gitaly-1, assigned host
- Outdated Storages:
- gitaly-2 is behind by 3 changes or less, assigned host
- gitaly-3 is behind by 3 changes or less
-```
-
-With the `-partially-unavailable` flag set, a confirmation is printed out if every assigned replica is fully up to
-date and healthy.
-
-For example:
-
-```shell
-Virtual storage: default
- All repositories are fully available on all assigned storages!
-```
-
-### Check repository checksums
-
-To check a project's repository checksums across on all Gitaly nodes, run the
-[replicas Rake task](../raketasks/praefect.md#replica-checksums) on the main GitLab node.
-
-### Accept data loss
-
-WARNING:
-`accept-dataloss` causes permanent data loss by overwriting other versions of the repository. Data
-[recovery efforts](#data-recovery) must be performed before using it.
-
-If it is not possible to bring one of the up to date replicas back online, you may have to accept data
-loss. When accepting data loss, Praefect marks the chosen replica of the repository as the latest version
-and replicates it to the other assigned Gitaly nodes. This process overwrites any other version of the
-repository so care must be taken.
-
-```shell
-sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml accept-dataloss
--virtual-storage <virtual-storage> -repository <relative-path> -authoritative-storage <storage-name>
-```
-
-### Enable writes or accept data loss
-
-WARNING:
-`accept-dataloss` causes permanent data loss by overwriting other versions of the repository.
-Data [recovery efforts](#data-recovery) must be performed before using it.
-
-Praefect provides the following subcommands to re-enable writes or accept data loss:
-
-- In GitLab 13.2 and earlier, `enable-writes` to re-enable virtual storage for writes after
- data recovery attempts:
-
- ```shell
- sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml enable-writes -virtual-storage <virtual-storage>
- ```
-
-- In GitLab 13.3 and later, if it is not possible to bring one of the up to date nodes back
- online, you may have to accept data loss:
-
- ```shell
- sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml accept-dataloss -virtual-storage <virtual-storage> -repository <relative-path> -authoritative-storage <storage-name>
- ```
-
- When accepting data loss, Praefect:
-
- 1. Marks the chosen copy of the repository as the latest version.
- 1. Replicates the copy to the other assigned Gitaly nodes.
-
- This process overwrites any other copy of the repository so care must be taken.
-
-## Data recovery
-
-If a Gitaly node fails replication jobs for any reason, it ends up hosting outdated versions of the
-affected repositories. Praefect provides tools for:
-
-- [Automatic](#automatic-reconciliation) reconciliation, for GitLab 13.4 and later.
-- [Manual](#manual-reconciliation) reconciliation, for:
- - GitLab 13.3 and earlier.
- - Repositories upgraded to GitLab 13.4 and later without entries in the `repositories` table. In
- GitLab 13.6 and later, [a migration is run](https://gitlab.com/gitlab-org/gitaly/-/issues/3033)
- when Praefect starts for these repositories.
-
-These tools reconcile the outdated repositories to bring them fully up to date again.
-
-### Automatic reconciliation
-
-> [Introduced](https://gitlab.com/gitlab-org/gitaly/-/issues/2717) in GitLab 13.4.
-
-Praefect automatically reconciles repositories that are not up to date. By default, this is done every
-five minutes. For each outdated repository on a healthy Gitaly node, the Praefect picks a
-random, fully up-to-date replica of the repository on another healthy Gitaly node to replicate from. A
-replication job is scheduled only if there are no other replication jobs pending for the target
-repository.
-
-The reconciliation frequency can be changed via the configuration. The value can be any valid
-[Go duration value](https://golang.org/pkg/time/#ParseDuration). Values below 0 disable the feature.
-
-Examples:
-
-```ruby
-praefect['reconciliation_scheduling_interval'] = '5m' # the default value
-```
-
-```ruby
-praefect['reconciliation_scheduling_interval'] = '30s' # reconcile every 30 seconds
-```
-
-```ruby
-praefect['reconciliation_scheduling_interval'] = '0' # disable the feature
-```
-
-### Manual reconciliation
-
-WARNING:
-The `reconcile` sub-command was removed in GitLab 14.1. Use [automatic reconciliation](#automatic-reconciliation) instead. Manual reconciliation may produce excess replication jobs and is limited in functionality. Manual reconciliation does not work when [repository-specific primary nodes](#repository-specific-primary-nodes) are
-enabled.
-
-The Praefect `reconcile` sub-command allows for the manual reconciliation between two Gitaly nodes. The
-command replicates every repository on a later version on the reference storage to the target storage.
-
-```shell
-sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml reconcile -virtual <virtual-storage> -reference <up-to-date-storage> -target <outdated-storage> -f
-```
-
-- Replace the placeholder `<virtual-storage>` with the virtual storage containing the Gitaly node storage to be checked.
-- Replace the placeholder `<up-to-date-storage>` with the Gitaly storage name containing up to date repositories.
-- Replace the placeholder `<outdated-storage>` with the Gitaly storage name containing outdated repositories.
-
-### Manually remove repositories
-
-> [Introduced](https://gitlab.com/gitlab-org/gitaly/-/merge_requests/3767) in GitLab 14.3.
-
-The `remove-repository` Praefect sub-command removes repositories from a Gitaly Cluster. It removes
-all state associated with a given repository including:
-
-- On-disk repositories on all relevant Gitaly nodes.
-- Any database state tracked by Praefect.
-
-```shell
-sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml remove-repository -virtual-storage <virtual-storage> -repository <repository>
-```
-
-- `-virtual-storage` is the virtual storage the repository is located in. Virtual storages are configured in `/etc/gitlab/gitlab.rb` under `praefect['virtual_storages]` and looks like the following:
-
- ```ruby
- praefect['virtual_storages'] = {
- 'default' => {
- ...
- },
- 'storage-1' => {
- ...
- }
- }
- ```
-
- In this example, the virtual storage to specify is `default` or `storage-1`.
-
-- `-repository` is the repository's relative path in the storage [beginning with `@hashed`](../repository_storage_types.md#hashed-storage).
- For example:
-
- ```plaintext
- @hashed/f5/ca/f5ca38f748a1d6eaf726b8a42fb575c3c71f1864a8143301782de13da2d9202b.git
- ```
-
-Parts of the repository can continue to exist after running `remove-repository`. This can be because of:
-
-- A deletion error.
-- An in-flight RPC call targeting the repository.
-
-If this occurs, run `remove-repository` again.
-
-### Manually list untracked repositories
-
-> [Introduced](https://gitlab.com/gitlab-org/gitaly/-/merge_requests/3926) in GitLab 14.4.
-
-The `list-untracked-repositories` Praefect sub-command lists repositories of the Gitaly Cluster that both:
-
-- Exist for at least one Gitaly storage.
-- Aren't tracked in the Praefect database.
-
-The command outputs:
-
-- Result to `STDOUT` and the command's logs.
-- Errors to `STDERR`.
-
-Each entry is a complete JSON string with a newline at the end (configurable using the
-`-delimiter` flag). For example:
-
-```plaintext
-sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml list-untracked-repositories
-{"virtual_storage":"default","storage":"gitaly-1","relative_path":"@hashed/ab/cd/abcd123456789012345678901234567890123456789012345678901234567890.git"}
-{"virtual_storage":"default","storage":"gitaly-1","relative_path":"@hashed/ab/cd/abcd123456789012345678901234567890123456789012345678901234567891.git"}
-```
-
-### Manually track repositories
-
-> [Introduced](https://gitlab.com/gitlab-org/omnibus-gitlab/-/merge_requests/5658) in GitLab 14.4.
-
-The `track-repository` Praefect sub-command adds repositories on disk to the Praefect database to be tracked.
-
-```shell
-sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml track-repository -virtual-storage <virtual-storage> -repository <repository>
-```
-
-- `-virtual-storage` is the virtual storage the repository is located in. Virtual storages are configured in `/etc/gitlab/gitlab.rb` under `praefect['virtual_storages]` and looks like the following:
-
- ```ruby
- praefect['virtual_storages'] = {
- 'default' => {
- ...
- },
- 'storage-1' => {
- ...
- }
- }
- ```
-
- In this example, the virtual storage to specify is `default` or `storage-1`.
-
-- `-repository` is the repository's relative path in the storage [beginning with `@hashed`](../repository_storage_types.md#hashed-storage).
- For example:
-
- ```plaintext
- @hashed/f5/ca/f5ca38f748a1d6eaf726b8a42fb575c3c71f1864a8143301782de13da2d9202b.git
- ```
-
-- `-authoritative-storage` is the storage we want Praefect to treat as the primary. Required if
- [per-repository replication](#configure-replication-factor) is set as the replication strategy.
-
-The command outputs:
-
-- Results to `STDOUT` and the command's logs.
-- Errors to `STDERR`.
-
-This command fails if:
-
-- The repository is already being tracked by the Praefect database.
-- The repository does not exist on disk.
diff --git a/doc/administration/gitaly/recovery.md b/doc/administration/gitaly/recovery.md
new file mode 100644
index 00000000000..e1b9a73908d
--- /dev/null
+++ b/doc/administration/gitaly/recovery.md
@@ -0,0 +1,418 @@
+---
+stage: Create
+group: Gitaly
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+type: reference
+---
+
+# Recovery options
+
+Gitaly Cluster can [recover from certain types of failure](recovery.md).
+
+## Primary node failure
+
+Gitaly Cluster recovers from a failing primary Gitaly node by promoting a healthy secondary as the
+new primary.
+
+In GitLab 14.1 and later, Gitaly Cluster:
+
+- Elects a healthy secondary with a fully up to date copy of the repository as the new primary.
+- Makes the repository unavailable if there are no fully up to date copies of it on healthy secondaries.
+
+To minimize data loss in GitLab 13.0 to 14.0, Gitaly Cluster:
+
+- Switches repositories that are outdated on the new primary to [read-only mode](#read-only-mode).
+- Elects the secondary with the least unreplicated writes from the primary to be the new
+ primary. Because there can still be some unreplicated writes,
+ [data loss can occur](#check-for-data-loss).
+
+### Read-only mode
+
+> - Introduced in GitLab 13.0 as [generally available](https://about.gitlab.com/handbook/product/gitlab-the-product/#generally-available-ga).
+> - Between GitLab 13.0 and GitLab 13.2, read-only mode applied to the whole virtual storage and occurred whenever failover occurred.
+> - [In GitLab 13.3 and later](https://gitlab.com/gitlab-org/gitaly/-/issues/2862), read-only mode applies on a per-repository basis and only occurs if a new primary is out of date. If the failed primary contained unreplicated writes, [data loss can occur](#check-for-data-loss).
+> - Removed in GitLab 14.1. Instead, repositories [become unavailable](#unavailable-repositories).
+
+When Gitaly Cluster switches to a new primary in GitLab 13.0 to 14.0, repositories enter
+read-only mode if they are out of date. This can happen after failing over to an outdated
+secondary. Read-only mode eases data recovery efforts by preventing writes that may conflict
+with the unreplicated writes on other nodes.
+
+To enable writes again in GitLab 13.0 to 14.0, an administrator can:
+
+1. [Check](#check-for-data-loss) for data loss.
+1. Attempt to [recover](#data-recovery) missing data.
+1. Either [enable writes](#enable-writes-or-accept-data-loss) in the virtual storage or
+ [accept data loss](#enable-writes-or-accept-data-loss) if necessary, depending on the version of
+ GitLab.
+
+## Unavailable repositories
+
+> - From GitLab 13.0 through 14.0, repositories became read-only if they were outdated on the primary but fully up to date on a healthy secondary. The `dataloss` sub-command displays read-only repositories by default in these versions.
+> - Since GitLab 14.1, Praefect contains more responsive failover logic which immediately fails over to one of the fully up to date secondaries rather than placing the repository in read-only mode. Since GitLab 14.1, the `dataloss` sub-command displays repositories which are unavailable due to having no fully up to date copies on healthy Gitaly nodes.
+
+A repository is unavailable if all of its up to date replicas are unavailable. Unavailable repositories are
+not accessible through Praefect to prevent serving stale data that may break automated tooling.
+
+### Check for data loss
+
+The Praefect `dataloss` subcommand identifies:
+
+- Copies of repositories in GitLab 13.0 to GitLab 14.0 that are likely to be outdated.
+ This can help identify potential data loss after a failover.
+- Repositories in GitLab 14.1 and later that are unavailable. This helps identify potential
+ data loss and repositories which are no longer accessible because all of their up-to-date
+ replicas are unavailable.
+
+The following parameters are available:
+
+- `-virtual-storage` that specifies which virtual storage to check. Because they might require
+ an administrator to intervene, the default behavior is to display:
+ - In GitLab 13.0 to 14.0, copies of read-only repositories.
+ - In GitLab 14.1 and later, unavailable repositories.
+- In GitLab 14.1 and later, [`-partially-unavailable`](#unavailable-replicas-of-available-repositories)
+ that specifies whether to include in the output repositories that are available but have
+ some assigned copies that are not available.
+
+NOTE:
+`dataloss` is still in beta and the output format is subject to change.
+
+To check for repositories with outdated primaries or for unavailable repositories, run:
+
+```shell
+sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dataloss [-virtual-storage <virtual-storage>]
+```
+
+Every configured virtual storage is checked if none is specified:
+
+```shell
+sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dataloss
+```
+
+The output lists repositories that have either:
+
+- An outdated copy of the repository on the primary, in GitLab 13.0 to GitLab 14.0.
+- No healthy and fully up-to-date copies available, in GitLab 14.1 and later.
+
+The following information is printed for each repository:
+
+- A repository's relative path to the storage directory identifies each repository and groups the related
+ information.
+- The repository's current status is printed in parentheses next to the disk path:
+ - In GitLab 13.0 to 14.0, `(read-only)` if the repository's primary node is outdated
+ and can't accept writes, or `(writable)` otherwise.
+ - In GitLab 14.1 and later, `(unavailable)` is printed next to the disk path if the
+ repository is unavailable.
+- The primary field lists the repository's current primary. If the repository has no primary, the field shows
+ `No Primary`.
+- The In-Sync Storages lists replicas which have replicated the latest successful write and all writes
+ preceding it.
+- The Outdated Storages lists replicas which contain an outdated copy of the repository. Replicas which have no copy
+ of the repository but should contain it are also listed here. The maximum number of changes the replica is missing
+ is listed next to the replica. It's important to notice that the outdated replicas may be fully up to date or contain
+ later changes but Praefect can't guarantee it.
+
+Additional information includes:
+
+- Whether a node is assigned to host the repository is listed with each node's status.
+ `assigned host` is printed next to nodes that are assigned to store the repository. The
+ text is omitted if the node contains a copy of the repository but is not assigned to store
+ the repository. Such copies aren't kept in sync by Praefect, but may act as replication
+ sources to bring assigned copies up to date.
+- In GitLab 14.1 and later, `unhealthy` is printed next to the copies that are located
+ on unhealthy Gitaly nodes.
+
+Example output:
+
+```shell
+Virtual storage: default
+ Outdated repositories:
+ @hashed/3f/db/3fdba35f04dc8c462986c992bcf875546257113072a909c162f7e470e581e278.git (unavailable):
+ Primary: gitaly-1
+ In-Sync Storages:
+ gitaly-2, assigned host, unhealthy
+ Outdated Storages:
+ gitaly-1 is behind by 3 changes or less, assigned host
+ gitaly-3 is behind by 3 changes or less
+```
+
+A confirmation is printed out when every repository is available. For example:
+
+```shell
+Virtual storage: default
+ All repositories are available!
+```
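
Because the `dataloss` output format is still subject to change, any parsing of it is fragile. As a minimal sketch, assuming the output has been captured in a shell variable and still matches the example above, the unavailable repositories could be listed as follows. The variable and function names are illustrative and are not part of Praefect.

```shell
# Sample text copied from the example output above. In practice it would be
# captured from the dataloss sub-command.
dataloss_output='Virtual storage: default
  Outdated repositories:
    @hashed/3f/db/3fdba35f04dc8c462986c992bcf875546257113072a909c162f7e470e581e278.git (unavailable):
      Primary: gitaly-1
      In-Sync Storages:
        gitaly-2, assigned host, unhealthy
      Outdated Storages:
        gitaly-1 is behind by 3 changes or less, assigned host
        gitaly-3 is behind by 3 changes or less'

# Hypothetical helper: print the relative path of every repository marked
# "(unavailable)". It relies on the beta output format shown above.
list_unavailable() {
  sed -n 's/^ *\(@[^ ]*\) (unavailable):$/\1/p'
}

unavailable=$(printf '%s\n' "$dataloss_output" | list_unavailable)
unavailable_count=$(printf '%s\n' "$dataloss_output" | grep -c '(unavailable):$')

printf '%s\n' "$unavailable"
echo "unavailable repositories: $unavailable_count"
```

If the output format changes in a later release, the `sed` and `grep` patterns would need to be adjusted to match.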
+
+#### Unavailable replicas of available repositories
+
+NOTE:
+In GitLab 14.0 and earlier, the flag is `-partially-replicated` and the output shows any repositories with assigned nodes with outdated
+copies.
+
+To also list information about repositories which are available but are unavailable from some of the assigned nodes,
+use the `-partially-unavailable` flag.
+
+A repository is available if there is a healthy, up to date replica available. Some of the assigned secondary
+replicas may be temporarily unavailable for access while they are waiting to replicate the latest changes.
+
+```shell
+sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dataloss [-virtual-storage <virtual-storage>] [-partially-unavailable]
+```
+
+Example output:
+
+```shell
+Virtual storage: default
+ Outdated repositories:
+ @hashed/3f/db/3fdba35f04dc8c462986c992bcf875546257113072a909c162f7e470e581e278.git:
+ Primary: gitaly-1
+ In-Sync Storages:
+ gitaly-1, assigned host
+ Outdated Storages:
+ gitaly-2 is behind by 3 changes or less, assigned host
+ gitaly-3 is behind by 3 changes or less
+```
+
+With the `-partially-unavailable` flag set, a confirmation is printed out if every assigned replica is fully up to
+date and healthy.
+
+For example:
+
+```shell
+Virtual storage: default
+ All repositories are fully available on all assigned storages!
+```
+
+### Check repository checksums
+
+To check a project's repository checksums across all Gitaly nodes, run the
+[replicas Rake task](../raketasks/praefect.md#replica-checksums) on the main GitLab node.
+
+### Accept data loss
+
+WARNING:
+`accept-dataloss` causes permanent data loss by overwriting other versions of the repository. Data
+[recovery efforts](#data-recovery) must be performed before using it.
+
+If it is not possible to bring one of the up to date replicas back online, you may have to accept data
+loss. When accepting data loss, Praefect marks the chosen replica of the repository as the latest version
+and replicates it to the other assigned Gitaly nodes. This process overwrites any other version of the
+repository so care must be taken.
+
+```shell
+sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml accept-dataloss \
+  -virtual-storage <virtual-storage> -repository <relative-path> -authoritative-storage <storage-name>
+```
+
+### Enable writes or accept data loss
+
+WARNING:
+`accept-dataloss` causes permanent data loss by overwriting other versions of the repository.
+Data [recovery efforts](#data-recovery) must be performed before using it.
+
+Praefect provides the following subcommands to re-enable writes or accept data loss:
+
+- In GitLab 13.2 and earlier, `enable-writes` to re-enable virtual storage for writes after
+ data recovery attempts:
+
+ ```shell
+ sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml enable-writes -virtual-storage <virtual-storage>
+ ```
+
+- In GitLab 13.3 and later, if it is not possible to bring one of the up to date nodes back
+ online, you may have to accept data loss:
+
+ ```shell
+ sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml accept-dataloss -virtual-storage <virtual-storage> -repository <relative-path> -authoritative-storage <storage-name>
+ ```
+
+ When accepting data loss, Praefect:
+
+ 1. Marks the chosen copy of the repository as the latest version.
+ 1. Replicates the copy to the other assigned Gitaly nodes.
+
+ This process overwrites any other copy of the repository so care must be taken.
+
+## Data recovery
+
+If a Gitaly node fails replication jobs for any reason, it ends up hosting outdated versions of the
+affected repositories. Praefect provides tools for:
+
+- [Automatic](#automatic-reconciliation) reconciliation, for GitLab 13.4 and later.
+- [Manual](#manual-reconciliation) reconciliation, for:
+ - GitLab 13.3 and earlier.
+ - Repositories upgraded to GitLab 13.4 and later without entries in the `repositories` table. In
+ GitLab 13.6 and later, [a migration is run](https://gitlab.com/gitlab-org/gitaly/-/issues/3033)
+ when Praefect starts for these repositories.
+
+These tools reconcile the outdated repositories to bring them fully up to date again.
+
+### Automatic reconciliation
+
+> [Introduced](https://gitlab.com/gitlab-org/gitaly/-/issues/2717) in GitLab 13.4.
+
+Praefect automatically reconciles repositories that are not up to date. By default, this is done every
+five minutes. For each outdated repository on a healthy Gitaly node, Praefect picks a
+random, fully up-to-date replica of the repository on another healthy Gitaly node to replicate from. A
+replication job is scheduled only if there are no other replication jobs pending for the target
+repository.
+
+The reconciliation frequency can be changed via the configuration. The value can be any valid
+[Go duration value](https://pkg.go.dev/time#ParseDuration). A value of `0` disables the feature.
+
+Examples:
+
+```ruby
+praefect['reconciliation_scheduling_interval'] = '5m' # the default value
+```
+
+```ruby
+praefect['reconciliation_scheduling_interval'] = '30s' # reconcile every 30 seconds
+```
+
+```ruby
+praefect['reconciliation_scheduling_interval'] = '0' # disable the feature
+```
+
+### Manual reconciliation
+
+WARNING:
+The `reconcile` sub-command was removed in GitLab 14.1. Use [automatic reconciliation](#automatic-reconciliation) instead.
+Manual reconciliation may produce excess replication jobs and is limited in functionality. Manual reconciliation does not
+work when [repository-specific primary nodes](praefect.md#repository-specific-primary-nodes) are enabled.
+
+The Praefect `reconcile` sub-command allows for manual reconciliation between two Gitaly nodes. The
+command replicates every repository that is on a later version on the reference storage to the target storage.
+
+```shell
+sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml reconcile -virtual <virtual-storage> -reference <up-to-date-storage> -target <outdated-storage> -f
+```
+
+- Replace the placeholder `<virtual-storage>` with the virtual storage containing the Gitaly node storage to be checked.
+- Replace the placeholder `<up-to-date-storage>` with the Gitaly storage name containing up to date repositories.
+- Replace the placeholder `<outdated-storage>` with the Gitaly storage name containing outdated repositories.
+
+### Manually remove repositories
+
+> - [Introduced](https://gitlab.com/gitlab-org/gitaly/-/merge_requests/3767) in GitLab 14.3.
+> - [Introduced](https://gitlab.com/gitlab-org/gitaly/-/merge_requests/4054) in GitLab 14.6, support for dry-run mode.
+
+The `remove-repository` Praefect sub-command removes a repository from a Gitaly Cluster, and all state associated with a given repository including:
+
+- On-disk repositories on all relevant Gitaly nodes.
+- Any database state tracked by Praefect.
+
+In GitLab 14.6 and later, by default, the command operates in dry-run mode. In earlier versions, the command didn't support dry-run mode. For example:
+
+```shell
+sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml remove-repository -virtual-storage <virtual-storage> -repository <repository>
+```
+
+- Replace `<virtual-storage>` with the name of the virtual storage containing the repository.
+- Replace `<repository>` with the relative path of the repository to remove.
+- In GitLab 14.6 and later, add `-apply` to run the command outside of dry-run mode and remove the repository. For example:
+
+ ```shell
+ sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml remove-repository -virtual-storage <virtual-storage> -repository <repository> -apply
+ ```
+
+- `-virtual-storage` is the virtual storage the repository is located in. Virtual storages are configured in `/etc/gitlab/gitlab.rb` under `praefect['virtual_storages']` and look like the following:
+
+ ```ruby
+ praefect['virtual_storages'] = {
+ 'default' => {
+ ...
+ },
+ 'storage-1' => {
+ ...
+ }
+ }
+ ```
+
+ In this example, the virtual storage to specify is `default` or `storage-1`.
+
+- `-repository` is the repository's relative path in the storage [beginning with `@hashed`](../repository_storage_types.md#hashed-storage).
+ For example:
+
+ ```plaintext
+ @hashed/f5/ca/f5ca38f748a1d6eaf726b8a42fb575c3c71f1864a8143301782de13da2d9202b.git
+ ```
+
+Parts of the repository can continue to exist after running `remove-repository`. This can be because of:
+
+- A deletion error.
+- An in-flight RPC call targeting the repository.
+
+If this occurs, run `remove-repository` again.
+
+### Manually list untracked repositories
+
+> [Introduced](https://gitlab.com/gitlab-org/gitaly/-/merge_requests/3926) in GitLab 14.4.
+
+The `list-untracked-repositories` Praefect sub-command lists repositories of the Gitaly Cluster that both:
+
+- Exist for at least one Gitaly storage.
+- Aren't tracked in the Praefect database.
+
+The command outputs:
+
+- Results to `STDOUT` and the command's logs.
+- Errors to `STDERR`.
+
+Each entry is a complete JSON string with a newline at the end (configurable using the
+`-delimiter` flag). For example:
+
+```plaintext
+sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml list-untracked-repositories
+{"virtual_storage":"default","storage":"gitaly-1","relative_path":"@hashed/ab/cd/abcd123456789012345678901234567890123456789012345678901234567890.git"}
+{"virtual_storage":"default","storage":"gitaly-1","relative_path":"@hashed/ab/cd/abcd123456789012345678901234567890123456789012345678901234567891.git"}
+```
+
+### Manually track repositories
+
+> - [Introduced](https://gitlab.com/gitlab-org/omnibus-gitlab/-/merge_requests/5658) in GitLab 14.4.
+> - [Introduced](https://gitlab.com/gitlab-org/omnibus-gitlab/-/merge_requests/5789) in GitLab 14.6, support for immediate replication.
+
+The `track-repository` Praefect sub-command adds repositories on disk to the Praefect database to be tracked.
+
+```shell
+sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml track-repository -virtual-storage <virtual-storage> -repository <repository> -replicate-immediately
+```
+
+- `-virtual-storage` is the virtual storage the repository is located in. Virtual storages are configured in `/etc/gitlab/gitlab.rb` under `praefect['virtual_storages']` and look like the following:
+
+ ```ruby
+ praefect['virtual_storages'] = {
+ 'default' => {
+ ...
+ },
+ 'storage-1' => {
+ ...
+ }
+ }
+ ```
+
+ In this example, the virtual storage to specify is `default` or `storage-1`.
+
+- `-repository` is the repository's relative path in the storage [beginning with `@hashed`](../repository_storage_types.md#hashed-storage).
+ For example:
+
+ ```plaintext
+ @hashed/f5/ca/f5ca38f748a1d6eaf726b8a42fb575c3c71f1864a8143301782de13da2d9202b.git
+ ```
+
+- `-authoritative-storage` is the storage we want Praefect to treat as the primary. Required if
+ [per-repository replication](praefect.md#configure-replication-factor) is set as the replication strategy.
+- `-replicate-immediately`, available in GitLab 14.6 and later, causes the command to replicate the repository to its secondaries immediately.
+ Otherwise, replication jobs are scheduled for execution in the database and are picked up by a Praefect background process.
+
+The command outputs:
+
+- Results to `STDOUT` and the command's logs.
+- Errors to `STDERR`.
+
+This command fails if:
+
+- The repository is already being tracked by the Praefect database.
+- The repository does not exist on disk.
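
The JSON lines from `list-untracked-repositories` carry the values that `track-repository` needs. As a minimal sketch, assuming flat single-line JSON entries like those in the earlier example, the following prints one `track-repository` command per untracked repository without running it. The helper and variable names are hypothetical. The sketch omits `-authoritative-storage`, which is required when per-repository replication is set as the replication strategy.

```shell
# Sample lines copied from the list-untracked-repositories example above.
untracked='{"virtual_storage":"default","storage":"gitaly-1","relative_path":"@hashed/ab/cd/abcd123456789012345678901234567890123456789012345678901234567890.git"}
{"virtual_storage":"default","storage":"gitaly-1","relative_path":"@hashed/ab/cd/abcd123456789012345678901234567890123456789012345678901234567891.git"}'

praefect_bin=/opt/gitlab/embedded/bin/praefect
praefect_config=/var/opt/gitlab/praefect/config.toml

# Hypothetical helper: read one JSON entry per line and print (not run) a
# matching track-repository command. The sed patterns assume the flat,
# single-line JSON shown above, with no escaped quotes in the values.
print_track_commands() {
  while IFS= read -r entry; do
    virtual_storage=$(printf '%s\n' "$entry" | sed -n 's/.*"virtual_storage":"\([^"]*\)".*/\1/p')
    relative_path=$(printf '%s\n' "$entry" | sed -n 's/.*"relative_path":"\([^"]*\)".*/\1/p')
    # Skip lines that do not contain both fields.
    [ -n "$virtual_storage" ] && [ -n "$relative_path" ] || continue
    printf 'sudo %s -config %s track-repository -virtual-storage %s -repository %s\n' \
      "$praefect_bin" "$praefect_config" "$virtual_storage" "$relative_path"
  done
}

track_commands=$(printf '%s\n' "$untracked" | print_track_commands)
printf '%s\n' "$track_commands"
```

Printing the commands first leaves room to review each one before running it.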
diff --git a/doc/administration/gitaly/troubleshooting.md b/doc/administration/gitaly/troubleshooting.md
index d6d93b8af94..fdd281c1a90 100644
--- a/doc/administration/gitaly/troubleshooting.md
+++ b/doc/administration/gitaly/troubleshooting.md
@@ -153,7 +153,7 @@ Confirm the following are all true:
- When any user adds or modifies a file from the repository using the GitLab
UI, it immediately fails with a red `401 Unauthorized` banner.
-- Creating a new project and [initializing it with a README](../../user/project/working_with_projects.md#blank-projects)
+- Creating a new project and [initializing it with a README](../../user/project/working_with_projects.md#create-a-blank-project)
successfully creates the project but doesn't create the README.
- When [tailing the logs](https://docs.gitlab.com/omnibus/settings/logs.html#tail-logs-in-a-console-on-the-server)
on a Gitaly client and reproducing the error, you get `401` errors
@@ -328,10 +328,94 @@ experiencing [clock drift](https://en.wikipedia.org/wiki/Clock_drift).
Please ensure that the GitLab and Gitaly nodes are synchronized and use an NTP time
server to keep them synchronized if possible.
+### Health check warnings
+
+The following warning in `/var/log/gitlab/praefect/current` can be ignored.
+
+```plaintext
+"error":"full method name not found: /grpc.health.v1.Health/Check",
+"msg":"error when looking up method info"
+```
+
+### File not found errors
+
+The following errors in `/var/log/gitlab/gitaly/current` can be ignored.
+They are caused by the GitLab Rails application checking for specific files
+that do not exist in a repository.
+
+```plaintext
+"error":"not found: .gitlab/route-map.yml"
+"error":"not found: Dockerfile"
+"error":"not found: .gitlab-ci.yml"
+```
+
## Troubleshoot Praefect (Gitaly Cluster)
The following sections provide possible solutions to Gitaly Cluster errors.
+### Check cluster health
+
+> [Introduced](https://gitlab.com/gitlab-org/omnibus-gitlab/-/merge_requests/) in GitLab 14.6.
+
+The `check` Praefect sub-command runs a series of checks to determine the health of the Gitaly Cluster.
+
+```shell
+gitlab-ctl praefect check
+```
+
+The following sections describe the checks that are run.
+
+#### Praefect migrations
+
+Checks if Praefect migrations are up to date, because database migrations must be up to date for Praefect to work correctly.
+
+If this check fails:
+
+1. See the `schema_migrations` table in the database to see which migrations have run.
+1. Run `praefect sql-migrate` to bring the migrations up to date.
+
+#### Node connectivity and disk access
+
+Checks if Praefect can reach all of its Gitaly nodes, and if each Gitaly node has read and write access to all of its storages.
+
+If this check fails:
+
+1. Confirm the network addresses and tokens are set up correctly:
+ - In the Praefect configuration.
+ - In each Gitaly node's configuration.
+1. On the Gitaly nodes, check that the `gitaly` process is being run as `git`. There might be a permissions issue that is preventing Gitaly from
+ accessing its storage directories.
+1. Confirm that there are no issues with the network that connects Praefect to Gitaly nodes.
+
+#### Database read and write access
+
+Checks if Praefect can read from and write to the database.
+
+If this check fails:
+
+1. See if the Praefect database is in recovery mode. In recovery mode, tables may be read only. To check, run:
+
+ ```sql
+ select pg_is_in_recovery()
+ ```
+
+1. Confirm that the user that Praefect uses to connect to PostgreSQL has read and write access to the database.
+1. See if the database has been placed into read-only mode. To check, run:
+
+ ```sql
+ show default_transaction_read_only
+ ```
+
+#### Inaccessible repositories
+
+Checks how many repositories are inaccessible because they are missing a primary assignment, or their primary is unavailable.
+
+If this check fails:
+
+1. See if any Gitaly nodes are down. Run `praefect ping-nodes` to check.
+1. Check if there is a high load on the Praefect database. If the Praefect database is slow to respond, health checks can fail to persist
+ to the database, which leads Praefect to think nodes are unhealthy.
+
### Praefect errors in logs
If you receive an error, check `/var/log/gitlab/gitlab-rails/production.log`.
@@ -353,17 +437,107 @@ Here are common errors and potential causes:
### Determine primary Gitaly node
-To determine the current primary Gitaly node for a specific Praefect node:
+To determine the primary node of a repository:
-- Use the `Shard Primary Election` [Grafana chart](praefect.md#grafana) on the
- [`Gitlab Omnibus - Praefect` dashboard](https://gitlab.com/gitlab-org/grafana-dashboards/-/blob/master/omnibus/praefect.json).
- This is recommended.
-- If you do not have Grafana set up, use the following command on each host of each
- Praefect node:
+- In GitLab 14.6 and later, use the [`praefect metadata`](#view-repository-metadata) subcommand.
+- In GitLab 13.12 to GitLab 14.5 with [repository-specific primaries](praefect.md#repository-specific-primary-nodes),
+ use the [`gitlab:praefect:replicas` Rake task](../raketasks/praefect.md#replica-checksums).
+- With legacy election strategies in GitLab 13.12 and earlier, the primary was the same for all repositories in a virtual storage.
+ To determine the current primary Gitaly node for a specific virtual storage:
- ```shell
- curl localhost:9652/metrics | grep gitaly_praefect_primaries`
- ```
+ - Use the `Shard Primary Election` [Grafana chart](praefect.md#grafana) on the
+ [`Gitlab Omnibus - Praefect` dashboard](https://gitlab.com/gitlab-org/grafana-dashboards/-/blob/master/omnibus/praefect.json).
+ This is recommended.
+ - If you do not have Grafana set up, use the following command on each host of each
+ Praefect node:
+
+ ```shell
+ curl localhost:9652/metrics | grep gitaly_praefect_primaries
+ ```
+
+### View repository metadata
+
+> [Introduced](https://gitlab.com/gitlab-org/gitaly/-/issues/3481) in GitLab 14.6.
+
+Gitaly Cluster maintains a [metadata database](index.md#components) about the repositories stored on the cluster. Use the `praefect metadata` subcommand
+to inspect the metadata for troubleshooting.
+
+You can retrieve a repository's metadata by its Praefect-assigned repository ID:
+
+```shell
+sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml metadata -repository-id <repository-id>
+```
+
+You can also retrieve a repository's metadata by its virtual storage and relative path:
+
+```shell
+sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml metadata -virtual-storage <virtual-storage> -relative-path <relative-path>
+```
+
+#### Examples
+
+To retrieve the metadata for a repository with a Praefect-assigned repository ID of 1:
+
+```shell
+sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml metadata -repository-id 1
+```
+
+To retrieve the metadata for a repository with virtual storage `default` and relative path `@hashed/b1/7e/b17ef6d19c7a5b1ee83b907c595526dcb1eb06db8227d650d5dda0a9f4ce8cd9.git`:
+
+```shell
+sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml metadata -virtual-storage default -relative-path @hashed/b1/7e/b17ef6d19c7a5b1ee83b907c595526dcb1eb06db8227d650d5dda0a9f4ce8cd9.git
+```
+
+Either of these examples retrieves the following metadata for an example repository:
+
+```plaintext
+Repository ID: 54771
+Virtual Storage: "default"
+Relative Path: "@hashed/b1/7e/b17ef6d19c7a5b1ee83b907c595526dcb1eb06db8227d650d5dda0a9f4ce8cd9.git"
+Replica Path: "@hashed/b1/7e/b17ef6d19c7a5b1ee83b907c595526dcb1eb06db8227d650d5dda0a9f4ce8cd9.git"
+Primary: "gitaly-1"
+Generation: 1
+Replicas:
+- Storage: "gitaly-1"
+ Assigned: true
+ Generation: 1, fully up to date
+ Healthy: true
+ Valid Primary: true
+- Storage: "gitaly-2"
+ Assigned: true
+ Generation: 0, behind by 1 changes
+ Healthy: true
+ Valid Primary: false
+- Storage: "gitaly-3"
+ Assigned: true
+ Generation: replica not yet created
+ Healthy: false
+ Valid Primary: false
+```
+
+#### Available metadata
+
+The metadata retrieved by `praefect metadata` includes the fields in the following tables.
+
+| Field | Description |
+|:------------------|:-------------------------------------------------------------------------------------------------------------------|
+| `Repository ID` | Permanent unique ID assigned to the repository by Praefect. Different to the ID GitLab uses for repositories. |
+| `Virtual Storage` | Name of the virtual storage the repository is stored in. |
+| `Relative Path` | Repository's path in the virtual storage. |
+| `Replica Path` | Where on the Gitaly node's disk the repository's replicas are stored. |
+| `Primary` | Current primary of the repository. |
+| `Generation` | Used by Praefect to track repository changes. Each write in the repository increments the repository's generation. |
+| `Replicas` | A list of replicas that exist or are expected to exist. |
+
+For each replica, the following metadata is available:
+
+| `Replicas` Field | Description |
+|:-----------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `Storage` | Name of the Gitaly storage that contains the replica. |
+| `Assigned` | Indicates whether the replica is expected to exist in the storage. Can be `false` if a Gitaly node is removed from the cluster or if the storage contains an extra copy after the repository's replication factor was decreased. |
+| `Generation` | Latest confirmed generation of the replica. It indicates:<br><br>- The replica is fully up to date if the generation matches the repository's generation.<br>- The replica is outdated if the replica's generation is less than the repository's generation.<br>- `replica not yet created` if the replica does not yet exist at all on the storage. |
+| `Healthy` | Indicates whether the Gitaly node that is hosting this replica is considered healthy by the consensus of Praefect nodes. |
+| `Valid Primary` | Indicates whether the replica is fit to serve as the primary node. If the repository's primary is not a valid primary, a failover occurs on the next write to the repository if there is another replica that is a valid primary. A replica is a valid primary if:<br><br>- It is stored on a healthy Gitaly node.<br>- It is fully up to date.<br>- It is not targeted by a pending deletion job from decreasing replication factor.<br>- It is assigned. |
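
The four `Valid Primary` conditions in the table can be sketched as a small predicate. This is an illustration only, not Praefect's implementation: the `Replica` struct, the `pending_deletion` field, and the `valid_primary?` helper are hypothetical names, and the sample values mirror the example output shown earlier.

```ruby
# Hypothetical value object mirroring the per-replica fields printed by
# `praefect metadata`. `generation` is nil when the replica is not yet created.
# `pending_deletion` is an assumed flag for "targeted by a pending deletion job".
Replica = Struct.new(:storage, :assigned, :generation, :healthy, :pending_deletion, keyword_init: true)

# Returns true only when all four documented conditions hold.
def valid_primary?(replica, repository_generation)
  replica.healthy &&
    replica.assigned &&
    !replica.pending_deletion &&
    replica.generation == repository_generation
end

replicas = [
  Replica.new(storage: "gitaly-1", assigned: true, generation: 1, healthy: true, pending_deletion: false),
  Replica.new(storage: "gitaly-2", assigned: true, generation: 0, healthy: true, pending_deletion: false),
  Replica.new(storage: "gitaly-3", assigned: true, generation: nil, healthy: false, pending_deletion: false)
]

# The repository's generation in the example output is 1.
replicas.each do |replica|
  puts "#{replica.storage}: #{valid_primary?(replica, 1)}"
end
# prints:
# gitaly-1: true
# gitaly-2: false
# gitaly-3: false
```

In this sketch, `gitaly-2` fails only because it is one generation behind, while `gitaly-3` fails on both health and generation.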
### Check that repositories are in sync
@@ -371,7 +545,7 @@ Is [some cases](index.md#known-issues) the Praefect database can get out of sync
a given repository is fully synced on all nodes, run the [`gitlab:praefect:replicas` Rake task](../raketasks/praefect.md#replica-checksums)
that checksums the repository on all Gitaly nodes.
-The [Praefect dataloss](praefect.md#check-for-data-loss) command only checks the state of the repo in the Praefect database, and cannot
+The [Praefect dataloss](recovery.md#check-for-data-loss) command only checks the state of the repo in the Praefect database, and cannot
be relied on to detect sync problems in this scenario.
### Relation does not exist errors
@@ -409,3 +583,21 @@ This indicates that the virtual storage name used in the
[`git_data_dirs` setting](praefect.md#gitaly) for GitLab.
Resolve this by matching the virtual storage names used in Praefect and GitLab configuration.
+
+### Gitaly Cluster performance issues on cloud platforms
+
+Praefect does not require a lot of CPU or memory, and can run on small virtual machines.
+Cloud services may place other limits on the resources that small VMs can use, such as
+disk IO and network traffic.
+
+Praefect nodes generate a lot of network traffic. The following symptoms can be observed if their network bandwidth has
+been throttled by the cloud service:
+
+- Poor performance of Git operations.
+- High network latency.
+- High memory use by Praefect.
+
+Possible solutions:
+
+- Provision larger VMs to gain access to larger network traffic allowances.
+- Use your cloud service's monitoring and logging to check that the Praefect nodes are not exhausting their traffic allowances.
diff --git a/doc/administration/img/db_load_balancing_postgres_stats.png b/doc/administration/img/db_load_balancing_postgres_stats.png
deleted file mode 100644
index 8b311616e7b..00000000000
--- a/doc/administration/img/db_load_balancing_postgres_stats.png
+++ /dev/null
Binary files differ
diff --git a/doc/administration/incoming_email.md b/doc/administration/incoming_email.md
index 6b390cfc77a..3f54f5dd576 100644
--- a/doc/administration/incoming_email.md
+++ b/doc/administration/incoming_email.md
@@ -10,7 +10,7 @@ GitLab has several features based on receiving incoming email messages:
- [Reply by Email](reply_by_email.md): allow GitLab users to comment on issues
and merge requests by replying to notification email.
-- [New issue by email](../user/project/issues/managing_issues.md#new-issue-via-email):
+- [New issue by email](../user/project/issues/managing_issues.md#by-sending-an-email):
allow GitLab users to create a new issue by sending an email to a
user-specific email address.
- [New merge request by email](../user/project/merge_requests/creating_merge_requests.md#by-sending-an-email):
@@ -66,6 +66,24 @@ This solution is relatively simple to set up: you just need to create an email
address dedicated to receive your users' replies to GitLab notifications. However,
this method only supports replies, and not the other features of [incoming email](#incoming-email).
+## Accepted headers
+
+Email is processed correctly when a configured email address is present in one of the following headers:
+
+- `To`
+- `Delivered-To`
+- `Envelope-To` or `X-Envelope-To`
+
+In GitLab 14.6 and later, [Service Desk](../user/project/service_desk.md)
+also checks these additional headers.
+
+Usually, the "To" field contains the email address of the primary receiver.
+However, it might not include the configured GitLab email address if:
+
+- The address is in the "CC" field.
+- The address was included when using "Reply all".
+- The email was forwarded.
+
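The header check described in this section can be sketched in plain Ruby. This is a simplified illustration under stated assumptions, not GitLab's mail-processing code: `addressed_to_gitlab?` is a hypothetical helper, `headers` is assumed to be a Hash of header names to values, and matching is a plain case-insensitive substring test.

```ruby
# Headers that this section lists as accepted.
ACCEPTED_HEADERS = ["To", "Delivered-To", "Envelope-To", "X-Envelope-To"].freeze

# `headers` is a hypothetical Hash of header name => String or Array of Strings.
# Returns true when the configured address appears in any accepted header.
def addressed_to_gitlab?(headers, configured_address)
  wanted = configured_address.downcase
  ACCEPTED_HEADERS.any? do |name|
    # Array(nil) is [], so a missing header simply does not match.
    Array(headers[name]).any? { |value| value.downcase.include?(wanted) }
  end
end

# A forwarded message: "To" names someone else, but "Delivered-To" carries
# the configured address.
forwarded = {
  "To" => "teammate@example.com",
  "Delivered-To" => "incoming@gitlab.example.com"
}
puts addressed_to_gitlab?(forwarded, "incoming@gitlab.example.com") # prints "true"
```

A message that carries the address only in a header outside this list, such as `Cc`, returns `false` in this sketch.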
## Set it up
If you want to use Gmail / Google Apps for incoming email, make sure you have
diff --git a/doc/administration/index.md b/doc/administration/index.md
index 53a3c305aab..d78c9d80b5f 100644
--- a/doc/administration/index.md
+++ b/doc/administration/index.md
@@ -31,8 +31,6 @@ Learn how to install, configure, update, and maintain your GitLab instance.
### Installing GitLab
- [Install](../install/index.md): Requirements, directory structures, and installation methods.
- - [Database load balancing](database_load_balancing.md): Distribute database queries among multiple database servers.
- - [Omnibus support for log forwarding](https://docs.gitlab.com/omnibus/settings/logs.html#udp-log-shipping-gitlab-enterprise-edition-only).
- [Reference architectures](reference_architectures/index.md): Add additional resources to support more users.
- [Installing GitLab on Amazon Web Services (AWS)](../install/aws/index.md): Set up GitLab on Amazon AWS.
- [Geo](geo/index.md): Replicate your GitLab instance to other geographic locations as a read-only fully operational version.
@@ -79,6 +77,8 @@ Learn how to install, configure, update, and maintain your GitLab instance.
- [Enabling and disabling features flags](feature_flags.md): how to enable and
disable GitLab features deployed behind feature flags.
- [Application settings cache expiry interval](application_settings_cache.md)
+- [Database Load Balancing](postgresql/database_load_balancing.md): Distribute database queries among multiple database servers.
+- [Omnibus support for log forwarding](https://docs.gitlab.com/omnibus/settings/logs.html#udp-log-shipping-gitlab-enterprise-edition-only).
#### Customizing GitLab appearance
@@ -133,7 +133,7 @@ Learn how to install, configure, update, and maintain your GitLab instance.
- Instances.
- [Auditor users](auditor_users.md): Users with read-only access to all projects, groups, and other resources on the GitLab instance.
- [Incoming email](incoming_email.md): Configure incoming emails to allow
- users to [reply by email](reply_by_email.md), create [issues by email](../user/project/issues/managing_issues.md#new-issue-via-email) and
+ users to [reply by email](reply_by_email.md), create [issues by email](../user/project/issues/managing_issues.md#by-sending-an-email) and
[merge requests by email](../user/project/merge_requests/creating_merge_requests.md#by-sending-an-email), and to enable [Service Desk](../user/project/service_desk.md).
- [Postfix for incoming email](reply_by_email_postfix_setup.md): Set up a
basic Postfix mail server with IMAP authentication on Ubuntu for incoming
diff --git a/doc/administration/instance_limits.md b/doc/administration/instance_limits.md
index 0b470146b14..bfe59d5277b 100644
--- a/doc/administration/instance_limits.md
+++ b/doc/administration/instance_limits.md
@@ -88,12 +88,8 @@ requests per user. For more information, read
### Files API
-> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/68561) in GitLab 14.3.
-
-FLAG:
-On self-managed GitLab, by default this feature is not available. To make it available,
-ask an administrator to [enable the `files_api_throttling` flag](../administration/feature_flags.md). On GitLab.com, this feature is available but can be configured by GitLab.com administrators only.
-The feature is not ready for production use.
+> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/68561) in GitLab 14.3 [with a flag](../administration/feature_flags.md) named `files_api_throttling`. Disabled by default.
+> - [Generally available](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/75918) in GitLab 14.6. [Feature flag `files_api_throttling`](https://gitlab.com/gitlab-org/gitlab/-/issues/338903) removed.
This setting limits the request rate on the Packages API per user or IP address. For more information, read
[Files API rate limits](../user/admin_area/settings/files_api_rate_limits.md).
@@ -257,7 +253,7 @@ Set the limit to `0` to disable it.
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/237891) in GitLab 13.7.
The [minimum wait time between pull refreshes](../user/project/repository/mirror/index.md)
-defaults to 300 seconds (5 minutes). For example, by default a pull refresh will only run once in a given 300 second period regardless of how many times you try to trigger it.
+defaults to 300 seconds (5 minutes). For example, a pull refresh only runs once in a given 300-second period, regardless of how many times you trigger it.
This setting applies in the context of pull refreshes invoked via the [projects API](../api/projects.md#start-the-pull-mirroring-process-for-a-project), or when forcing an update by selecting the **Update now** (**{retry}**) button within **Settings > Repository > Mirroring repositories**. This setting has no effect on the automatic 30 minute interval schedule used by Sidekiq for [pull mirroring](../user/project/repository/mirror/pull.md).
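
The effect of the minimum wait time can be sketched as a simple throttle. This is an illustration only, not GitLab's implementation: the `PullRefreshThrottle` class and its `allow?` method are hypothetical, and the sketch assumes the interval is measured from the last refresh that actually ran.

```ruby
# Hypothetical throttle: accepts at most one refresh per interval.
class PullRefreshThrottle
  def initialize(min_interval_seconds = 300)
    @min_interval_seconds = min_interval_seconds
    @last_run_at = nil
  end

  # Returns true and records the run when enough time has passed since the
  # last accepted refresh. Returns false otherwise.
  def allow?(now)
    return false if @last_run_at && (now - @last_run_at) < @min_interval_seconds

    @last_run_at = now
    true
  end
end

throttle = PullRefreshThrottle.new
start = Time.at(0)
puts throttle.allow?(start)        # first trigger runs: true
puts throttle.allow?(start + 120)  # 2 minutes later, ignored: false
puts throttle.allow?(start + 360)  # 6 minutes after the first run: true
```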
@@ -400,7 +396,7 @@ limit is checked every time a new trigger is created.
If a new trigger would cause the total number of pipeline triggers to exceed the
limit, the trigger is considered invalid.
-Set the limit to `0` to disable it. Defaults to `0` on self-managed instances.
+Set the limit to `0` to disable it. Defaults to `150` on self-managed instances.
To set this limit to `100` on a self-managed installation, run the following in the
[GitLab Rails console](operations/rails_console.md#starting-a-rails-console-session):
@@ -551,8 +547,8 @@ Plan.default.actual_limits.update!(pages_file_entries: 100)
> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/321368) in GitLab 13.12. Disabled by default.
> - Enabled on GitLab.com in GitLab 14.3.
> - Enabled on self-managed in GitLab 14.4.
-> - Feature flag `ci_runner_limits` removed in GitLab 14.4. You can still use `ci_runner_limits_override`
- to remove limits for a given scope.
+> - Feature flag `ci_runner_limits` removed in GitLab 14.4.
+> - Feature flag `ci_runner_limits_override` removed in GitLab 14.6.
The total number of registered runners is limited at the group and project levels. Each time a new runner is registered,
GitLab checks these limits against runners that have been active in the last 3 months.
@@ -739,7 +735,7 @@ See [Environment Dashboard](../ci/environments/environments_dashboard.md#adding-
[Deploy boards](../user/project/deploy_boards.md) load information from Kubernetes about
Pods and Deployments. However, data over 10 MB for a certain environment read from
-Kubernetes won't be shown.
+Kubernetes aren't shown.
## Merge requests
@@ -762,7 +758,7 @@ prevent any more changes from rendering. For more information about these limits
### Merge request reports size limit
-Reports that go over the 20 MB limit won't be loaded. Affected reports:
+Reports that go over the 20 MB limit aren't loaded. Affected reports:
- [Merge request security reports](../user/project/merge_requests/testing_and_reports_in_merge_requests.md#security-reports)
- [CI/CD parameter `artifacts:expose_as`](../ci/yaml/index.md#artifactsexpose_as)
@@ -826,7 +822,7 @@ See the [Design Management Limitations](../user/project/issues/design_management
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/31009) in GitLab 12.4.
Total number of changes (branches or tags) in a single push. If changes are more
-than the specified limit, hooks won't be executed.
+than the specified limit, hooks are not executed.
More information can be found in these docs:
@@ -848,16 +844,21 @@ More information can be found in the [Push event activities limit and bulk push
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/218017) in GitLab 13.4.
-On GitLab.com, the maximum file size for a package that's uploaded to the [GitLab Package Registry](../user/packages/package_registry/index.md) varies by format:
+The default maximum file size for a package that's uploaded to the [GitLab Package Registry](../user/packages/package_registry/index.md) varies by format:
-- Conan: 5 GB
+- Conan: 3 GB
- Generic: 5 GB
-- Maven: 5 GB
-- npm: 5 GB
-- NuGet: 5 GB
-- PyPI: 5 GB
+- Helm: 5 MB
+- Maven: 3 GB
+- npm: 500 MB
+- NuGet: 500 MB
+- PyPI: 3 GB
+- Terraform: 1 GB
-To set this limit for a self-managed installation, run the following in the
+The [maximum file sizes on GitLab.com](../user/gitlab_com/index.md#package-registry-limits)
+might be different.
+
+To set these limits for a self-managed installation, run the following in the
[GitLab Rails console](operations/rails_console.md#starting-a-rails-console-session):
```ruby
@@ -881,6 +882,9 @@ Plan.default.actual_limits.update!(pypi_max_file_size: 100.megabytes)
# For Debian Packages
Plan.default.actual_limits.update!(debian_max_file_size: 100.megabytes)
+# For Helm Charts
+Plan.default.actual_limits.update!(helm_max_file_size: 100.megabytes)
+
# For Generic Packages
Plan.default.actual_limits.update!(generic_packages_max_file_size: 100.megabytes)
```
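
The default sizes listed earlier in this section can be written down as data to check a file size before upload. This is a minimal plain-Ruby sketch, not a GitLab API: the `DEFAULT_MAX_FILE_SIZE` table and the `within_default_limit?` helper are hypothetical, and it assumes binary units (1 MB = 1024 × 1024 bytes) and an inclusive limit.

```ruby
MEGABYTE = 1024 * 1024
GIGABYTE = 1024 * MEGABYTE

# Default maximum file sizes per package format, as listed in this section.
DEFAULT_MAX_FILE_SIZE = {
  conan: 3 * GIGABYTE,
  generic: 5 * GIGABYTE,
  helm: 5 * MEGABYTE,
  maven: 3 * GIGABYTE,
  npm: 500 * MEGABYTE,
  nuget: 500 * MEGABYTE,
  pypi: 3 * GIGABYTE,
  terraform: 1 * GIGABYTE
}.freeze

# Returns true when a file of `size_in_bytes` fits the default limit for
# `format`. Raises KeyError for a format that is not in the table.
def within_default_limit?(format, size_in_bytes)
  size_in_bytes <= DEFAULT_MAX_FILE_SIZE.fetch(format)
end

puts within_default_limit?(:helm, 4 * MEGABYTE)   # prints "true"
puts within_default_limit?(:npm, 600 * MEGABYTE)  # prints "false"
```

If you change a limit with `Plan.default.actual_limits.update!`, this table no longer reflects your instance.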
diff --git a/doc/administration/instance_review.md b/doc/administration/instance_review.md
index 62897651166..872cdb239bd 100644
--- a/doc/administration/instance_review.md
+++ b/doc/administration/instance_review.md
@@ -12,7 +12,7 @@ If you run a medium-sized self-managed instance (50+ users) of a free version of
[either Community Edition or unlicensed Enterprise Edition](https://about.gitlab.com/install/ce-or-ee/),
you qualify for a free Instance Review.
-1. Sign in as a user with Administrator [role](../user/permissions.md).
+1. Sign in as an administrator.
1. In the top menu, click your user icon, and select
**Get a free instance review**:
diff --git a/doc/administration/integration/terminal.md b/doc/administration/integration/terminal.md
index 07b9ba87d8e..f570c9b559f 100644
--- a/doc/administration/integration/terminal.md
+++ b/doc/administration/integration/terminal.md
@@ -17,7 +17,7 @@ GitLab uses these credentials to provide access to
[web terminals](../../ci/environments/index.md#web-terminals-deprecated) for environments.
NOTE:
-Only project maintainers and owners can access web terminals.
+Only users with at least the [Maintainer role](../../user/permissions.md) for the project can access web terminals.
## How it works
diff --git a/doc/administration/job_artifacts.md b/doc/administration/job_artifacts.md
index 46a9ee11679..64b5ddbd165 100644
--- a/doc/administration/job_artifacts.md
+++ b/doc/administration/job_artifacts.md
@@ -2,20 +2,18 @@
stage: Verify
group: Testing
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
-type: reference, howto
---
# Jobs artifacts administration **(FREE SELF)**
This is the administration documentation. For the user guide see [pipelines/job_artifacts](../ci/pipelines/job_artifacts.md).
-Artifacts is a list of files and directories which are attached to a job after it
-finishes. This feature is enabled by default in all GitLab installations. Keep reading
-if you want to know how to disable it.
+An artifact is a list of files and directories attached to a job after it
+finishes. This feature is enabled by default in all GitLab installations.
## Disabling job artifacts
-To disable artifacts site-wide, follow the steps below.
+To disable artifacts site-wide:
**In Omnibus installations:**
@@ -41,7 +39,7 @@ To disable artifacts site-wide, follow the steps below.
## Storing job artifacts
GitLab Runner can upload an archive containing the job artifacts to GitLab. By default,
-this is done when the job succeeds, but can also be done on failure, or always, via the
+this is done when the job succeeds, but can also be done on failure, or always, with the
[`artifacts:when`](../ci/yaml/index.md#artifactswhen) parameter.
Most artifacts are compressed by GitLab Runner before being sent to the coordinator. The exception to this is
@@ -84,8 +82,6 @@ _The artifacts are stored by default in
### Using object storage
-> Introduced in GitLab 11.0: Support for `direct_upload` to S3.
-
If you don't want to use the local disk where GitLab is installed to store the
artifacts, you can use an object storage like AWS S3 instead.
This configuration relies on valid AWS credentials to be configured already.
@@ -108,7 +104,9 @@ In GitLab 13.2 and later, we recommend using the
[consolidated object storage settings](object_storage.md#consolidated-object-storage-configuration).
This section describes the earlier configuration format.
-For source installations the following settings are nested under `artifacts:` and then `object_store:`. On Omnibus GitLab installs they are prefixed by `artifacts_object_store_`.
+For source installations the following settings are nested under `artifacts:`
+and then `object_store:`. On Omnibus GitLab installs they are prefixed by
+`artifacts_object_store_`.
| Setting | Default | Description |
|---------------------|---------|-------------|
@@ -143,8 +141,9 @@ _The artifacts are stored by default in
}
```
- NOTE: For GitLab 9.4+, if you're using AWS IAM profiles, be sure to omit the
- AWS access key and secret access key/value pairs. For example:
+ NOTE:
+ If you're using AWS IAM profiles, omit the AWS access key and secret access
+ key/value pairs. For example:
```ruby
gitlab_rails['artifacts_object_store_connection'] = {
@@ -155,37 +154,7 @@ _The artifacts are stored by default in
```
1. Save the file and [reconfigure GitLab](restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
-1. Migrate any existing local artifacts to the object storage:
-
- ```shell
- gitlab-rake gitlab:artifacts:migrate
- ```
-
-1. Optional: Verify all files migrated properly.
- From [PostgreSQL console](https://docs.gitlab.com/omnibus/settings/database.html#connecting-to-the-bundled-postgresql-database)
- (`sudo gitlab-psql -d gitlabhq_production`) verify `objectstg` below (where `file_store=2`) has count of all artifacts:
-
- ```shell
- gitlabhq_production=# SELECT count(*) AS total, sum(case when file_store = '1' then 1 else 0 end) AS filesystem, sum(case when file_store = '2' then 1 else 0 end) AS objectstg FROM ci_job_artifacts;
-
- total | filesystem | objectstg
- ------+------------+-----------
- 2409 | 0 | 2409
- ```
-
- Verify no files on disk in `artifacts` folder:
-
- ```shell
- sudo find /var/opt/gitlab/gitlab-rails/shared/artifacts -type f | grep -v tmp/cache | wc -l
- ```
-
- In some cases, you may need to run the [orphan artifact file cleanup Rake task](../raketasks/cleanup.md#remove-orphan-artifact-files)
- to clean up orphaned artifacts.
-
-WARNING:
-JUnit test report artifact (`junit.xml.gz`) migration
-[was not supported until GitLab 12.8](https://gitlab.com/gitlab-org/gitlab/-/issues/27698#note_317190991)
-by the `gitlab:artifacts:migrate` script.
+1. [Migrate any existing local artifacts to the object storage](#migrating-to-object-storage).
**In installations from source:**
@@ -209,36 +178,7 @@ _The artifacts are stored by default in
```
1. Save the file and [restart GitLab](restart_gitlab.md#installations-from-source) for the changes to take effect.
-1. Migrate any existing local artifacts to the object storage:
-
- ```shell
- sudo -u git -H bundle exec rake gitlab:artifacts:migrate RAILS_ENV=production
- ```
-
-1. Optional: Verify all files migrated properly.
- From PostgreSQL console (`sudo -u git -H psql -d gitlabhq_production`) verify `objectstg` below (where `file_store=2`) has count of all artifacts:
-
- ```shell
- gitlabhq_production=# SELECT count(*) AS total, sum(case when file_store = '1' then 1 else 0 end) AS filesystem, sum(case when file_store = '2' then 1 else 0 end) AS objectstg FROM ci_job_artifacts;
-
- total | filesystem | objectstg
- ------+------------+-----------
- 2409 | 0 | 2409
- ```
-
- Verify no files on disk in `artifacts` folder:
-
- ```shell
- sudo find /var/opt/gitlab/gitlab-rails/shared/artifacts -type f | grep -v tmp/cache | wc -l
- ```
-
- In some cases, you may need to run the [orphan artifact file cleanup Rake task](../raketasks/cleanup.md#remove-orphan-artifact-files)
- to clean up orphaned artifacts.
-
-WARNING:
-JUnit test report artifact (`junit.xml.gz`) migration
-[was not supported until GitLab 12.8](https://gitlab.com/gitlab-org/gitlab/-/issues/27698#note_317190991)
-by the `gitlab:artifacts:migrate` script.
+1. [Migrate any existing local artifacts to the object storage](#migrating-to-object-storage).
### OpenStack example
@@ -268,11 +208,7 @@ _The uploads are stored by default in
```
1. Save the file and [reconfigure GitLab](restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
-1. Migrate any existing local artifacts to the object storage:
-
- ```shell
- gitlab-rake gitlab:artifacts:migrate
- ```
+1. [Migrate any existing local artifacts to the object storage](#migrating-to-object-storage).
---
@@ -303,11 +239,55 @@ _The uploads are stored by default in
```
1. Save the file and [restart GitLab](restart_gitlab.md#installations-from-source) for the changes to take effect.
-1. Migrate any existing local artifacts to the object storage:
+1. [Migrate any existing local artifacts to the object storage](#migrating-to-object-storage).
- ```shell
- sudo -u git -H bundle exec rake gitlab:artifacts:migrate RAILS_ENV=production
- ```
+### Migrating to object storage
+
+After [configuring the object storage](#using-object-storage), use the following task to
+migrate existing job artifacts from the local storage to the remote storage.
+The processing is done in a background worker and requires **no downtime**.
+
+**In Omnibus installations:**
+
+```shell
+gitlab-rake gitlab:artifacts:migrate
+```
+
+**In installations from source:**
+
+```shell
+sudo -u git -H bundle exec rake gitlab:artifacts:migrate RAILS_ENV=production
+```
+
+You can optionally track progress and verify that all job artifacts migrated successfully using the
+[PostgreSQL console](https://docs.gitlab.com/omnibus/settings/database.html#connecting-to-the-bundled-postgresql-database):
+
+- `sudo gitlab-rails dbconsole` for Omnibus GitLab instances.
+- `sudo -u git -H psql -d gitlabhq_production` for source-installed instances.
+
+Verify that `objectstg` below (where `file_store=2`) has the count of all job artifacts:
+
+```shell
+gitlabhq_production=# SELECT count(*) AS total, sum(case when file_store = '1' then 1 else 0 end) AS filesystem, sum(case when file_store = '2' then 1 else 0 end) AS objectstg FROM ci_job_artifacts;
+
+total | filesystem | objectstg
+------+------------+-----------
+ 19 | 0 | 19
+```
+
+Verify that there are no files on disk in the `artifacts` folder:
+
+```shell
+sudo find /var/opt/gitlab/gitlab-rails/shared/artifacts -type f | grep -v tmp | wc -l
+```
+
+In some cases, you need to run the [orphan artifact file cleanup Rake task](../raketasks/cleanup.md#remove-orphan-artifact-files)
+to clean up orphaned artifacts.
+
+WARNING:
+JUnit test report artifact (`junit.xml.gz`) migration
+[was not supported until GitLab 12.8](https://gitlab.com/gitlab-org/gitlab/-/issues/27698#note_317190991)
+by the `gitlab:artifacts:migrate` Rake task.
### Migrating from object storage to local storage
@@ -503,13 +483,13 @@ If you need to manually remove job artifacts associated with multiple jobs while
- `3.months.ago`
- `1.year.ago`
- `erase_erasable_artifacts!` is a synchronous method, and upon execution, the artifacts are removed immediately.
- They are not scheduled via some background queue.
+ `erase_erasable_artifacts!` is a synchronous method, and upon execution the artifacts are immediately removed;
+ they are not scheduled by a background queue.
#### Delete job artifacts and logs from jobs completed before a specific date
WARNING:
-These commands remove data permanently from the database and from disk. We
+These commands remove data permanently from both the database and disk. We
highly recommend running them only under the guidance of a Support Engineer, or
running them in a test environment with a backup of the instance ready to be
restored, just in case.
@@ -517,7 +497,7 @@ restored, just in case.
If you need to manually remove **all** job artifacts associated with multiple jobs,
**including job logs**, this can be done from the Rails console (`sudo gitlab-rails console`):
-1. Select jobs to be deleted:
+1. Select the jobs to be deleted:
To select jobs with artifacts for a single project:
@@ -538,7 +518,7 @@ If you need to manually remove **all** job artifacts associated with multiple jo
admin_user = User.find_by(username: 'username')
```
-1. Erase job artifacts and logs older than a specific date:
+1. Erase the job artifacts and logs older than a specific date:
```ruby
builds_to_clear = builds_with_artifacts.where("finished_at < ?", 1.week.ago)
@@ -563,34 +543,34 @@ If you need to manually remove **all** job artifacts associated with multiple jo
### Error `Downloading artifacts from coordinator... not found`
-When a job tries to download artifacts from an earlier job, you might receive an error similar to:
+When a job attempts to download artifacts from an earlier job, you might receive an error message similar to:
```plaintext
Downloading artifacts from coordinator... not found id=12345678 responseStatus=404 Not Found
```
-This might be caused by a `gitlab.rb` file with the following configuration:
+This can be caused by a `gitlab.rb` file with the following configuration:
```ruby
gitlab_rails['artifacts_object_store_background_upload'] = false
gitlab_rails['artifacts_object_store_direct_upload'] = true
```
-To prevent this, comment out or remove those lines, or switch to their [default values](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/files/gitlab-config-template/gitlab.rb.template),
+To prevent this, comment out or remove those lines, or switch to their [default values](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/files/gitlab-config-template/gitlab.rb.template), and
then run `sudo gitlab-ctl reconfigure`.
### Job artifact upload fails with error 500
If you are using object storage for artifacts and a job artifact fails to upload,
-you can check:
+review:
-- The job log for an error similar to:
+- The job log for an error message similar to:
```plaintext
WARNING: Uploading artifacts as "archive" to coordinator... failed id=12345 responseStatus=500 Internal Server Error status=500 token=abcd1234
```
-- The [workhorse log](logs.md#workhorse-logs) for an error similar to:
+- The [workhorse log](logs.md#workhorse-logs) for an error message similar to:
```json
{"error":"MissingRegion: could not find region configuration","level":"error","msg":"error uploading S3 session","time":"2021-03-16T22:10:55-04:00"}
diff --git a/doc/administration/lfs/index.md b/doc/administration/lfs/index.md
index d2f220e3795..3fe6a94ef13 100644
--- a/doc/administration/lfs/index.md
+++ b/doc/administration/lfs/index.md
@@ -56,7 +56,7 @@ In `config/gitlab.yml`:
## Storing LFS objects in remote object storage
You can store LFS objects in remote object storage. This allows you
-to offload reads and writes to the local disk, and free up disk space significantly.
+to reduce reads and writes to the local disk, and free up disk space significantly.
GitLab is tightly integrated with `Fog`, so you can refer to its [documentation](http://fog.io/about/provider_documentation.html)
to check which storage services can be integrated with GitLab.
You can also use external object storage in a private local network. For example,
@@ -98,32 +98,6 @@ See [the available connection settings for different providers](../object_storag
Here is a configuration example with S3.
-### Manual uploading to an object storage
-
-There are two ways to manually do the same thing as automatic uploading (described above).
-
-**Option 1: Rake task**
-
-```shell
-gitlab-rake gitlab:lfs:migrate
-```
-
-**Option 2: Rails console**
-
-Log into the Rails console:
-
-```shell
-sudo gitlab-rails console
-```
-
-Upload LFS files manually
-
-```ruby
-LfsObject.where(file_store: [nil, 1]).find_each do |lfs_object|
- lfs_object.file.migrate!(ObjectStorage::Store::REMOTE) if lfs_object.file.file.exists?
-end
-```
-
### S3 for Omnibus installations
On Omnibus GitLab installations, the settings are prefixed by `lfs_object_store_`:
@@ -146,32 +120,10 @@ On Omnibus GitLab installations, the settings are prefixed by `lfs_object_store_
```
1. Save the file, and then [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
-1. Migrate any existing local LFS objects to the object storage:
-
- ```shell
- gitlab-rake gitlab:lfs:migrate
- ```
-
- This migrates existing LFS objects to object storage. New LFS objects
+1. [Migrate any existing local LFS objects to the object storage](#migrating-to-object-storage).
+ New LFS objects
are forwarded to object storage unless
`gitlab_rails['lfs_object_store_background_upload']` and `gitlab_rails['lfs_object_store_direct_upload']` are set to `false`.
-1. (Optional) Verify all files migrated properly.
- From [PostgreSQL console](https://docs.gitlab.com/omnibus/settings/database.html#connecting-to-the-bundled-postgresql-database)
- (`sudo gitlab-psql -d gitlabhq_production`) verify `objectstg` below (where `file_store=2`) has count of all artifacts:
-
- ```shell
- gitlabhq_production=# SELECT count(*) AS total, sum(case when file_store = '1' then 1 else 0 end) AS filesystem, sum(case when file_store = '2' then 1 else 0 end) AS objectstg FROM lfs_objects;
-
- total | filesystem | objectstg
- ------+------------+-----------
- 2409 | 0 | 2409
- ```
-
- Verify no files on disk in `artifacts` folder:
-
- ```shell
- sudo find /var/opt/gitlab/gitlab-rails/shared/lfs-objects -type f | grep -v tmp/cache | wc -l
- ```
### S3 for installations from source
@@ -199,31 +151,68 @@ For source installations the settings are nested under `lfs:` and then
```
1. Save the file, and then [restart GitLab](../restart_gitlab.md#installations-from-source) for the changes to take effect.
-1. Migrate any existing local LFS objects to the object storage:
+1. [Migrate any existing local LFS objects to the object storage](#migrating-to-object-storage).
+ New LFS objects
+ are forwarded to object storage unless
+ `background_upload` and `direct_upload` are set to `false`.
- ```shell
- sudo -u git -H bundle exec rake gitlab:lfs:migrate RAILS_ENV=production
- ```
+### Migrating to object storage
- This migrates existing LFS objects to object storage. New LFS objects
- are forwarded to object storage unless `background_upload` and `direct_upload` is set to
- `false`.
-1. (Optional) Verify all files migrated properly.
- From PostgreSQL console (`sudo -u git -H psql -d gitlabhq_production`) verify `objectstg` below (where `file_store=2`) has count of all artifacts:
+**Option 1: Rake task**
- ```shell
- gitlabhq_production=# SELECT count(*) AS total, sum(case when file_store = '1' then 1 else 0 end) AS filesystem, sum(case when file_store = '2' then 1 else 0 end) AS objectstg FROM lfs_objects;
+After [configuring the object storage](#storing-lfs-objects-in-remote-object-storage), use the following task to
+migrate existing LFS objects from the local storage to the remote storage.
+The processing is done in a background worker and requires **no downtime**.
- total | filesystem | objectstg
- ------+------------+-----------
- 2409 | 0 | 2409
- ```
+For Omnibus GitLab:
+
+```shell
+sudo gitlab-rake "gitlab:lfs:migrate"
+```
- Verify no files on disk in `artifacts` folder:
+For installations from source:
- ```shell
- sudo find /var/opt/gitlab/gitlab-rails/shared/lfs-objects -type f | grep -v tmp/cache | wc -l
- ```
+```shell
+RAILS_ENV=production sudo -u git -H bundle exec rake gitlab:lfs:migrate
+```
+
+You can optionally track progress and verify that all LFS objects migrated successfully using the
+[PostgreSQL console](https://docs.gitlab.com/omnibus/settings/database.html#connecting-to-the-bundled-postgresql-database):
+
+- `sudo gitlab-rails dbconsole` for Omnibus GitLab instances.
+- `sudo -u git -H psql -d gitlabhq_production` for source-installed instances.
+
+Verify that `objectstg` below (where `file_store=2`) has the count of all LFS objects:
+
+```shell
+gitlabhq_production=# SELECT count(*) AS total, sum(case when file_store = '1' then 1 else 0 end) AS filesystem, sum(case when file_store = '2' then 1 else 0 end) AS objectstg FROM lfs_objects;
+
+total | filesystem | objectstg
+------+------------+-----------
+ 2409 | 0 | 2409
+```
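As a rough illustration of how to read these counts, the sketch below runs an equivalent query against a throwaway in-memory SQLite table. SQLite is an assumption made only to keep the example self-contained (GitLab itself uses PostgreSQL), and the two-column table layout and the `migration_summary` helper are hypothetical. It also shows that rows with a `NULL` `file_store` count toward `total` but toward neither of the other two columns, so `total` and `objectstg` only match once every row has `file_store = 2`.

```python
import sqlite3

# Same shape as the documented query, with integer literals for SQLite.
QUERY = """
SELECT count(*) AS total,
       sum(CASE WHEN file_store = 1 THEN 1 ELSE 0 END) AS filesystem,
       sum(CASE WHEN file_store = 2 THEN 1 ELSE 0 END) AS objectstg
FROM lfs_objects
"""


def migration_summary(file_store_values):
    """Load hypothetical file_store values into a scratch table and summarize them."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE lfs_objects (id INTEGER PRIMARY KEY, file_store INTEGER)"
        )
        conn.executemany(
            "INSERT INTO lfs_objects (file_store) VALUES (?)",
            [(value,) for value in file_store_values],
        )
        total, filesystem, objectstg = conn.execute(QUERY).fetchone()
    finally:
        conn.close()
    # sum() yields NULL on an empty table, so normalize to 0.
    filesystem = filesystem or 0
    objectstg = objectstg or 0
    return {
        "total": total,
        "filesystem": filesystem,
        "objectstg": objectstg,
        "complete": total == objectstg,
    }
```

For example, `migration_summary([2, 2, 1])` reports one object still on the local file system, so `complete` is `False`.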
+
+Verify that there are no files on disk in the `lfs-objects` folder:
+
+```shell
+sudo find /var/opt/gitlab/gitlab-rails/shared/lfs-objects -type f | grep -v tmp | wc -l
+```
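The same check can be scripted. The following is a minimal sketch, not part of GitLab: `count_remaining_files` is a hypothetical helper that approximates the `find ... -type f | grep -v tmp | wc -l` pipeline. One deliberate difference: it filters on the path relative to the root directory, whereas `grep -v tmp` filters on the full path. For the default Omnibus path shown above the result is the same, because that path does not contain `tmp`.

```python
import os


def count_remaining_files(root, exclude="tmp"):
    """Count regular files under root whose relative path does not contain `exclude`."""
    remaining = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # `find -type f` ignores symbolic links, so skip them here too.
            if os.path.islink(full_path) or not os.path.isfile(full_path):
                continue
            relative = os.path.relpath(full_path, root)
            if exclude not in relative:
                remaining += 1
    return remaining
```

A result of `0` suggests that no LFS objects are left on local disk.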
+
+**Option 2: Rails console**
+
+Log into the Rails console:
+
+```shell
+sudo gitlab-rails console
+```
+
+Upload LFS files manually:
+
+```ruby
+LfsObject.where(file_store: [nil, 1]).find_each do |lfs_object|
+ lfs_object.file.migrate!(ObjectStorage::Store::REMOTE) if lfs_object.file.file.exists?
+end
+```
### Migrating back to local storage
diff --git a/doc/administration/logs.md b/doc/administration/logs.md
index bf74a96a627..263fe699529 100644
--- a/doc/administration/logs.md
+++ b/doc/administration/logs.md
@@ -245,8 +245,6 @@ The request was processed by `Projects::TreeController`.
## `api_json.log`
-> Introduced in GitLab 10.0.
-
Depending on your installation method, this file is located at:
- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/api_json.log`
@@ -296,7 +294,7 @@ Depending on your installation method, this file is located at:
- Installations from source: `/home/git/gitlab/log/application.log`
It helps you discover events happening in your instance such as user creation
-and project removal. For example:
+and project deletion. For example:
```plaintext
October 06, 2014 11:56: User "Administrator" (admin@example.com) was created
@@ -367,8 +365,6 @@ like this example:
## `kubernetes.log`
-> Introduced in GitLab 11.6.
-
Depending on your installation method, this file is located at:
- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/kubernetes.log`
@@ -696,8 +692,6 @@ on a project.
## `importer.log`
-> Introduced in GitLab 11.3.
-
Depending on your installation method, this file is located at:
- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/importer.log`
@@ -830,7 +824,7 @@ are generated in a location based on your installation method:
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/15442) in GitLab 12.3.
-Contains details of GitLab [Database Load Balancing](database_load_balancing.md).
+Contains details of GitLab [Database Load Balancing](postgresql/database_load_balancing.md).
Depending on your installation method, this file is located at:
- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/database_load_balancing.log`
@@ -915,8 +909,6 @@ For example:
## `geo.log` **(PREMIUM SELF)**
-> Introduced in 9.5.
-
Geo stores structured log messages in a `geo.log` file. For Omnibus GitLab
installations, this file is at `/var/log/gitlab/gitlab-rails/geo.log`.
@@ -934,8 +926,6 @@ This message shows that Geo detected that a repository update was needed for pro
## `update_mirror_service_json.log`
-> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/commit/7f637e2af7006dc2b1b2649d9affc0b86cfb33c4) in GitLab 11.12.
-
Depending on your installation method, this file is located at:
- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/update_mirror_service_json.log`
@@ -1057,9 +1047,9 @@ For Omnibus GitLab installations, GitLab Monitor logs are in `/var/log/gitlab/gi
For Omnibus GitLab installations, GitLab Exporter logs are in `/var/log/gitlab/gitlab-exporter/`.
-## GitLab Kubernetes Agent Server
+## GitLab Agent Server
-For Omnibus GitLab installations, GitLab Kubernetes Agent Server logs are
+For Omnibus GitLab installations, GitLab Agent Server logs are
in `/var/log/gitlab/gitlab-kas/`.
## Praefect Logs
diff --git a/doc/administration/monitoring/gitlab_self_monitoring_project/index.md b/doc/administration/monitoring/gitlab_self_monitoring_project/index.md
index 90d0b65dbe5..1cf4e5a25ba 100644
--- a/doc/administration/monitoring/gitlab_self_monitoring_project/index.md
+++ b/doc/administration/monitoring/gitlab_self_monitoring_project/index.md
@@ -20,9 +20,8 @@ project called **Monitoring** is created:
The project is created specifically for visualizing and configuring the monitoring of your GitLab
instance.
-When the project and group are created, all administrators are added as maintainers. As an
-administrator, you can add new members to the group to give them the
-[Maintainer role](../../../user/permissions.md) for the project.
+When the project and group are created, all administrators are given the [Maintainer role](../../../user/permissions.md).
+As an administrator, you can add new members to the group to give them the Maintainer role for the project.
This project can be used to:
diff --git a/doc/administration/monitoring/performance/performance_bar.md b/doc/administration/monitoring/performance/performance_bar.md
index d798feb71a9..14a560223f9 100644
--- a/doc/administration/monitoring/performance/performance_bar.md
+++ b/doc/administration/monitoring/performance/performance_bar.md
@@ -26,11 +26,10 @@ From left to right, the performance bar displays:
details for each query:
- **In a transaction**: shows up below the query if it was executed in
the context of a transaction
- - **Role**: shows up when [database load
- balancing](../../database_load_balancing.md) is enabled. It shows
- which server role was used for the query. "Primary" means that the query
- was sent to the read/write primary server. "Replica" means it was sent
- to a read-only replica.
+ - **Role**: shows up when [Database Load Balancing](../../postgresql/database_load_balancing.md)
+ is enabled. It shows which server role was used for the query.
+ "Primary" means that the query was sent to the read/write primary server.
+ "Replica" means it was sent to a read-only replica.
- **Config name**: shows up only when the
`GITLAB_MULTIPLE_DATABASE_METRICS` environment variable is set. This is
used to distinguish between different databases configured for different
diff --git a/doc/administration/monitoring/prometheus/gitlab_metrics.md b/doc/administration/monitoring/prometheus/gitlab_metrics.md
index 2f9c1e3bc9c..4a504b07a1b 100644
--- a/doc/administration/monitoring/prometheus/gitlab_metrics.md
+++ b/doc/administration/monitoring/prometheus/gitlab_metrics.md
@@ -187,16 +187,25 @@ configuration option in `gitlab.yml`. These metrics are served from the
| `geo_repositories` | Gauge | 10.2 | Total number of repositories available on primary | `url` |
| `geo_repositories_synced` | Gauge | 10.2 | Number of repositories synced on secondary | `url` |
| `geo_repositories_failed` | Gauge | 10.2 | Number of repositories failed to sync on secondary | `url` |
-| `geo_lfs_objects` | Gauge | 10.2 | Total number of LFS objects available on primary | `url` |
-| `geo_lfs_objects_synced` | Gauge | 10.2 | Number of LFS objects synced on secondary | `url` |
-| `geo_lfs_objects_failed` | Gauge | 10.2 | Number of LFS objects failed to sync on secondary | `url` |
+| `geo_lfs_objects` | Gauge | 10.2 | Number of LFS objects on primary | `url` |
+| `geo_lfs_objects_checksummed` | Gauge | 14.6 | Number of LFS objects checksummed successfully on primary | `url` |
+| `geo_lfs_objects_checksum_failed` | Gauge | 14.6 | Number of LFS objects failed to calculate the checksum on primary | `url` |
+| `geo_lfs_objects_checksum_total` | Gauge | 14.6 | Number of LFS objects tried to checksum on primary | `url` |
+| `geo_lfs_objects_synced` | Gauge | 10.2 | Number of syncable LFS objects synced on secondary | `url` |
+| `geo_lfs_objects_failed` | Gauge | 10.2 | Number of syncable LFS objects failed to sync on secondary | `url` |
+| `geo_lfs_objects_registry` | Gauge | 14.6 | Number of LFS objects in the registry | `url` |
+| `geo_lfs_objects_verified` | Gauge | 14.6 | Number of LFS objects verified on secondary | `url` |
+| `geo_lfs_objects_verification_failed` | Gauge | 14.6 | Number of LFS objects' verifications failed on secondary | `url` |
+| `geo_lfs_objects_verification_total` | Gauge | 14.6 | Number of LFS objects' verifications tried on secondary | `url` |
+| `geo_attachments` | Gauge | 10.2 | Total number of file attachments available on primary | `url` |
+| `geo_attachments_synced` | Gauge | 10.2 | Number of attachments synced on secondary | `url` |
+| `geo_attachments_failed` | Gauge | 10.2 | Number of attachments failed to sync on secondary | `url` |
| `geo_last_event_id` | Gauge | 10.2 | Database ID of the latest event log entry on the primary | `url` |
| `geo_last_event_timestamp` | Gauge | 10.2 | UNIX timestamp of the latest event log entry on the primary | `url` |
| `geo_cursor_last_event_id` | Gauge | 10.2 | Last database ID of the event log processed by the secondary | `url` |
| `geo_cursor_last_event_timestamp` | Gauge | 10.2 | Last UNIX timestamp of the event log processed by the secondary | `url` |
| `geo_status_failed_total` | Counter | 10.2 | Number of times retrieving the status from the Geo Node failed | `url` |
| `geo_last_successful_status_check_timestamp` | Gauge | 10.2 | Last timestamp when the status was successfully updated | `url` |
-| `geo_lfs_objects_synced_missing_on_primary` | Gauge | 10.7 | Number of LFS objects marked as synced due to the file missing on the primary | `url` |
| `geo_job_artifacts_synced_missing_on_primary` | Gauge | 10.7 | Number of job artifacts marked as synced due to the file missing on the primary | `url` |
| `geo_repositories_checksummed` | Gauge | 10.7 | Number of repositories checksummed on primary | `url` |
| `geo_repositories_checksum_failed` | Gauge | 10.7 | Number of repositories failed to calculate the checksum on primary | `url` |
@@ -253,15 +262,15 @@ configuration option in `gitlab.yml`. These metrics are served from the
| `geo_group_wiki_repositories_failed` | Gauge | 13.10 | Number of syncable group wikis failed on secondary | `url` |
| `geo_group_wiki_repositories_registry` | Gauge | 13.10 | Number of syncable group wikis in the registry | `url` |
| `geo_pages_deployments` | Gauge | 14.3 | Number of pages deployments on primary | `url` |
-| `geo_pages_deployments_checksum_total` | Gauge | 14.3 | Number of pages deployments tried to checksum on primary | `url` |
-| `geo_pages_deployments_checksummed` | Gauge | 14.3 | Number of pages deployments successfully checksummed on primary | `url` |
-| `geo_pages_deployments_checksum_failed` | Gauge | 14.3 | Number of pages deployments failed to calculate the checksum on primary | `url` |
+| `geo_pages_deployments_checksum_total` | Gauge | 14.6 | Number of pages deployments tried to checksum on primary | `url` |
+| `geo_pages_deployments_checksummed` | Gauge | 14.6 | Number of pages deployments successfully checksummed on primary | `url` |
+| `geo_pages_deployments_checksum_failed` | Gauge | 14.6 | Number of pages deployments failed to calculate the checksum on primary | `url` |
| `geo_pages_deployments_synced` | Gauge | 14.3 | Number of syncable pages deployments synced on secondary | `url` |
| `geo_pages_deployments_failed` | Gauge | 14.3 | Number of syncable pages deployments failed to sync on secondary | `url` |
| `geo_pages_deployments_registry` | Gauge | 14.3 | Number of pages deployments in the registry | `url` |
-| `geo_pages_deployments_verification_total` | Gauge | 14.3 | Number of pages deployments verifications tried on secondary | `url` |
-| `geo_pages_deployments_verified` | Gauge | 14.3 | Number of pages deployments verified on secondary | `url` |
-| `geo_pages_deployments_verification_failed` | Gauge | 14.3 | Number of pages deployments verifications failed on secondary | `url` |
+| `geo_pages_deployments_verification_total` | Gauge | 14.6 | Number of pages deployments verifications tried on secondary | `url` |
+| `geo_pages_deployments_verified` | Gauge | 14.6 | Number of pages deployments verified on secondary | `url` |
+| `geo_pages_deployments_verification_failed` | Gauge | 14.6 | Number of pages deployments verifications failed on secondary | `url` |
| `limited_capacity_worker_running_jobs` | Gauge | 13.5 | Number of running jobs | `worker` |
| `limited_capacity_worker_max_running_jobs` | Gauge | 13.5 | Maximum number of running jobs | `worker` |
| `limited_capacity_worker_remaining_work_count` | Gauge | 13.5 | Number of jobs waiting to be enqueued | `worker` |
@@ -272,6 +281,12 @@ configuration option in `gitlab.yml`. These metrics are served from the
| `geo_uploads_synced` | Gauge | 14.1 | Number of uploads synced on secondary | `url` |
| `geo_uploads_failed` | Gauge | 14.1 | Number of syncable uploads failed to sync on secondary | `url` |
| `geo_uploads_registry` | Gauge | 14.1 | Number of uploads in the registry | `url` |
+| `geo_uploads_checksum_total` | Gauge | 14.6 | Number of uploads tried to checksum on primary | `url` |
+| `geo_uploads_checksummed` | Gauge | 14.6 | Number of uploads successfully checksummed on primary | `url` |
+| `geo_uploads_checksum_failed` | Gauge | 14.6 | Number of uploads failed to calculate the checksum on primary | `url` |
+| `geo_uploads_verification_total` | Gauge | 14.6 | Number of uploads verifications tried on secondary | `url` |
+| `geo_uploads_verified` | Gauge | 14.6 | Number of uploads verified on secondary | `url` |
+| `geo_uploads_verification_failed` | Gauge | 14.6 | Number of uploads verifications failed on secondary | `url` |
| `gitlab_sli:rails_request_apdex:total` | Counter | 14.4 | The number of request-apdex measurements, [more information in the development documentation](../../../development/application_slis/rails_request_apdex.md) | `endpoint_id`, `feature_category`, `request_urgency` |
| `gitlab_sli:rails_request_apdex:success_total` | Counter | 14.4 | The number of successful requests that met the target duration for their urgency. Divide by `gitlab_sli:rails_request_apdex:total` to get a success ratio | `endpoint_id`, `feature_category`, `request_urgency` |
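The success ratio mentioned for the two apdex counters is a plain division. A minimal sketch, assuming you have already read both counter values for the same label set (`apdex_ratio` is a hypothetical helper, not a GitLab or Prometheus API):

```python
def apdex_ratio(success_total, total):
    """Return success_total / total, or None when no measurements exist yet."""
    if total <= 0:
        # Avoid dividing by zero before any request has been measured.
        return None
    return success_total / total
```

Because both metrics are counters, you would typically compare their increases over a time window rather than their raw values.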
diff --git a/doc/administration/monitoring/prometheus/index.md b/doc/administration/monitoring/prometheus/index.md
index e86ca596955..3268c0fc14c 100644
--- a/doc/administration/monitoring/prometheus/index.md
+++ b/doc/administration/monitoring/prometheus/index.md
@@ -259,9 +259,10 @@ To use an external Prometheus server:
- 1.1.1.1:9229
- job_name: gitlab-rails
metrics_path: "/-/metrics"
+ scheme: https
static_configs:
- targets:
- - 1.1.1.1:8080
+ - 1.1.1.1
- job_name: gitlab-sidekiq
static_configs:
- targets:
@@ -287,6 +288,11 @@ To use an external Prometheus server:
- 1.1.1.1:9236
```
+ WARNING:
+ The `gitlab-rails` job in the snippet assumes that GitLab is reachable through HTTPS. If your
+ deployment doesn't use HTTPS, adapt the job configuration to use the `http` scheme and port
+ 80.
+
1. Reload the Prometheus server.
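The HTTPS and HTTP variants of the `gitlab-rails` scrape job can be sketched as follows. This is an illustration only: `gitlab_rails_job` is a hypothetical helper that returns the job as a Python dictionary mirroring the YAML keys in the snippet, and the explicit `:80` suffix for the HTTP case is an assumption based on the warning's wording.

```python
def gitlab_rails_job(host, use_https=True):
    """Build the gitlab-rails scrape job for an HTTPS or plain HTTP deployment."""
    scheme = "https" if use_https else "http"
    # With HTTPS the target is the bare host; for HTTP, port 80 is spelled out.
    target = host if use_https else f"{host}:80"
    return {
        "job_name": "gitlab-rails",
        "metrics_path": "/-/metrics",
        "scheme": scheme,
        "static_configs": [{"targets": [target]}],
    }
```

Calling `gitlab_rails_job("1.1.1.1")` reproduces the job shown in the snippet, while `gitlab_rails_job("1.1.1.1", use_https=False)` gives the plain HTTP variant.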
## Viewing performance metrics
diff --git a/doc/administration/nfs.md b/doc/administration/nfs.md
index 2a2e9f05312..a0170e6c4ef 100644
--- a/doc/administration/nfs.md
+++ b/doc/administration/nfs.md
@@ -20,31 +20,31 @@ file system performance, see
## Gitaly and NFS deprecation
-Starting with GitLab version 14.0, support for NFS to store Git repository data will be deprecated. Technical customer support and engineering support will be available for the 14.x releases. Engineering will fix bugs and security vulnerabilities consistent with our [release and maintenance policy](../policy/maintenance.md#security-releases).
+Starting with GitLab version 14.0, support for NFS to store Git repository data is deprecated. Technical customer support and engineering support is available for the 14.x releases. Engineering is fixing bugs and security vulnerabilities consistent with our [release and maintenance policy](../policy/maintenance.md#security-releases).
-At the end of the 14.12 milestone (tenatively June 22nd, 2022) technical and engineering support for using NFS to store Git repository data will be officially at end-of-life. There will be no product changes or troubleshooting provided via Engineering, Security or Paid Support channels.
+At the end of the 14.12 milestone (tentatively June 22nd, 2022) technical and engineering support for using NFS to store Git repository data will be officially at end-of-life. There will be no product changes or troubleshooting provided via Engineering, Security or Paid Support channels.
For those customers still running earlier versions of GitLab, [our support eligibility and maintenance policy applies](https://about.gitlab.com/support/statement-of-support.html#version-support).
-For the 14.x releases, we will continue to help with Git related tickets from customers running one or more Gitaly servers with its data stored on NFS. Examples may include:
+For the 14.x releases, we continue to help with Git-related tickets from customers running one or more Gitaly servers with their data stored on NFS. Examples may include:
- Performance issues or timeouts accessing Git data
- Commits or branches vanish
- GitLab intermittently returns the wrong Git data (such as reporting that a repository has no branches)
-Assistance will be limited to activities like:
+Assistance is limited to activities like:
- Verifying developers' workflow uses features like protected branches
- Reviewing GitLab event data from the database to advise if it looks like a force push over-wrote branches
- Verifying that NFS client mount options match our [documented recommendations](#mount-options)
- Analyzing the GitLab Workhorse and Rails logs, and determining that `500` errors being seen in the environment are caused by slow responses from Gitaly
-GitLab support will be unable to continue with the investigation if:
+GitLab support is unable to continue with the investigation if:
- The date of the request is on or after the release of GitLab version 15.0, and
- Support Engineers and Management determine that all reasonable non-NFS root causes have been exhausted
-If the issue is reproducible, or if it happens intermittently but regularly, GitLab Support will investigate providing the issue reproduces without the use of NFS. In order to reproduce without NFS, the affected repositories should be migrated to a different Gitaly shard, such as Gitaly cluster or a standalone Gitaly VM, backed with block storage.
+If the issue is reproducible, or if it happens intermittently but regularly, GitLab Support can investigate, provided that the issue reproduces without the use of NFS. To reproduce without NFS, the affected repositories should be migrated to a different Gitaly shard, such as Gitaly cluster or a standalone Gitaly VM, backed with block storage.
### Why remove NFS for Git repository data
@@ -438,7 +438,7 @@ the file system access GitLab requires. Workloads where many small files are wri
a serialized manner, like `git`, are not well suited to cloud-based file systems.
If you do choose to use these, avoid storing GitLab log files (for example, those in `/var/log/gitlab`)
-there because this will also affect performance. We recommend that the log files be
+there because this also affects performance. We recommend that the log files be
stored on a local volume.
For more details on the experience of using a cloud-based file systems with GitLab,
@@ -447,12 +447,12 @@ see this [Commit Brooklyn 2019 video](https://youtu.be/K6OS8WodRBQ?t=313).
### Avoid using CephFS and GlusterFS
GitLab strongly recommends against using CephFS and GlusterFS.
-These distributed file systems are not well-suited for the GitLab input/output access patterns because Git uses many small files and access times and file locking times to propagate will make Git activity very slow.
+These distributed file systems are not well-suited for the GitLab input/output access patterns because Git uses many small files, and the time needed to propagate access times and file locks makes Git activity very slow.
### Avoid using PostgreSQL with NFS
GitLab strongly recommends against running your PostgreSQL database
-across NFS. The GitLab support team will not be able to assist on performance issues related to
+across NFS. The GitLab support team is not able to assist on performance issues related to
this configuration.
Additionally, this configuration is specifically warned against in the
diff --git a/doc/administration/object_storage.md b/doc/administration/object_storage.md
index 8576b429213..c6490e365a5 100644
--- a/doc/administration/object_storage.md
+++ b/doc/administration/object_storage.md
@@ -281,6 +281,9 @@ The service account must have permission to access the bucket. Learn more
in Google's
[Cloud Storage authentication documentation](https://cloud.google.com/storage/docs/authentication).
+NOTE:
+Bucket encryption with the [Cloud Key Management Service (KMS)](https://cloud.google.com/kms/docs) is not supported and will result in [ETag mismatch errors](#etag-mismatch).
+
##### Google example (consolidated form)
For Omnibus installations, this is an example of the `connection` setting:
@@ -354,7 +357,7 @@ gitlab_rails['object_store']['connection'] = {
'provider' => 'AzureRM',
'azure_storage_account_name' => '<AZURE STORAGE ACCOUNT NAME>',
'azure_storage_access_key' => '<AZURE STORAGE ACCESS KEY>',
- 'azure_storage_domain' => '<AZURE STORAGE DOMAIN>',
+ 'azure_storage_domain' => '<AZURE STORAGE DOMAIN>'
}
```
@@ -682,6 +685,8 @@ With the consolidated object configuration and instance profile, Workhorse has
S3 credentials so that it can compute the `Content-MD5` header. This
eliminates the need to compare ETag headers returned from the S3 server.
+Encrypting buckets with GCS' [Cloud Key Management Service (KMS)](https://cloud.google.com/kms/docs) is not supported and will result in ETag mismatch errors.
+
### Using Amazon instance profiles
Instead of supplying AWS access and secret keys in object storage
diff --git a/doc/administration/operations/extra_sidekiq_processes.md b/doc/administration/operations/extra_sidekiq_processes.md
index 02cb7ad0bca..1c9b98041dc 100644
--- a/doc/administration/operations/extra_sidekiq_processes.md
+++ b/doc/administration/operations/extra_sidekiq_processes.md
@@ -263,9 +263,9 @@ This sets the concurrency (number of threads) for the Sidekiq process.
## Modify the check interval
-To modify the check interval for the additional Sidekiq processes:
+To modify `sidekiq-cluster`'s health check interval for the additional Sidekiq processes:
-1. Edit `/etc/gitlab/gitlab.rb` and add:
+1. Edit `/etc/gitlab/gitlab.rb` and add (the value can be any integer number of seconds):
```ruby
sidekiq['interval'] = 5
@@ -273,8 +273,6 @@ To modify the check interval for the additional Sidekiq processes:
1. Save the file and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
-This tells the additional processes how often to check for enqueued jobs.
-
## Troubleshoot using the CLI
WARNING:
diff --git a/doc/administration/operations/moving_repositories.md b/doc/administration/operations/moving_repositories.md
index 9cf7ac18c81..84e6dca1f2b 100644
--- a/doc/administration/operations/moving_repositories.md
+++ b/doc/administration/operations/moving_repositories.md
@@ -42,7 +42,12 @@ To move repositories into a [Gitaly Cluster](../gitaly/index.md#gitaly-cluster)
WARNING:
Repositories can be **permanently deleted** by a call to `/projects/:project_id/repository_storage_moves`
that attempts to move a project already stored in a Gitaly Cluster back into that cluster.
-See [this issue for more details](https://gitlab.com/gitlab-org/gitaly/-/issues/3752).
+See [this issue for more details](https://gitlab.com/gitlab-org/gitaly/-/issues/3752). This was fixed in
+GitLab 14.3.0 and backported to
+[14.2.4](https://about.gitlab.com/releases/2021/09/17/gitlab-14-2-4-released/),
+[14.1.6](https://about.gitlab.com/releases/2021/09/27/gitlab-14-1-6-released/),
+[14.0.11](https://about.gitlab.com/releases/2021/09/27/gitlab-14-0-11-released/), and
+[13.12.12](https://about.gitlab.com/releases/2021/09/22/gitlab-13-12-12-released/).
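To check whether a given release already contains the fix, the version list above can be turned into a small lookup. This is a sketch under assumptions: `has_storage_move_fix` is a hypothetical helper, and it expects a plain `MAJOR.MINOR.PATCH` string without suffixes such as `-ee`.

```python
# Minimum patch level that contains the backported fix, per release series.
BACKPORTED_FIXES = {(14, 2): 4, (14, 1): 6, (14, 0): 11, (13, 12): 12}


def has_storage_move_fix(version):
    """Return True if `version` is 14.3.0 or later, or a patched backport release."""
    major, minor, patch = (int(part) for part in version.split(".")[:3])
    if (major, minor) >= (14, 3):
        return True
    minimum_patch = BACKPORTED_FIXES.get((major, minor))
    return minimum_patch is not None and patch >= minimum_patch
```

For example, `has_storage_move_fix("14.2.3")` is `False`, while `has_storage_move_fix("14.2.4")` is `True`.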
Each repository is made read-only for the duration of the move. The repository is not writable
until the move has completed.
diff --git a/doc/administration/operations/puma.md b/doc/administration/operations/puma.md
index f1f02b606f5..c7df8249ae4 100644
--- a/doc/administration/operations/puma.md
+++ b/doc/administration/operations/puma.md
@@ -113,7 +113,7 @@ is used when Puma is enabled.
NOTE:
Unlike Unicorn, the `puma['worker_timeout']` setting does not set the maximum request duration.
-To change the worker timeout:
+To change the worker timeout to 600 seconds:
1. Edit `/etc/gitlab/gitlab.rb`:
diff --git a/doc/administration/package_information/deprecated_os.md b/doc/administration/package_information/deprecated_os.md
index 7234d68e4b2..1f6fe0fce5d 100644
--- a/doc/administration/package_information/deprecated_os.md
+++ b/doc/administration/package_information/deprecated_os.md
@@ -1,83 +1,9 @@
---
-stage: Enablement
-group: Distribution
-info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
+redirect_to: 'supported_os.md'
+remove_date: '2022-02-18'
---
-# OS Versions that are no longer supported **(FREE SELF)**
+This document was moved to [another location](supported_os.md).
-GitLab provides omnibus packages for operating systems only until their
-EOL (End-Of-Life). After the EOL date of the OS, GitLab will stop releasing
-official packages. The list of deprecated operating systems and the final GitLab
-release for them can be found below:
-
-| OS Version | End Of Life | Last supported GitLab version |
-| --------------- | ---------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| Raspbian Wheezy | [May 2015](https://downloads.raspberrypi.org/raspbian/images/raspbian-2015-05-07/) | [GitLab CE](https://packages.gitlab.com/app/gitlab/raspberry-pi2/search?q=gitlab-ce_8.17&dist=debian%2Fwheezy) 8.17 |
-| OpenSUSE 13.2 | [January 2017](https://en.opensuse.org/Lifetime#Discontinued_distributions) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce-9.1&dist=opensuse%2F13.2) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee-9.1&dist=opensuse%2F13.2) 9.1 |
-| Ubuntu 12.04 | [April 2017](https://ubuntu.com/info/release-end-of-life) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce_9.1&dist=ubuntu%2Fprecise) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee_9.1&dist=ubuntu%2Fprecise) 9.1 |
-| OpenSUSE 42.1 | [May 2017](https://en.opensuse.org/Lifetime#Discontinued_distributions) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce-9.3&dist=opensuse%2F42.1) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee-9.3&dist=opensuse%2F42.1) 9.3 |
-| OpenSUSE 42.2 | [January 2018](https://en.opensuse.org/Lifetime#Discontinued_distributions) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce-10.4&dist=opensuse%2F42.2) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee-10.4&dist=opensuse%2F42.2) 10.4 |
-| Debian Wheezy | [May 2018](https://www.debian.org/News/2018/20180601) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce_11.6&dist=debian%2Fwheezy) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee_11.6&dist=debian%2Fwheezy) 11.6 |
-| Raspbian Jessie | [May 2017](https://downloads.raspberrypi.org/raspbian/images/raspbian-2017-07-05/) | [GitLab CE](https://packages.gitlab.com/app/gitlab/raspberry-pi2/search?q=gitlab-ce_11.7&dist=debian%2Fjessie) 11.7 |
-| Ubuntu 14.04 | [April 2019](https://ubuntu.com/info/release-end-of-life) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce_11.10&dist=ubuntu%2Ftrusty) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee_11.10&dist=ubuntu%2Ftrusty) 11.10 |
-| OpenSUSE 42.3 | [July 2019](https://en.opensuse.org/Lifetime#Discontinued_distributions) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce-12.1&dist=opensuse%2F42.3) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee-12.1&dist=opensuse%2F42.3) 12.1 |
-| OpenSUSE 15.0 | [December 2019](https://en.opensuse.org/Lifetime#Discontinued_distributions) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce-12.5&dist=opensuse%2F15.0) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee-12.5&dist=opensuse%2F15.0) 12.5 |
-| Raspbian Stretch | [June 2020](https://downloads.raspberrypi.org/raspbian/images/raspbian-2019-04-09/) | [GitLab CE](https://packages.gitlab.com/app/gitlab/raspberry-pi2/search?q=gitlab-ce_13.2&dist=raspbian%2Fstretch) 13.3 |
-| Debian Jessie | [June 2020](https://www.debian.org/News/2020/20200709) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce_13.2&dist=debian%2Fjessie) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee_13.2&dist=debian%2Fjessie) 13.3 |
-| CentOS 6 | [November 2020](https://wiki.centos.org/About/Product) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=13.6&filter=all&filter=all&dist=el%2F6) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=13.6&filter=all&filter=all&dist=el%2F6) 13.6 |
-| OpenSUSE 15.1 | [November 2020](https://en.opensuse.org/Lifetime#Discontinued_distributions) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce-13.12&dist=opensuse%2F15.1) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee-13.12&dist=opensuse%2F15.2) 13.12 |
-| Ubuntu 16.04 | [April 2021](https://ubuntu.com/info/release-end-of-life) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce_13.12&dist=ubuntu%2Fxenial) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee_13.12&dist=ubuntu%2Fxenial) 13.12 |
-
-NOTE:
-An exception to this deprecation policy is when we are unable to provide
-packages for the next version of the operating system. The most common reason
-for this our package repository provider, Packagecloud, not supporting newer
-versions and hence we can't upload packages to it.
-
-## Update GitLab package sources after upgrading the OS
-
-After upgrading the Operating System (OS) as per its own documentation,
-it may be necessary to also update the GitLab package source URL
-in your package manager configuration.
-If your package manager reports that no further updates are available,
-although [new versions have been released](https://about.gitlab.com/releases/categories/releases/), repeat the
-"Add the GitLab package repository" instructions
-of the [Linux package install guide](https://about.gitlab.com/install/#content).
-Future GitLab upgrades will now be fetched according to your upgraded OS.
-
-## Supported Operating Systems
-
-GitLab officially supports LTS versions of operating systems. While OSs like
-Ubuntu have a clear distinction between LTS and non-LTS versions, there are
-other OSs, openSUSE for example, that don't follow the LTS concept. Hence to
-avoid confusion, the official policy is that at any point of time, all the
-operating systems supported by GitLab are listed in the [installation
-page](https://about.gitlab.com/install/).
-
-The following lists the currently supported OSs and their possible EOL dates.
-
-| OS Version | First supported GitLab version | Arch | OS EOL | Details |
-| ---------------- | ------------------------------ | --------------- | ------------- | ------------------------------------------------------------ |
-| CentOS 7 | GitLab CE / GitLab EE 7.10.0 | x86_64 | June 2024 | <https://wiki.centos.org/About/Product> |
-| CentOS 8 | GitLab CE / GitLab EE 12.8.1 | x86_64, aarch64 | Dec 2021 | <https://wiki.centos.org/About/Product> |
-| Debian 9 | GitLab CE / GitLab EE 9.3.0 | amd64 | 2022 | <https://wiki.debian.org/DebianReleases#Production_Releases> |
-| Debian 10 | GitLab CE / GitLab EE 12.2.0 | amd64, arm64 | TBD | <https://wiki.debian.org/DebianReleases#Production_Releases> |
-| OpenSUSE 15.2 | GitLab CE / GitLab EE 13.11.0 | x86_64, aarch64 | Dec 2021 | <https://en.opensuse.org/Lifetime> |
-| OpenSUSE 15.3 | GitLab CE / GitLab EE 14.5.0 | x86_64, aarch64 | Nov 2022 | <https://en.opensuse.org/Lifetime> |
-| SLES 12 | GitLab EE 9.0.0 | x86_64 | Oct 2027 | <https://www.suse.com/lifecycle/> |
-| Ubuntu 18.04 | GitLab CE / GitLab EE 10.7.0 | amd64 | April 2023 | <https://wiki.ubuntu.com/Releases> |
-| Ubuntu 20.04 | GitLab CE / GitLab EE 13.2.0 | amd64, arm64 | April 2025 | <https://wiki.ubuntu.com/Releases> |
-| Raspbian Buster | GitLab CE 12.2.0 | armhf | 2022 | <https://wiki.debian.org/DebianReleases#Production_Releases> |
-
-### Packages for ARM64
-
-> [Introduced](https://gitlab.com/gitlab-org/gitlab-omnibus-builder/-/issues/27) in GitLab 13.4.
-
-GitLab provides arm64/aarch64 packages for some supported operating systems.
-You can see if your operating system architecture is supported in the table
-above.
-
-WARNING:
-There are currently still some [known issues and limitation](https://gitlab.com/groups/gitlab-org/-/epics/4397)
-running GitLab on ARM.
+<!-- This redirect file can be deleted after <2022-02-18>. -->
+<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
diff --git a/doc/administration/package_information/deprecation_policy.md b/doc/administration/package_information/deprecation_policy.md
index d45c2ea3127..905de387dcb 100644
--- a/doc/administration/package_information/deprecation_policy.md
+++ b/doc/administration/package_information/deprecation_policy.md
@@ -16,18 +16,18 @@ setup, various configuration requires removal.
### Policy
-The Omnibus GitLab package will retain configuration for at least **one major**
-version. We cannot guarantee that deprecated configuration
-will be available in the next major release. See [example](#example) for more details.
+The Omnibus GitLab package retains configuration for at least **one major**
+version. We can't guarantee that deprecated configuration
+is available in the next major release. See [example](#example) for more details.
### Notice
-If the configuration becomes obsolete, we will announce the deprecation:
+If the configuration becomes obsolete, we announce the deprecation:
- via release blog post on `https://about.gitlab.com/blog/`. The blog post item
- will contain the deprecation notice together with the target removal date.
+ contains the deprecation notice together with the target removal date.
- via installation/reconfigure output (if applicable).
-- via official documentation on `https://docs.gitlab.com/`. The documentation update will contain the corrected syntax (if applicable) or a date of configuration removal.
+- via official documentation on `https://docs.gitlab.com/`. The documentation update contains the corrected syntax (if applicable) or a date of configuration removal.
### Procedure
@@ -82,16 +82,16 @@ The final comment in the issue **has to have**:
## Example
-User configuration available in `/etc/gitlab/gitlab.rb` was introduced in GitLab version 10.0, `gitlab_rails['configuration'] = true`. In GitLab version 10.4.0, a new change was introduced that requires rename of this configuration option. New configuration option is `gitlab_rails['better_configuration'] = true`. Development team will translate the old configuration into new one
-and trigger a deprecation procedure.
+The user configuration `gitlab_rails['configuration'] = true`, available in `/etc/gitlab/gitlab.rb`, was introduced in GitLab version 10.0. In GitLab version 10.4.0, a change was introduced that requires renaming this configuration option. The new configuration option is `gitlab_rails['better_configuration'] = true`. The development team translates the old configuration into the new one
+and triggers a deprecation procedure.
This means that these two configuration
-options will both be valid through GitLab version 10. In other words,
+options are valid through GitLab version 10. In other words,
if you still have `gitlab_rails['configuration'] = true` set in GitLab 10.8.0
-the feature will continue working the same way as if you had `gitlab_rails['better_configuration'] = true` set.
-However, setting the old version of configuration will print out a deprecation
+the feature continues working the same way as if you had `gitlab_rails['better_configuration'] = true` set.
+However, setting the old version of the configuration prints out a deprecation
notice at the end of installation/upgrade/reconfigure run.
-With GitLab 11, `gitlab_rails['configuration'] = true` will no longer work and you will have to manually change the configuration in `/etc/gitlab/gitlab.rb` to the new valid configuration.
+In GitLab 11, `gitlab_rails['configuration'] = true` no longer works and you must manually change the configuration in `/etc/gitlab/gitlab.rb` to the new valid configuration.
**Note** If this configuration option is sensitive and can put integrity of the installation or
-data in danger, installation/upgrade will be aborted.
+data in danger, the installation or upgrade is aborted.
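
Before upgrading across a major version, the rename described in this example could be checked with a small script. This is a minimal sketch: the function name `uses_deprecated_option` is hypothetical, and both option names are the illustrative ones from the example above, not real GitLab settings.

```shell
# Hypothetical check: succeed if the given gitlab.rb still sets the old
# (illustrative) option name on a line that is not commented out.
uses_deprecated_option() {
  [ -r "$1" ] && grep -Eq "^[[:space:]]*gitlab_rails\['configuration'\]" "$1"
}

if uses_deprecated_option /etc/gitlab/gitlab.rb; then
  echo "Rename gitlab_rails['configuration'] to gitlab_rails['better_configuration']"
fi
```

The check ignores commented-out lines because the pattern only allows whitespace before the option name.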
diff --git a/doc/administration/package_information/index.md b/doc/administration/package_information/index.md
index 12f3274ecab..ab4b1edfa30 100644
--- a/doc/administration/package_information/index.md
+++ b/doc/administration/package_information/index.md
@@ -18,7 +18,7 @@ The released package versions are in the format `MAJOR.MINOR.PATCH-EDITION.OMNIB
|-------------------|---------|---------|
| MAJOR.MINOR.PATCH | The GitLab version this corresponds to. | 13.3.0 |
| EDITION | The edition of GitLab this corresponds to. | ee |
-| OMNIBUS_RELEASE | The Omnibus GitLab release. Usually, this will be 0. This is incremented if we need to build a new package without changing the GitLab version. | 0 |
+| OMNIBUS_RELEASE | The Omnibus GitLab release. Usually, this is 0. This is incremented if we need to build a new package without changing the GitLab version. | 0 |
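
As a minimal sketch of how the `MAJOR.MINOR.PATCH-EDITION.OMNIBUS_RELEASE` format splits into its components, the following shell function uses only parameter expansion. The function name `parse_omnibus_version` and the sample string `13.3.0-ee.0` are assumptions assembled from the example column above, not part of the package tooling.

```shell
# Hypothetical helper: split a package version such as 13.3.0-ee.0
# into the three components described in the table above.
parse_omnibus_version() {
  version="$1"
  gitlab_version="${version%%-*}"   # text before the first "-": MAJOR.MINOR.PATCH
  suffix="${version#*-}"            # text after the first "-": EDITION.OMNIBUS_RELEASE
  edition="${suffix%%.*}"           # text before the first "." of the suffix
  omnibus_release="${suffix#*.}"    # text after the first "." of the suffix
  printf '%s %s %s\n' "$gitlab_version" "$edition" "$omnibus_release"
}

parse_omnibus_version "13.3.0-ee.0"   # prints: 13.3.0 ee 0
```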
## Licenses
@@ -27,7 +27,7 @@ See [licensing](licensing.md)
## Defaults
The Omnibus GitLab package requires various configuration to get the components
-in working order. If the configuration is not provided, the package will use
+in working order. If the configuration is not provided, the package uses
the default values assumed in the package.
These defaults are noted in the package [defaults document](defaults.md).
@@ -59,8 +59,8 @@ accidental overwrite of user configuration provided in `/etc/gitlab/gitlab.rb`.
New configuration options are noted in the
[`gitlab.rb.template` file](https://gitlab.com/gitlab-org/omnibus-gitlab/raw/master/files/gitlab-config-template/gitlab.rb.template).
-The Omnibus GitLab package also provides convenience command which will
-compare the existing user configuration with the latest version of the
+The Omnibus GitLab package also provides a convenience command which
+compares the existing user configuration with the latest version of the
template contained in the package.
To view a diff between your configuration file and the latest version, run:
@@ -76,7 +76,7 @@ characters on each line.
## Init system detection
-Omnibus GitLab will attempt to query the underlaying system in order to
+Omnibus GitLab attempts to query the underlying system in order to
check which init system it uses.
This manifests itself as a `WARNING` during the `sudo gitlab-ctl reconfigure`
run.
diff --git a/doc/administration/package_information/supported_os.md b/doc/administration/package_information/supported_os.md
new file mode 100644
index 00000000000..fcc2fef3e63
--- /dev/null
+++ b/doc/administration/package_information/supported_os.md
@@ -0,0 +1,90 @@
+---
+stage: Enablement
+group: Distribution
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Supported operating systems **(FREE SELF)**
+
+GitLab officially supports LTS versions of operating systems. While OSs like
+Ubuntu have a clear distinction between LTS and non-LTS versions, there are
+other OSs, openSUSE for example, that don't follow the LTS concept. Hence to
+avoid confusion, the official policy is that at any point in time, all the
+operating systems supported by GitLab are listed in the [installation
+page](https://about.gitlab.com/install/).
+
+The following lists the currently supported OSs and their possible EOL dates.
+
+| OS Version | First supported GitLab version | Arch | OS EOL | Details |
+| ---------------- | ------------------------------ | --------------- | ------------- | ------------------------------------------------------------ |
+| CentOS 7 | GitLab CE / GitLab EE 7.10.0 | x86_64 | June 2024 | <https://wiki.centos.org/About/Product> |
+| CentOS 8 | GitLab CE / GitLab EE 12.8.1 | x86_64, aarch64 | Dec 2021 | <https://wiki.centos.org/About/Product> |
+| Debian 9 | GitLab CE / GitLab EE 9.3.0 | amd64 | 2022 | <https://wiki.debian.org/LTS> |
+| Debian 10 | GitLab CE / GitLab EE 12.2.0 | amd64, arm64 | 2024 | <https://wiki.debian.org/LTS> |
+| Debian 11 | GitLab CE / GitLab EE 14.6.0 | amd64, arm64 | 2026 | <https://wiki.debian.org/LTS> |
+| OpenSUSE 15.2 | GitLab CE / GitLab EE 13.11.0 | x86_64, aarch64 | Dec 2021 | <https://en.opensuse.org/Lifetime> |
+| OpenSUSE 15.3 | GitLab CE / GitLab EE 14.5.0 | x86_64, aarch64 | Nov 2022 | <https://en.opensuse.org/Lifetime> |
+| SLES 12 | GitLab EE 9.0.0 | x86_64 | Oct 2027 | <https://www.suse.com/lifecycle/> |
+| Ubuntu 18.04 | GitLab CE / GitLab EE 10.7.0 | amd64 | April 2023 | <https://wiki.ubuntu.com/Releases> |
+| Ubuntu 20.04 | GitLab CE / GitLab EE 13.2.0 | amd64, arm64 | April 2025 | <https://wiki.ubuntu.com/Releases> |
+| Raspbian Buster | GitLab CE 12.2.0 | armhf | 2022 | <https://wiki.debian.org/DebianReleases#Production_Releases> |
+
+NOTE:
+CentOS 8 will be EOL on December 31, 2021. In GitLab 14.5 and later,
+[CentOS builds work in AlmaLinux](https://gitlab.com/gitlab-org/distribution/team-tasks/-/issues/954#note_730198505).
+We will officially support all distributions that are binary compatible with Red Hat Enterprise Linux.
+This gives users a path forward for their CentOS 8 builds at its end of life.
+
+## Update GitLab package sources after upgrading the OS
+
+After upgrading the Operating System (OS) as per its own documentation,
+it may be necessary to also update the GitLab package source URL
+in your package manager configuration.
+If your package manager reports that no further updates are available,
+although [new versions have been released](https://about.gitlab.com/releases/categories/releases/), repeat the
+"Add the GitLab package repository" instructions
+of the [Linux package install guide](https://about.gitlab.com/install/#content).
+Future GitLab upgrades will now be fetched according to your upgraded OS.
+
+## Packages for ARM64
+
+> [Introduced](https://gitlab.com/gitlab-org/gitlab-omnibus-builder/-/issues/27) in GitLab 13.4.
+
+GitLab provides arm64/aarch64 packages for some supported operating systems.
+You can see if your operating system architecture is supported in the table
+above.
+
+WARNING:
+There are currently still some [known issues and limitations](https://gitlab.com/groups/gitlab-org/-/epics/4397)
+running GitLab on ARM.
+
+## OS Versions that are no longer supported
+
+GitLab provides omnibus packages for operating systems only until their
+EOL (End-Of-Life). After the EOL date of the OS, GitLab will stop releasing
+official packages. The list of deprecated operating systems and the final GitLab
+release for them can be found below:
+
+| OS Version | End Of Life | Last supported GitLab version |
+| --------------- | ---------------------------------------------------------------------------------- | -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| Raspbian Wheezy | [May 2015](https://downloads.raspberrypi.org/raspbian/images/raspbian-2015-05-07/) | [GitLab CE](https://packages.gitlab.com/app/gitlab/raspberry-pi2/search?q=gitlab-ce_8.17&dist=debian%2Fwheezy) 8.17 |
+| OpenSUSE 13.2 | [January 2017](https://en.opensuse.org/Lifetime#Discontinued_distributions) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce-9.1&dist=opensuse%2F13.2) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee-9.1&dist=opensuse%2F13.2) 9.1 |
+| Ubuntu 12.04 | [April 2017](https://ubuntu.com/info/release-end-of-life) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce_9.1&dist=ubuntu%2Fprecise) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee_9.1&dist=ubuntu%2Fprecise) 9.1 |
+| OpenSUSE 42.1 | [May 2017](https://en.opensuse.org/Lifetime#Discontinued_distributions) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce-9.3&dist=opensuse%2F42.1) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee-9.3&dist=opensuse%2F42.1) 9.3 |
+| OpenSUSE 42.2 | [January 2018](https://en.opensuse.org/Lifetime#Discontinued_distributions) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce-10.4&dist=opensuse%2F42.2) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee-10.4&dist=opensuse%2F42.2) 10.4 |
+| Debian Wheezy | [May 2018](https://www.debian.org/News/2018/20180601) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce_11.6&dist=debian%2Fwheezy) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee_11.6&dist=debian%2Fwheezy) 11.6 |
+| Raspbian Jessie | [May 2017](https://downloads.raspberrypi.org/raspbian/images/raspbian-2017-07-05/) | [GitLab CE](https://packages.gitlab.com/app/gitlab/raspberry-pi2/search?q=gitlab-ce_11.7&dist=debian%2Fjessie) 11.7 |
+| Ubuntu 14.04 | [April 2019](https://ubuntu.com/info/release-end-of-life) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce_11.10&dist=ubuntu%2Ftrusty) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee_11.10&dist=ubuntu%2Ftrusty) 11.10 |
+| OpenSUSE 42.3 | [July 2019](https://en.opensuse.org/Lifetime#Discontinued_distributions) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce-12.1&dist=opensuse%2F42.3) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee-12.1&dist=opensuse%2F42.3) 12.1 |
+| OpenSUSE 15.0 | [December 2019](https://en.opensuse.org/Lifetime#Discontinued_distributions) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce-12.5&dist=opensuse%2F15.0) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee-12.5&dist=opensuse%2F15.0) 12.5 |
+| Raspbian Stretch | [June 2020](https://downloads.raspberrypi.org/raspbian/images/raspbian-2019-04-09/) | [GitLab CE](https://packages.gitlab.com/app/gitlab/raspberry-pi2/search?q=gitlab-ce_13.2&dist=raspbian%2Fstretch) 13.3 |
+| Debian Jessie | [June 2020](https://www.debian.org/News/2020/20200709) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce_13.2&dist=debian%2Fjessie) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee_13.2&dist=debian%2Fjessie) 13.3 |
+| CentOS 6 | [November 2020](https://wiki.centos.org/About/Product) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=13.6&filter=all&filter=all&dist=el%2F6) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=13.6&filter=all&filter=all&dist=el%2F6) 13.6 |
+| OpenSUSE 15.1 | [November 2020](https://en.opensuse.org/Lifetime#Discontinued_distributions) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce-13.12&dist=opensuse%2F15.1) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee-13.12&dist=opensuse%2F15.2) 13.12 |
+| Ubuntu 16.04 | [April 2021](https://ubuntu.com/info/release-end-of-life) | [GitLab CE](https://packages.gitlab.com/app/gitlab/gitlab-ce/search?q=gitlab-ce_13.12&dist=ubuntu%2Fxenial) / [GitLab EE](https://packages.gitlab.com/app/gitlab/gitlab-ee/search?q=gitlab-ee_13.12&dist=ubuntu%2Fxenial) 13.12 |
+
+NOTE:
+An exception to this deprecation policy is when we are unable to provide
+packages for the next version of the operating system. The most common reason
+for this is our package repository provider, PackageCloud, not supporting newer
+versions and hence we can't upload packages to it.
diff --git a/doc/administration/packages/container_registry.md b/doc/administration/packages/container_registry.md
index 7e711bb5740..0877fe510de 100644
--- a/doc/administration/packages/container_registry.md
+++ b/doc/administration/packages/container_registry.md
@@ -436,7 +436,7 @@ To configure the `s3` storage driver in Omnibus:
```
If using with an [AWS S3 VPC endpoint](https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-s3.html),
- then set `regionendpoint` to your VPC endpoint address and set `path_style` to false:
+ then set `regionendpoint` to your VPC endpoint address and set `pathstyle` to false:
```ruby
registry['storage'] = {
@@ -446,7 +446,7 @@ To configure the `s3` storage driver in Omnibus:
'bucket' => 'your-s3-bucket',
'region' => 'your-s3-region',
'regionendpoint' => 'your-s3-vpc-endpoint',
- 'path_style' => false
+ 'pathstyle' => false
}
}
```
@@ -454,7 +454,7 @@ To configure the `s3` storage driver in Omnibus:
- `regionendpoint` is only required when configuring an S3 compatible service such as MinIO, or
when using an AWS S3 VPC Endpoint.
- `your-s3-bucket` should be the name of a bucket that exists, and can't include subdirectories.
- - `path_style` should be set to true to use `host/bucket_name/object` style paths instead of
+ - `pathstyle` should be set to true to use `host/bucket_name/object` style paths instead of
`bucket_name.host/object`. [Set to false for AWS S3](https://aws.amazon.com/blogs/aws/amazon-s3-path-deprecation-plan-the-rest-of-the-story/).
1. Save the file and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
@@ -484,6 +484,12 @@ storage:
#### Migrate to object storage without downtime
+WARNING:
+Using [AWS DataSync](https://aws.amazon.com/datasync/)
+to copy the registry data to or between S3 buckets creates invalid metadata objects in the bucket.
+For additional details, see [Tags with an empty name](#tags-with-an-empty-name).
+To move data to and between S3 buckets, the AWS CLI `sync` operation is recommended.
+
To migrate storage without stopping the Container Registry, set the Container Registry
to read-only mode. On large instances, this may require the Container Registry
to be in read-only mode for a while. During this time,
@@ -860,7 +866,7 @@ To remove image tags by running the cleanup policy, run the following commands i
# Numeric ID of the project whose container registry should be cleaned up
P = <project_id>
-# Numeric ID of a developer, maintainer or owner in that project
+# Numeric ID of a user with Developer, Maintainer, or Owner role for the project
U = <user_id>
# Get required details / objects
@@ -873,7 +879,7 @@ project.container_repositories.find_each do |repo|
puts repo.attributes
# Start the tag cleanup
- puts Projects::ContainerRepository::CleanupTagsService.new(project, user, policy.attributes.except("created_at", "updated_at")).execute(repo)
+ puts Projects::ContainerRepository::CleanupTagsService.new(repo, user, policy.attributes.except("created_at", "updated_at")).execute()
end
```
@@ -888,7 +894,7 @@ GitLab offers a set of APIs to manipulate the Container Registry and aid the pro
of removing unused tags. Currently, this is exposed using the API, but in the future,
these controls should migrate to the GitLab interface.
-Project maintainers can
+Users who have the [Maintainer role](../../user/permissions.md) for the project can
[delete Container Registry tags in bulk](../../api/container_registry.md#delete-registry-repository-tags-in-bulk)
periodically based on their own criteria, however, this alone does not recycle data,
it only unlinks tags from manifests and image blobs. To recycle the Container
@@ -1072,6 +1078,19 @@ PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
You may want to add the `-m` flag to [remove untagged manifests and unreferenced layers](#removing-untagged-manifests-and-unreferenced-layers).
+### Stop garbage collection
+
+If you anticipate stopping garbage collection, you should manually run garbage collection as
+described in [Performing garbage collection without downtime](#performing-garbage-collection-without-downtime).
+You can then stop garbage collection by pressing <kbd>Control</kbd>+<kbd>C</kbd>.
+
+Otherwise, interrupting `gitlab-ctl` could leave your registry service in a down state. In this
+case, you must find the [garbage collection process](https://gitlab.com/gitlab-org/omnibus-gitlab/-/blob/master/files/gitlab-ctl-commands/registry_garbage_collect.rb#L26-35)
+itself on the system so that the `gitlab-ctl` command can bring the registry service back up again.
+
+Also, there's no way to save progress or results during the mark phase of the process. Only once
+blobs start being deleted is anything permanent done.
+
## Configuring GitLab and Registry to run on separate nodes (Omnibus GitLab)
By default, the package assumes that both services are running on the same node.
@@ -1080,28 +1099,28 @@ is necessary for Registry and GitLab.
### Configuring Registry
-Below you will find configuration options you should set in `/etc/gitlab/gitlab.rb`,
+Below you can find configuration options you should set in `/etc/gitlab/gitlab.rb`,
for Registry to run separately from GitLab:
- `registry['registry_http_addr']`, default [set programmatically](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/10-3-stable/files/gitlab-cookbooks/gitlab/libraries/registry.rb#L50). Needs to be reachable by web server (or LB).
- `registry['token_realm']`, default [set programmatically](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/10-3-stable/files/gitlab-cookbooks/gitlab/libraries/registry.rb#L53). Specifies the endpoint to use to perform authentication, usually the GitLab URL.
This endpoint needs to be reachable by user.
- `registry['http_secret']`, [random string](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/10-3-stable/files/gitlab-cookbooks/gitlab/libraries/registry.rb#L32). A random piece of data used to sign state that may be stored with the client to protect against tampering.
-- `registry['internal_key']`, default [automatically generated](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/10-3-stable/files/gitlab-cookbooks/gitlab/recipes/gitlab-rails.rb#L113-119). Contents of the key that GitLab uses to sign the tokens. They key gets created on the Registry server, but it won't be used there.
-- `gitlab_rails['registry_key_path']`, default [set programmatically](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/10-3-stable/files/gitlab-cookbooks/gitlab/recipes/gitlab-rails.rb#L35). This is the path where `internal_key` contents will be written to disk.
+- `registry['internal_key']`, default [automatically generated](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/10-3-stable/files/gitlab-cookbooks/gitlab/recipes/gitlab-rails.rb#L113-119). Contents of the key that GitLab uses to sign the tokens. The key gets created on the Registry server, but it is not used there.
+- `gitlab_rails['registry_key_path']`, default [set programmatically](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/10-3-stable/files/gitlab-cookbooks/gitlab/recipes/gitlab-rails.rb#L35). This is the path where `internal_key` contents are written to disk.
- `registry['internal_certificate']`, default [automatically generated](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/10-3-stable/files/gitlab-cookbooks/registry/recipes/enable.rb#L60-66). Contents of the certificate that GitLab uses to sign the tokens.
- `registry['rootcertbundle']`, default [set programmatically](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/10-3-stable/files/gitlab-cookbooks/registry/recipes/enable.rb#L60). Path to certificate. This is the path where `internal_certificate`
- contents will be written to disk.
+ contents are written to disk.
- `registry['health_storagedriver_enabled']`, default [set programmatically](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/10-7-stable/files/gitlab-cookbooks/gitlab/libraries/registry.rb#L88). Configure whether health checks on the configured storage driver are enabled.
- `gitlab_rails['registry_issuer']`, [default value](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/10-3-stable/files/gitlab-cookbooks/gitlab/attributes/default.rb#L153). This setting needs to be set the same between Registry and GitLab.
### Configuring GitLab
-Below you will find configuration options you should set in `/etc/gitlab/gitlab.rb`,
+Below you can find configuration options you should set in `/etc/gitlab/gitlab.rb`,
for GitLab to run separately from Registry:
-- `gitlab_rails['registry_enabled']`, must be set to `true`. This setting will
- signal to GitLab that it should allow Registry API requests.
+- `gitlab_rails['registry_enabled']`, must be set to `true`. This setting
+ signals to GitLab that it should allow Registry API requests.
- `gitlab_rails['registry_api_url']`, default [set programmatically](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/10-3-stable/files/gitlab-cookbooks/gitlab/libraries/registry.rb#L52). This is the Registry URL used internally that users do not need to interact with, `registry['registry_http_addr']` with scheme.
- `gitlab_rails['registry_host']`, for example `registry.gitlab.example`. Registry endpoint without the scheme, the address that gets shown to the end user.
- `gitlab_rails['registry_port']`. Registry endpoint port, visible to the end user.
@@ -1257,7 +1276,7 @@ Check which files are in use:
enabled: true
host: gitlab.company.com
port: 4567
- api_url: http://127.0.0.1:5000 # internal address to the registry, will be used by GitLab to directly communicate with API
+ api_url: http://127.0.0.1:5000 # internal address to the registry, used by GitLab to communicate directly with the API
path: /var/opt/gitlab/gitlab-rails/shared/registry
--> key: /var/opt/gitlab/gitlab-rails/etc/gitlab-registry.key
issuer: omnibus-gitlab-issuer
@@ -1501,6 +1520,28 @@ The most straightforward option is to pull those images and push them once again
using a Docker client version above v1.12. Docker converts images automatically before pushing them
to the registry. Once done, all your v1 images should now be available as v2 images.
+### Tags with an empty name
+
+If using [AWS DataSync](https://aws.amazon.com/datasync/)
+to copy the registry data to or between S3 buckets, an empty metadata object is created in the root
+path of each container repository in the destination bucket. This causes the registry to interpret
+such files as a tag that appears with no name in the GitLab UI and API. For more information, see
+[this issue](https://gitlab.com/gitlab-org/container-registry/-/issues/341).
+
+To fix this you can do one of two things:
+
+- Use the AWS CLI [`rm`](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3/rm.html)
+ command to remove the empty objects from the root of **each** affected repository. Pay special
+ attention to the trailing `/` and make sure **not** to use the `--recursive` option:
+
+ ```shell
+ aws s3 rm s3://<bucket>/docker/registry/v2/repositories/<path to repository>/
+ ```
+
+- Use the AWS CLI [`sync`](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3/sync.html)
+ command to copy the registry data to a new bucket and configure the registry to use it. This
+ leaves the empty objects behind.
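
The second option could be wrapped as in the following sketch. The function name `sync_registry_bucket` and the bucket names are placeholders. Any extra arguments are passed through to `aws s3 sync`, so `--dryrun` can be used to list the operations without performing them.

```shell
# Sketch only: copy registry data from one S3 bucket to another with the AWS CLI.
# The function name and both bucket arguments are placeholders.
sync_registry_bucket() {
  source_bucket="$1"
  destination_bucket="$2"
  shift 2
  # Remaining arguments (for example --dryrun) are forwarded unchanged.
  aws s3 sync "s3://${source_bucket}" "s3://${destination_bucket}" "$@"
}

# Example calls (bucket names are hypothetical):
# sync_registry_bucket old-registry-bucket new-registry-bucket --dryrun
# sync_registry_bucket old-registry-bucket new-registry-bucket
```

Running the preview first makes it possible to review what would be copied before the registry is reconfigured to use the new bucket.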
+
### Advanced Troubleshooting
We use a concrete example to illustrate how to
diff --git a/doc/administration/packages/index.md b/doc/administration/packages/index.md
index 90f2d9127fe..eea4964efbe 100644
--- a/doc/administration/packages/index.md
+++ b/doc/administration/packages/index.md
@@ -218,8 +218,8 @@ We recommend using the [consolidated object storage settings](../object_storage.
### Migrating local packages to object storage
-After [configuring the object storage](#using-object-storage), you may use the
-following task to migrate existing packages from the local storage to the remote one.
+After [configuring the object storage](#using-object-storage), use the following task to
+migrate existing packages from the local storage to the remote storage.
The processing is done in a background worker and requires **no downtime**.
For Omnibus GitLab:
@@ -234,11 +234,13 @@ For installations from source:
RAILS_ENV=production sudo -u git -H bundle exec rake gitlab:packages:migrate
```
-You can optionally track progress and verify that all packages migrated successfully.
+You can optionally track progress and verify that all packages migrated successfully using the
+[PostgreSQL console](https://docs.gitlab.com/omnibus/settings/database.html#connecting-to-the-bundled-postgresql-database):
-From the [PostgreSQL console](https://docs.gitlab.com/omnibus/settings/database.html#connecting-to-the-bundled-postgresql-database)
-(`sudo gitlab-psql -d gitlabhq_production` for Omnibus GitLab), verify that `objectstg` below (where
-`file_store=2`) has the count of all packages:
+- `sudo gitlab-rails dbconsole` for Omnibus GitLab instances.
+- `sudo -u git -H psql -d gitlabhq_production` for source-installed instances.
+
+Verify that `objectstg` below (where `file_store=2`) has the count of all packages:
```shell
gitlabhq_production=# SELECT count(*) AS total, sum(case when file_store = '1' then 1 else 0 end) AS filesystem, sum(case when file_store = '2' then 1 else 0 end) AS objectstg FROM packages_package_files;
@@ -247,3 +249,9 @@ total | filesystem | objectstg
------+------------+-----------
34 | 0 | 34
```
+
+Verify that there are no files on disk in the `packages` folder:
+
+```shell
+sudo find /var/opt/gitlab/gitlab-rails/shared/packages -type f | grep -v tmp | wc -l
+```
diff --git a/doc/administration/pages/index.md b/doc/administration/pages/index.md
index 163eb5388b6..f3ad474771c 100644
--- a/doc/administration/pages/index.md
+++ b/doc/administration/pages/index.md
@@ -56,11 +56,11 @@ Before proceeding with the Pages configuration, you must:
| `gitlab.example.com` | `pages.example.com` | **{check-circle}** Yes |
1. Configure a **wildcard DNS record**.
-1. (Optional) Have a **wildcard certificate** for that domain if you decide to
+1. Optional. Have a **wildcard certificate** for that domain if you decide to
serve Pages under HTTPS.
-1. (Optional but recommended) Enable [Shared runners](../../ci/runners/index.md)
+1. Optional but recommended. Enable [Shared runners](../../ci/runners/index.md)
so that your users don't have to bring their own.
-1. (Only for custom domains) Have a **secondary IP**.
+1. For custom domains, have a **secondary IP**.
NOTE:
If your GitLab instance and the Pages daemon are deployed in a private network or behind a firewall, your GitLab Pages websites are only accessible to devices/users that have access to the private network.
@@ -144,7 +144,8 @@ The Pages daemon doesn't listen to the outside world.
1. Set the external URL for GitLab Pages in `/etc/gitlab/gitlab.rb`:
```ruby
- pages_external_url 'http://example.io'
+ external_url "http://gitlab.example.com" # external_url here is only for reference
+ pages_external_url "http://pages.example.com" # not a subdomain of external_url
```
1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
@@ -169,7 +170,8 @@ outside world.
1. In `/etc/gitlab/gitlab.rb` specify the following configuration:
```ruby
- pages_external_url 'https://example.io'
+ external_url "https://gitlab.example.com" # external_url here is only for reference
+ pages_external_url "https://pages.example.com" # not a subdomain of external_url
pages_nginx['redirect_http_to_https'] = true
```
@@ -256,7 +258,6 @@ control over how the Pages daemon runs and serves content in your environment.
| `pages_path` | The directory on disk where pages are stored, defaults to `GITLAB-RAILS/shared/pages`. |
| **`pages_nginx[]`** | |
| `enable` | Include a virtual host `server{}` block for Pages inside NGINX. Needed for NGINX to proxy traffic back to the Pages daemon. Set to `false` if the Pages daemon should directly receive all requests, for example, when using [custom domains](index.md#custom-domains). |
-| `FF_ENABLE_REDIRECTS` | Feature flag to enable/disable redirects (enabled by default). Read the [redirects documentation](../../user/project/pages/redirects.md#feature-flag-for-redirects) for more information. |
| `FF_ENABLE_PLACEHOLDERS` | Feature flag to enable/disable rewrites (disabled by default). Read the [redirects documentation](../../user/project/pages/redirects.md#feature-flag-for-rewrites) for more information. |
| `use_legacy_storage` | Temporarily-introduced parameter allowing to use legacy domain configuration source and storage. [Removed in 14.3](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/6166). |
| `rate_limit_source_ip` | Rate limit per source IP in number of requests per second. Set to `0` to disable this feature. |
@@ -288,7 +289,8 @@ world. Custom domains are supported, but no TLS.
1. In `/etc/gitlab/gitlab.rb` specify the following configuration:
```ruby
- pages_external_url "http://example.io"
+ external_url "http://gitlab.example.com" # external_url here is only for reference
+ pages_external_url "http://pages.example.com" # not a subdomain of external_url
nginx['listen_addresses'] = ['192.0.2.1'] # The primary IP of the GitLab instance
pages_nginx['enable'] = false
gitlab_pages['external_http'] = ['192.0.2.2:80', '[2001:db8::2]:80'] # The secondary IPs for the GitLab Pages daemon
@@ -318,7 +320,8 @@ world. Custom domains and TLS are supported.
1. In `/etc/gitlab/gitlab.rb` specify the following configuration:
```ruby
- pages_external_url "https://example.io"
+ external_url "https://gitlab.example.com" # external_url here is only for reference
+ pages_external_url "https://pages.example.com" # not a subdomain of external_url
nginx['listen_addresses'] = ['192.0.2.1'] # The primary IP of the GitLab instance
pages_nginx['enable'] = false
gitlab_pages['external_http'] = ['192.0.2.2:80', '[2001:db8::2]:80'] # The secondary IPs for the GitLab Pages daemon
@@ -795,7 +798,7 @@ Incorrect configuration of these values may result in intermittent
or persistent errors, or the Pages Daemon serving old content.
NOTE:
-Expiry, interval and timeout flags use [Golang's duration formatting](https://golang.org/pkg/time/#ParseDuration).
+Expiry, interval and timeout flags use [Golang's duration formatting](https://pkg.go.dev/time#ParseDuration).
A duration string is a possibly signed sequence of decimal numbers,
each with optional fraction and a unit suffix, such as `300ms`, `1.5h` or `2h45m`.
Valid time units are `ns`, `us` (or `µs`), `ms`, `s`, `m`, `h`.
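
As a rough illustration of the duration format described in the note above, the following Ruby sketch converts such a string to seconds. The helper `duration_to_seconds` and its constants are hypothetical and not part of GitLab or the Pages daemon; the documentation above points to Go's `ParseDuration` as the reference for the actual parsing rules.

```ruby
# Hypothetical helper (not part of GitLab): converts a Go-style duration
# string such as "2h45m" into seconds, following the format described above.
UNIT_SECONDS = {
  'ns' => 1e-9, 'us' => 1e-6, "\u00B5s" => 1e-6, # "\u00B5s" is the micro sign form
  'ms' => 1e-3, 's' => 1.0, 'm' => 60.0, 'h' => 3600.0
}.freeze

# One decimal number (optional fraction) followed by a unit suffix.
DURATION_PART = /(\d+(?:\.\d*)?|\.\d+)(ns|us|\u00B5s|ms|s|m|h)/

def duration_to_seconds(text)
  sign = text.start_with?('-') ? -1 : 1
  body = text.sub(/\A[+-]/, '')
  return 0.0 if body == '0'

  # Reject anything that is not a pure sequence of number+unit pairs.
  unless body.match?(/\A(?:#{DURATION_PART.source})+\z/)
    raise ArgumentError, "invalid duration: #{text}"
  end

  total = body.scan(DURATION_PART).sum { |number, unit| number.to_f * UNIT_SECONDS.fetch(unit) }
  sign * total
end
```

For instance, `duration_to_seconds('2h45m')` adds two hours and 45 minutes, and a string such as `'10 minutes'` raises an `ArgumentError` because it has no valid unit suffix.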
@@ -1055,11 +1058,11 @@ Source-IP rate limits are enforced using the following:
gitlab_pages['rate_limit_source_ip_burst'] = 600
```
-1. To reject requests that exceed the specified limits, enable the `FF_ENABLE_RATE_LIMITER` feature flag in
+1. To reject requests that exceed the specified limits, enable the `FF_ENFORCE_IP_RATE_LIMITS` feature flag in
`/etc/gitlab/gitlab.rb`:
```ruby
- gitlab_pages['env'] = {'FF_ENABLE_RATE_LIMITER' => 'true'}
+ gitlab_pages['env'] = {'FF_ENFORCE_IP_RATE_LIMITS' => 'true'}
```
1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
@@ -1281,8 +1284,8 @@ in all of your GitLab Pages instances.
### 500 error with `securecookie: failed to generate random iv` and `Failed to save the session`
-This problem most likely results from an [out-dated operating system](../package_information/deprecated_os.md).
-The [Pages daemon uses the `securecookie` library](https://gitlab.com/search?group_id=9970&project_id=734943&repository_ref=master&scope=blobs&search=securecookie&snippets=false) to get random strings via [`crypto/rand` in Go](https://golang.org/pkg/crypto/rand/#pkg-variables).
+This problem most likely results from an [out-dated operating system](../package_information/supported_os.md#os-versions-that-are-no-longer-supported).
+The [Pages daemon uses the `securecookie` library](https://gitlab.com/search?group_id=9970&project_id=734943&repository_ref=master&scope=blobs&search=securecookie&snippets=false) to get random strings via [`crypto/rand` in Go](https://pkg.go.dev/crypto/rand#pkg-variables).
This requires the `getrandom` system call or `/dev/urandom` to be available on the host OS.
Upgrading to an [officially supported operating system](https://about.gitlab.com/install/) is recommended.
diff --git a/doc/administration/pages/source.md b/doc/administration/pages/source.md
index 3a277204d21..45e9dadd1cf 100644
--- a/doc/administration/pages/source.md
+++ b/doc/administration/pages/source.md
@@ -59,9 +59,9 @@ Before proceeding with the Pages configuration, make sure that:
1. You have installed the `zip` and `unzip` packages in the same server that
GitLab is installed since they are needed to compress and decompress the
Pages artifacts.
-1. (Optional) You have a **wildcard certificate** for the Pages domain if you
+1. Optional. You have a **wildcard certificate** for the Pages domain if you
decide to serve Pages (`*.example.io`) under HTTPS.
-1. (Optional but recommended) You have configured and enabled the [shared runners](../../ci/runners/index.md)
+1. Optional but recommended. You have configured and enabled the [shared runners](../../ci/runners/index.md)
so that your users don't have to bring their own.
### DNS configuration
diff --git a/doc/administration/postgresql/database_load_balancing.md b/doc/administration/postgresql/database_load_balancing.md
new file mode 100644
index 00000000000..b83820dd0b6
--- /dev/null
+++ b/doc/administration/postgresql/database_load_balancing.md
@@ -0,0 +1,234 @@
+---
+stage: Enablement
+group: Database
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Database Load Balancing **(FREE SELF)**
+
+> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/1283) in [GitLab Premium](https://about.gitlab.com/pricing/) 9.0.
+> - [Moved](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/60894) from GitLab Premium to GitLab Free in 14.0.
+> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/334494) for Sidekiq in GitLab 14.1.
+
+With Database Load Balancing, read-only queries can be distributed across
+multiple PostgreSQL nodes to increase performance.
+
+This functionality is provided natively in GitLab Rails and Sidekiq where
+they can be configured to balance their database read queries in a round-robin approach,
+without any external dependencies:
+
+```plantuml
+@startuml
+card "**Internal Load Balancer**" as ilb #9370DB
+skinparam linetype ortho
+
+together {
+ collections "**GitLab Rails** x3" as gitlab #32CD32
+ collections "**Sidekiq** x4" as sidekiq #ff8dd1
+}
+
+collections "**Consul** x3" as consul #e76a9b
+
+card "Database" as database {
+ collections "**PGBouncer x3**\n//Consul//" as pgbouncer #4EA7FF
+
+ card "**PostgreSQL** //Primary//\n//Patroni//\n//PgBouncer//\n//Consul//" as postgres_primary #4EA7FF
+ collections "**PostgreSQL** //Secondary// **x2**\n//Patroni//\n//PgBouncer//\n//Consul//" as postgres_secondary #4EA7FF
+
+ pgbouncer -[#4EA7FF]-> postgres_primary
+ postgres_primary .[#4EA7FF]r-> postgres_secondary
+}
+
+gitlab -[#32CD32]-> ilb
+gitlab -[hidden]-> pgbouncer
+gitlab .[#32CD32,norank]-> postgres_primary
+gitlab .[#32CD32,norank]-> postgres_secondary
+
+sidekiq -[#ff8dd1]-> ilb
+sidekiq -[hidden]-> pgbouncer
+sidekiq .[#ff8dd1,norank]-> postgres_primary
+sidekiq .[#ff8dd1,norank]-> postgres_secondary
+
+ilb -[#9370DB]-> pgbouncer
+
+consul -[#e76a9b]r-> pgbouncer
+consul .[#e76a9b,norank]r-> postgres_primary
+consul .[#e76a9b,norank]r-> postgres_secondary
+@enduml
+```
+
+## Requirements to enable Database Load Balancing
+
+To enable Database Load Balancing, make sure that:
+
+- The HA PostgreSQL setup has one or more secondary nodes replicating the primary.
+- Each PostgreSQL node is connected with the same credentials and on the same port.
+
+For Omnibus GitLab, you also need PgBouncer configured on each PostgreSQL node to pool
+all load-balanced connections when [configuring a multi-node setup](replication_and_failover.md).
+
+## Configuring Database Load Balancing
+
+Database Load Balancing can be configured in one of two ways:
+
+- (Recommended) [Hosts](#hosts): a list of PostgreSQL hosts.
+- [Service Discovery](#service-discovery): a DNS record that returns a list of PostgreSQL hosts.
+
+### Hosts
+
+To configure a list of hosts, add the `gitlab_rails['db_load_balancing']` setting into the
+`gitlab.rb` file in the GitLab Rails / Sidekiq nodes for each environment you want to balance.
+
+For example, on an environment that has PostgreSQL running on the hosts `host1.example.com`,
+`host2.example.com` and `host3.example.com` and reachable on the same port configured with
+`gitlab_rails['db_port']`:
+
+1. On each GitLab Rails / Sidekiq node, edit `/etc/gitlab/gitlab.rb` and add the following line:
+
+ ```ruby
+ gitlab_rails['db_load_balancing'] = { 'hosts' => ['host1.example.com', 'host2.example.com', 'host3.example.com'] }
+ ```
+
+1. Save the file and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure).
+
+### Service Discovery
+
+> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/5883) in GitLab 11.0.
+
+Service discovery allows GitLab to automatically retrieve a list of PostgreSQL
+hosts to use. It periodically
+checks a DNS A record, using the IPs returned by this record as the addresses
+for the secondaries. For service discovery to work, all you need is a DNS server
+and an A record containing the IP addresses of your secondaries.
+
+When using Omnibus GitLab the provided [Consul](../consul.md) service works as
+a DNS server and returns PostgreSQL addresses via the `postgresql-ha.service.consul`
+record. For example:
+
+1. On each GitLab Rails / Sidekiq node, edit `/etc/gitlab/gitlab.rb` and add the following:
+
+ ```ruby
+ gitlab_rails['db_load_balancing'] = { 'discover' => {
+ 'nameserver' => 'localhost',
+ 'record' => 'postgresql-ha.service.consul',
+ 'record_type' => 'A',
+ 'port' => '8600',
+ 'interval' => '60',
+ 'disconnect_timeout' => '120'
+ }
+ }
+ ```
+
+1. Save the file and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
+
+| Option | Description | Default |
+|----------------------|---------------------------------------------------------------------------------------------------|-----------|
+| `nameserver` | The nameserver to use for looking up the DNS record. | localhost |
+| `record` | The record to look up. This option is required for service discovery to work. | |
+| `record_type` | Optional record type to look up. This can be either A or SRV (GitLab 12.3 and later). | A |
+| `port` | The port of the nameserver. | 8600 |
+| `interval` | The minimum time in seconds between checking the DNS record. | 60 |
+| `disconnect_timeout` | The time in seconds after which an old connection is closed, after the list of hosts was updated. | 120 |
+| `use_tcp` | Look up DNS resources using TCP instead of UDP. | false |
+
+If `record_type` is set to `SRV`, then GitLab continues to use the round-robin algorithm
+and ignores the `weight` and `priority` in the record. Since SRV records usually
+return hostnames instead of IPs, GitLab needs to look for the IPs of returned hostnames
+in the additional section of the SRV response. If no IP is found for a hostname, GitLab
+queries the configured `nameserver` for an ANY record for each such hostname, looking for A or AAAA
+records, and eventually drops the hostname from rotation if it can't resolve its IP.
+
+The `interval` value specifies the _minimum_ time between checks. If the A
+record has a TTL greater than this value, then service discovery honors said
+TTL. For example, if the TTL of the A record is 90 seconds, then service
+discovery waits at least 90 seconds before checking the A record again.
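
The relationship between `interval` and the record's TTL can be expressed as taking the larger of the two values. The following Ruby sketch is illustrative only; `next_check_delay` is a hypothetical name and this is not GitLab's implementation.

```ruby
# Hypothetical helper, not GitLab's implementation: the wait before the next
# DNS lookup is the larger of the configured interval and the record's TTL.
def next_check_delay(interval_seconds, record_ttl_seconds)
  [interval_seconds, record_ttl_seconds].max
end
```

With the default `interval` of 60 and a TTL of 90, `next_check_delay(60, 90)` gives 90, which matches the example in the paragraph above.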
+
+When the list of hosts is updated, it might take a while for the old connections
+to be terminated. The `disconnect_timeout` setting can be used to enforce an
+upper limit on the time it takes to terminate all old database connections.
+
+### Handling Stale Reads **(PREMIUM SELF)**
+
+> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/3526) in GitLab 10.3.
+
+To prevent reading from an outdated secondary, the load balancer checks if it
+is in sync with the primary. If the data is recent enough, the
+secondary is used; otherwise it is ignored. To reduce the overhead of
+these checks, we only perform them at certain intervals.
+
+There are three configuration options that influence this behavior:
+
+| Option | Description | Default |
+|------------------------------|----------------------------------------------------------------------------------------------------------------|------------|
+| `max_replication_difference` | The amount of data (in bytes) a secondary is allowed to lag behind when it hasn't replicated data for a while. | 8 MB |
+| `max_replication_lag_time` | The maximum number of seconds a secondary is allowed to lag behind before we stop using it. | 60 seconds |
+| `replica_check_interval` | The minimum number of seconds we have to wait before checking the status of a secondary. | 60 seconds |
+
+The defaults should be sufficient for most users.
+
+To configure these options with a hosts list, use the following example:
+
+```ruby
+gitlab_rails['db_load_balancing'] = {
+ 'hosts' => ['host1.example.com', 'host2.example.com', 'host3.example.com'],
+ 'max_replication_difference' => 16777216, # 16 MB
+ 'max_replication_lag_time' => 30,
+ 'replica_check_interval' => 30
+}
+```
+
+## Logging
+
+The load balancer logs various events in
+[`database_load_balancing.log`](../logs.md#database_load_balancinglog), such as:
+
+- When a host is marked as offline
+- When a host comes back online
+- When all secondaries are offline
+- When a read is retried on a different host due to a query conflict
+
+The log is structured with each entry a JSON object containing at least:
+
+- An `event` field useful for filtering.
+- A human-readable `message` field.
+- Some event-specific metadata. For example, `db_host`.
+- Contextual information that is always logged. For example, `severity` and `time`.
+
+For example:
+
+```json
+{"severity":"INFO","time":"2019-09-02T12:12:01.728Z","correlation_id":"abcdefg","event":"host_online","message":"Host came back online","db_host":"111.222.333.444","db_port":null,"tag":"rails.database_load_balancing","environment":"production","hostname":"web-example-1","fqdn":"gitlab.example.com","path":null,"params":null}
+```
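
Because each entry is a JSON object, the `event` field can be used to filter the log. The following Ruby sketch assumes entries in the shape shown above; the sample values, the `host_offline` event name, its message, and the helper `entries_for_event` are assumptions made for illustration.

```ruby
require 'json'

# Sample entries in the shape shown above. The field names `event`,
# `message` and `db_host` come from the documentation; the values and the
# second entry are made up for this example.
SAMPLE_LOG = <<~LOG
  {"severity":"INFO","time":"2019-09-02T12:12:01.728Z","event":"host_online","message":"Host came back online","db_host":"host1.example.com"}
  {"severity":"WARN","time":"2019-09-02T12:13:01.728Z","event":"host_offline","message":"Marking host as offline","db_host":"host2.example.com"}
LOG

# Returns the parsed entries whose `event` field matches the given name.
def entries_for_event(log_text, event)
  log_text.each_line
          .reject { |line| line.strip.empty? }
          .map { |line| JSON.parse(line) }
          .select { |entry| entry['event'] == event }
end
```

Calling `entries_for_event(SAMPLE_LOG, 'host_online')` returns only the first sample entry. Against a real log, the same helper could be given the contents of the log file instead of the sample string.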
+
+## Implementation Details
+
+### Balancing queries
+
+Read-only `SELECT` queries balance among all the given hosts.
+Everything else (including transactions) executes on the primary.
+Queries such as `SELECT ... FOR UPDATE` are also executed on the primary.
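
The round-robin distribution of read-only queries can be pictured with a small Ruby sketch. The class `RoundRobinHosts` is hypothetical and only illustrates the rotation idea; it is not how GitLab implements its load balancer, and it ignores host health and replication lag.

```ruby
# Hypothetical sketch, not GitLab's implementation: hands out hosts in a
# fixed rotation so that consecutive reads go to different hosts.
class RoundRobinHosts
  def initialize(hosts)
    raise ArgumentError, 'at least one host is required' if hosts.empty?

    @hosts = hosts
    @index = 0
  end

  # Returns the next host in the rotation, wrapping around at the end.
  def next_host
    host = @hosts[@index % @hosts.size]
    @index += 1
    host
  end
end
```

With three hosts, four consecutive calls to `next_host` return the first, second, and third host, and then the first host again.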
+
+### Prepared statements
+
+Prepared statements don't work well with load balancing and are disabled
+automatically when load balancing is enabled. This shouldn't impact
+response timings.
+
+### Primary sticking
+
+After a write has been performed, GitLab sticks to using the primary for a
+certain period of time, scoped to the user that performed the write. GitLab
+reverts to using secondaries either when they have caught up or after 30
+seconds.
+
+### Failover handling
+
+In the event of a failover or an unresponsive database, the load balancer
+tries to use the next available host. If no secondaries are available the
+operation is performed on the primary instead.
+
+If a connection error occurs while writing data, the
+operation retries up to 3 times using an exponential back-off.
+
+When using load balancing, you should be able to safely restart a database server
+without it immediately leading to errors being presented to the users.
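
The retry behavior described for connection errors (up to 3 retries with an exponential back-off) can be sketched in Ruby as follows. This is a minimal illustration under stated assumptions, not GitLab's implementation: `ConnectionError`, `with_retries`, the `base_delay` value, and the injectable `sleeper` are all hypothetical.

```ruby
# Hypothetical error class standing in for a database connection error.
class ConnectionError < StandardError; end

# Hypothetical sketch, not GitLab's implementation: runs the block and
# retries up to `max_retries` times on ConnectionError, doubling the wait
# between attempts.
def with_retries(max_retries: 3, base_delay: 0.1, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    yield
  rescue ConnectionError
    # Give up and re-raise once all retries have been used.
    raise if attempt >= max_retries

    sleeper.call(base_delay * (2**attempt)) # waits 0.1s, 0.2s, 0.4s by default
    attempt += 1
    retry
  end
end
```

The `sleeper` argument exists only so that the waiting can be replaced in tests. With the defaults, a block that keeps failing runs four times in total (one initial attempt plus three retries) before the error is raised to the caller.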
diff --git a/doc/administration/postgresql/img/pg_ha_architecture.png b/doc/administration/postgresql/img/pg_ha_architecture.png
deleted file mode 100644
index 5d2a4a584bf..00000000000
--- a/doc/administration/postgresql/img/pg_ha_architecture.png
+++ /dev/null
Binary files differ
diff --git a/doc/administration/postgresql/pgbouncer.md b/doc/administration/postgresql/pgbouncer.md
index e5fef61540a..a666c1fab95 100644
--- a/doc/administration/postgresql/pgbouncer.md
+++ b/doc/administration/postgresql/pgbouncer.md
@@ -17,7 +17,7 @@ through `/etc/gitlab/gitlab.rb`.
## PgBouncer as part of a fault-tolerant GitLab installation
-This content has been moved to a [new location](replication_and_failover.md#configuring-the-pgbouncer-node).
+This content has been moved to a [new location](replication_and_failover.md#configure-pgbouncer-nodes).
## PgBouncer as part of a non-fault-tolerant GitLab installation
diff --git a/doc/administration/postgresql/replication_and_failover.md b/doc/administration/postgresql/replication_and_failover.md
index 01fe4bf64ba..5777f35bfcf 100644
--- a/doc/administration/postgresql/replication_and_failover.md
+++ b/doc/administration/postgresql/replication_and_failover.md
@@ -19,13 +19,54 @@ replication and failover for GitLab.
## Architecture
The Omnibus GitLab recommended configuration for a PostgreSQL cluster with
-replication and failover requires:
+replication failover requires:
+
+- A minimum of three PostgreSQL nodes.
+- A minimum of three Consul server nodes.
+- A minimum of three PgBouncer nodes that track and handle primary database reads and writes.
+ - An internal load balancer (TCP) to balance requests between the PgBouncer nodes.
+- [Database Load Balancing](database_load_balancing.md) enabled.
+ - A local PgBouncer service configured on each PostgreSQL node. Note that this is separate from the main PgBouncer cluster that tracks the primary.
+
+```plantuml
+@startuml
+card "**Internal Load Balancer**" as ilb #9370DB
+skinparam linetype ortho
+
+together {
+ collections "**GitLab Rails** x3" as gitlab #32CD32
+ collections "**Sidekiq** x4" as sidekiq #ff8dd1
+}
+
+collections "**Consul** x3" as consul #e76a9b
-- A minimum of three database nodes.
-- A minimum of three `Consul` server nodes.
-- A minimum of one `pgbouncer` service node, but it's recommended to have one per database node. An internal load balancer (TCP) is required when there is more than one `pgbouncer` service node.
+card "Database" as database {
+ collections "**PGBouncer x3**\n//Consul//" as pgbouncer #4EA7FF
+
+ card "**PostgreSQL** //Primary//\n//Patroni//\n//PgBouncer//\n//Consul//" as postgres_primary #4EA7FF
+ collections "**PostgreSQL** //Secondary// **x2**\n//Patroni//\n//PgBouncer//\n//Consul//" as postgres_secondary #4EA7FF
+
+ pgbouncer -[#4EA7FF]-> postgres_primary
+ postgres_primary .[#4EA7FF]r-> postgres_secondary
+}
-![PostgreSQL HA Architecture](img/pg_ha_architecture.png)
+gitlab -[#32CD32]-> ilb
+gitlab -[hidden]-> pgbouncer
+gitlab .[#32CD32,norank]-> postgres_primary
+gitlab .[#32CD32,norank]-> postgres_secondary
+
+sidekiq -[#ff8dd1]-> ilb
+sidekiq -[hidden]-> pgbouncer
+sidekiq .[#ff8dd1,norank]-> postgres_primary
+sidekiq .[#ff8dd1,norank]-> postgres_secondary
+
+ilb -[#9370DB]-> pgbouncer
+
+consul -[#e76a9b]r-> pgbouncer
+consul .[#e76a9b,norank]r-> postgres_primary
+consul .[#e76a9b,norank]r-> postgres_secondary
+@enduml
+```
You also need to take into consideration the underlying network topology, making
sure you have redundant connectivity between all Database and GitLab instances
@@ -38,13 +79,14 @@ shipped with Omnibus GitLab, and thus Patroni becomes mandatory for replication
### Database node
-Each database node runs three services:
+Each database node runs four services:
- `PostgreSQL`: The database itself.
- `Patroni`: Communicates with other Patroni services in the cluster and handles failover when issues with the leader server occur. The failover procedure consists of:
- Selecting a new leader for the cluster.
- Promoting the new node to leader.
- Instructing remaining servers to follow the new leader node.
+- `PgBouncer`: A local pooler for the node. Used for _read_ queries as part of [Database Load Balancing](database_load_balancing.md).
- `Consul` agent: To communicate with Consul cluster which stores the current Patroni state. The agent monitors the status of each node in the database cluster and tracks its health in a service definition on the Consul cluster.
### Consul server node
@@ -62,8 +104,26 @@ Each PgBouncer node runs two services:
Each service in the package comes with a set of [default ports](../package_information/defaults.md#ports). You may need to make specific firewall rules for the connections listed below:
+There are several connection flows in this setup:
+
+- [Primary](#primary)
+- [Database Load Balancing](#database-load-balancing)
+- [Replication](#replication)
+
+#### Primary
+
- Application servers connect to either PgBouncer directly via its [default port](../package_information/defaults.md) or via a configured Internal Load Balancer (TCP) that serves multiple PgBouncers.
-- PgBouncer connects to the primary database servers [PostgreSQL default port](../package_information/defaults.md)
+- PgBouncer connects to the primary database server's [PostgreSQL default port](../package_information/defaults.md).
+
+#### Database Load Balancing
+
+For read queries against data that haven't been recently changed and are up to date on all database nodes:
+
+- Application servers connect to the local PgBouncer service via its [default port](../package_information/defaults.md) on each database node in a round-robin approach.
+- Local PgBouncer connects to the local database server's [PostgreSQL default port](../package_information/defaults.md).
+
+#### Replication
+
- Patroni actively manages the running PostgreSQL processes and configuration.
- PostgreSQL secondaries connect to the primary database server's [PostgreSQL default port](../package_information/defaults.md)
- Consul servers and agents connect to each other's [Consul default ports](../package_information/defaults.md)
@@ -203,8 +263,8 @@ repmgr-specific configuration as well. Especially, make sure that you remove `po
Here is an example:
```ruby
-# Disable all components except Patroni and Consul
-roles(['patroni_role'])
+# Disable all components except Patroni, PgBouncer and Consul
+roles(['patroni_role', 'pgbouncer_role'])
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
@@ -245,6 +305,15 @@ patroni['allowlist'] = %w(XXX.XXX.XXX.XXX/YY 127.0.0.1/32)
# Replace XXX.XXX.XXX.XXX/YY with Network Address
postgresql['trust_auth_cidr_addresses'] = %w(XXX.XXX.XXX.XXX/YY 127.0.0.1/32)
+# Local PgBouncer service for Database Load Balancing
+pgbouncer['databases'] = {
+ gitlabhq_production: {
+ host: "127.0.0.1",
+ user: "PGBOUNCER_USERNAME",
+ password: 'PGBOUNCER_PASSWORD_HASH'
+ }
+}
+
# Replace placeholders:
#
# Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z
@@ -342,7 +411,7 @@ You can use different certificates and keys for both API server and client on di
However, the CA certificate (`patroni['tls_ca_file']`), TLS certificate verification (`patroni['tls_verify']`), and client TLS
authentication mode (`patroni['tls_client_mode']`), must each have the same value on all nodes.
-### Configuring the PgBouncer node
+### Configure PgBouncer nodes
1. Make sure you collect [`CONSUL_SERVER_NODES`](#consul-information), [`CONSUL_PASSWORD_HASH`](#consul-information), and [`PGBOUNCER_PASSWORD_HASH`](#pgbouncer-information) before executing the next step.
@@ -480,6 +549,7 @@ attributes set, but the following need to be set.
gitlab_rails['db_port'] = 6432
gitlab_rails['db_password'] = 'POSTGRESQL_USER_PASSWORD'
gitlab_rails['auto_migrate'] = false
+ gitlab_rails['db_load_balancing'] = { 'hosts' => ['POSTGRESQL_NODE_1', 'POSTGRESQL_NODE_2', 'POSTGRESQL_NODE_3'] }
```
1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
@@ -595,8 +665,8 @@ An internal load balancer (TCP) is then required to be setup to serve each PgBou
On database nodes edit `/etc/gitlab/gitlab.rb`:
```ruby
-# Disable all components except Patroni and Consul
-roles(['patroni_role'])
+# Disable all components except Patroni, PgBouncer and Consul
+roles(['patroni_role', 'pgbouncer_role'])
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
@@ -616,6 +686,15 @@ patroni['postgresql']['max_wal_senders'] = 7
patroni['allowlist'] = %w(10.6.0.0/16 127.0.0.1/32)
postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/16 127.0.0.1/32)
+# Local PgBouncer service for Database Load Balancing
+pgbouncer['databases'] = {
+ gitlabhq_production: {
+ host: "127.0.0.1",
+ user: "pgbouncer",
+ password: '771a8625958a529132abe6f1a4acb19c'
+ }
+}
+
# Configure the Consul agent
consul['services'] = %w(postgresql)
consul['configuration'] = {
@@ -650,115 +729,6 @@ After deploying the configuration follow these steps:
gitlab-rake gitlab:db:configure
```
-### Example minimal setup
-
-This example uses 3 PostgreSQL servers, and 1 application node (with PgBouncer setup alongside).
-
-It differs from the [recommended setup](#example-recommended-setup) by moving the Consul servers into the same servers we use for PostgreSQL.
-The trade-off is between reducing server counts, against the increased operational complexity of needing to deal with PostgreSQL [failover](#manual-failover-procedure-for-patroni) procedures in addition to [Consul outage recovery](../consul.md#outage-recovery) on the same set of machines.
-
-In this example, we start with all servers on the same 10.6.0.0/16 private network range; they can connect to each freely other on those addresses.
-
-Here is a list and description of each machine and the assigned IP:
-
-- `10.6.0.21`: PostgreSQL 1
-- `10.6.0.22`: PostgreSQL 2
-- `10.6.0.23`: PostgreSQL 3
-- `10.6.0.31`: GitLab application
-
-All passwords are set to `toomanysecrets`. Please do not use this password or derived hashes.
-
-The `external_url` for GitLab is `http://gitlab.example.com`
-
-After the initial configuration, if a failover occurs, the PostgresSQL leader node changes to one of the available secondaries until it is failed back.
-
-#### Example minimal configuration for database servers
-
-On database nodes edit `/etc/gitlab/gitlab.rb`:
-
-```ruby
-# Disable all components except Patroni and Consul
-roles(['patroni_role'])
-
-# PostgreSQL configuration
-postgresql['listen_address'] = '0.0.0.0'
-postgresql['hot_standby'] = 'on'
-postgresql['wal_level'] = 'replica'
-
-# Disable automatic database migrations
-gitlab_rails['auto_migrate'] = false
-
-# Configure the Consul agent
-consul['services'] = %w(postgresql)
-
-postgresql['pgbouncer_user_password'] = '771a8625958a529132abe6f1a4acb19c'
-postgresql['sql_user_password'] = '450409b85a0223a214b5fb1484f34d0f'
-
-# Sets `max_replication_slots` to double the number of database nodes.
-# Patroni uses one extra slot per node when initiating the replication.
-patroni['postgresql']['max_replication_slots'] = 6
-
-patroni['username'] = 'PATRONI_API_USERNAME'
-patroni['password'] = 'PATRONI_API_PASSWORD'
-
-# Set `max_wal_senders` to one more than the number of replication slots in the cluster.
-# This is used to prevent replication from using up all of the
-# available database connections.
-patroni['postgresql']['max_wal_senders'] = 7
-
-patroni['allowlist'] = = %w(10.6.0.0/16 127.0.0.1/32)
-postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/16 127.0.0.1/32)
-
-consul['configuration'] = {
- server: true,
- retry_join: %w(10.6.0.21 10.6.0.22 10.6.0.23)
-}
-```
-
-[Reconfigure Omnibus GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
-
-#### Example minimal configuration for application server
-
-On the server edit `/etc/gitlab/gitlab.rb`:
-
-```ruby
-external_url 'http://gitlab.example.com'
-
-gitlab_rails['db_host'] = '127.0.0.1'
-gitlab_rails['db_port'] = 6432
-gitlab_rails['db_password'] = 'toomanysecrets'
-gitlab_rails['auto_migrate'] = false
-
-postgresql['enable'] = false
-pgbouncer['enable'] = true
-consul['enable'] = true
-
-# Configure PgBouncer
-pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul)
-
-# Configure Consul agent
-consul['watchers'] = %w(postgresql)
-
-pgbouncer['users'] = {
- 'gitlab-consul': {
- password: '5e0e3263571e3704ad655076301d6ebe'
- },
- 'pgbouncer': {
- password: '771a8625958a529132abe6f1a4acb19c'
- }
-}
-
-consul['configuration'] = {
- retry_join: %w(10.6.0.21 10.6.0.22 10.6.0.23)
-}
-```
-
-[Reconfigure Omnibus GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
-
-#### Example minimal setup manual steps
-
-The manual steps for this configuration are the same as for the [example recommended setup](#example-recommended-setup-manual-steps).
-
## Patroni
NOTE:
@@ -791,7 +761,7 @@ Run `gitlab-ctl patroni members` to query Patroni for a summary of the cluster s
To verify the status of replication:
```shell
-echo 'select * from pg_stat_wal_receiver\x\g\x \n select * from pg_stat_replication\x\g\x' | gitlab-psql
+echo -e 'select * from pg_stat_wal_receiver\x\g\x \n select * from pg_stat_replication\x\g\x' | gitlab-psql
```
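The reason for adding `-e` can be shown with a minimal sketch, assuming the command runs under bash. There, the builtin `echo` leaves `\n` as two literal characters unless `-e` is given, so both statements reach `gitlab-psql` on a single line. The `select 1` and `select 2` statements below are placeholders for illustration, not part of the documented command:

```shell
# In bash, echo without -e prints "\n" literally, so this stays a single line.
plain=$(echo 'select 1 \n select 2' | wc -l)

# With -e, "\n" is expanded to a real newline, giving two lines of input.
expanded=$(echo -e 'select 1 \n select 2' | wc -l)

echo "without -e: ${plain} line(s); with -e: ${expanded} line(s)"
```

Other shells may treat `echo` differently, so `printf` is a more portable choice when the script is not guaranteed to run under bash.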
The same command can be run on all three database servers. It returns any information
@@ -1047,7 +1017,7 @@ Here are a few key facts that you must consider before upgrading PostgreSQL:
configured replication method (`pg_basebackup` is the only available option). It might take some
time for replica to catch up with the leader, depending on the size of your database.
-- An overview of the upgrade procedure is outlined in [Patoni's documentation](https://patroni.readthedocs.io/en/latest/existing_data.html#major-upgrade-of-postgresql-version).
+- An overview of the upgrade procedure is outlined in [Patroni's documentation](https://patroni.readthedocs.io/en/latest/existing_data.html#major-upgrade-of-postgresql-version).
You can still use `gitlab-ctl pg-upgrade` which implements this procedure with a few adjustments.
Considering these, you should carefully plan your PostgreSQL upgrade:
diff --git a/doc/administration/raketasks/maintenance.md b/doc/administration/raketasks/maintenance.md
index d770361864e..950b508ab0c 100644
--- a/doc/administration/raketasks/maintenance.md
+++ b/doc/administration/raketasks/maintenance.md
@@ -268,7 +268,7 @@ sudo -u git -H bundle exec rake gitlab:tcp_check[example.com,80] RAILS_ENV=produ
GitLab uses a shared lock mechanism: `ExclusiveLease` to prevent simultaneous operations
in a shared resource. An example is running periodic garbage collection on repositories.
-In very specific situations, a operation locked by an Exclusive Lease can fail without
+In very specific situations, an operation locked by an Exclusive Lease can fail without
releasing the lock. If you can't wait for it to expire, you can run this task to manually
clear it.
diff --git a/doc/administration/raketasks/storage.md b/doc/administration/raketasks/storage.md
index 017565e1b39..912cf260a03 100644
--- a/doc/administration/raketasks/storage.md
+++ b/doc/administration/raketasks/storage.md
@@ -13,7 +13,7 @@ uses to organize the Git data.
## List projects and attachments
-The following Rake tasks will list the projects and attachments that are
+The following Rake tasks list the projects and attachments that are
available on legacy and hashed storage.
### On legacy storage
@@ -82,8 +82,8 @@ GitLab 14.0 eliminates support for legacy storage. If you're on GitLab
The option to choose between hashed and legacy storage in the admin area has
been disabled.
-This task must be run on any machine that has Rails/Sidekiq configured and will
-schedule all your existing projects and attachments associated with it to be
+This task must be run on any machine that has Rails/Sidekiq configured, and the task
+schedules all your existing projects and attachments associated with it to be
migrated to the **Hashed** storage type:
- **Omnibus installation**
@@ -112,7 +112,7 @@ To monitor the progress in GitLab:
1. On the top bar, select **Menu > Admin**.
1. On the left sidebar, select **Monitoring > Background Jobs**.
1. Watch how long the `hashed_storage:hashed_storage_project_migrate` queue
- will take to finish. After it reaches zero, you can confirm every project
+ takes to finish. After it reaches zero, you can confirm every project
has been migrated by running the commands above.
If you find it necessary, you can run the previous migration script again to schedule missing projects.
@@ -160,12 +160,12 @@ sudo gitlab-rake gitlab:storage:rollback_to_legacy ID_FROM=50 ID_TO=100
```
You can monitor the progress in the **Admin Area > Monitoring > Background Jobs** page.
-On the **Queues** tab, you can watch the `hashed_storage:hashed_storage_project_rollback` queue to see how long the process will take to finish.
+On the **Queues** tab, you can watch the `hashed_storage:hashed_storage_project_rollback` queue to see how long the process takes to finish.
After it reaches zero, you can confirm every project has been rolled back by running the commands above.
If some projects weren't rolled back, you can run this rollback script again to schedule further rollbacks.
Any error or warning is logged in Sidekiq's log file.
-If you have a Geo setup, the rollback will not be reflected automatically
+If you have a Geo setup, the rollback is not reflected automatically
on the **secondary** node. You may need to wait for a backfill operation to kick-in and remove
the remaining repositories from the special `@hashed/` folder manually.
diff --git a/doc/administration/raketasks/uploads/migrate.md b/doc/administration/raketasks/uploads/migrate.md
index 0628e351b63..aec75f0b302 100644
--- a/doc/administration/raketasks/uploads/migrate.md
+++ b/doc/administration/raketasks/uploads/migrate.md
@@ -42,6 +42,28 @@ gitlab-rake "gitlab:uploads:migrate:all"
sudo RAILS_ENV=production -u git -H bundle exec rake gitlab:uploads:migrate:all
```
+You can optionally track progress and verify that all uploads migrated successfully using the
+[PostgreSQL console](https://docs.gitlab.com/omnibus/settings/database.html#connecting-to-the-bundled-postgresql-database):
+
+- `sudo gitlab-rails dbconsole` for Omnibus GitLab instances.
+- `sudo -u git -H psql -d gitlabhq_production` for source-installed instances.
+
+Verify that `objectstg` below (where `store=2`) has the count of all uploads:
+
+```shell
+gitlabhq_production=# SELECT count(*) AS total, sum(case when store = '1' then 1 else 0 end) AS filesystem, sum(case when store = '2' then 1 else 0 end) AS objectstg FROM uploads;
+
+total | filesystem | objectstg
+------+------------+-----------
+ 2409 | 0 | 2409
+```
+
+Verify that there are no files on disk in the `uploads` folder:
+
+```shell
+sudo find /var/opt/gitlab/gitlab-rails/uploads -type f | grep -v tmp | wc -l
+```
+
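The on-disk check above can be wrapped in a small script that reports the result instead of printing a bare number. This is a sketch only: `count_leftover_uploads` and the `UPLOADS_DIR` variable are hypothetical names introduced for illustration, and the default path assumes an Omnibus installation. Reading that directory may also require `sudo`.

```shell
# Hypothetical helper: count regular files under an uploads directory,
# skipping any path below it that contains "tmp" (same intent as `grep -v tmp`).
count_leftover_uploads() {
  (cd "$1" && find . -type f ! -path '*tmp*' | wc -l)
}

# Assumed default location for Omnibus GitLab; override with UPLOADS_DIR.
uploads_dir="${UPLOADS_DIR:-/var/opt/gitlab/gitlab-rails/uploads}"

if [ -d "$uploads_dir" ]; then
  leftover=$(count_leftover_uploads "$uploads_dir")
  if [ "$leftover" -eq 0 ]; then
    echo "No uploads left on disk"
  else
    echo "${leftover} file(s) still on disk under ${uploads_dir}"
  fi
fi
```

Running `find` from inside the directory keeps the `tmp` filter from matching a parent directory name.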
### Individual Rake tasks
If you already ran the [all-in-one Rake task](#all-in-one-rake-task), there is no need to run these
diff --git a/doc/administration/read_only_gitlab.md b/doc/administration/read_only_gitlab.md
index 2fbcb2a62e7..b7e8397dd95 100644
--- a/doc/administration/read_only_gitlab.md
+++ b/doc/administration/read_only_gitlab.md
@@ -16,7 +16,7 @@ The configuration for doing so depends on your desired outcome.
## Make the repositories read-only
-The first thing you'll want to accomplish is to ensure that no changes can be
+The first thing you want to accomplish is to ensure that no changes can be
made to your repositories. There's two ways you can accomplish that:
- Either stop Puma to make the internal API unreachable:
@@ -46,7 +46,7 @@ made to your repositories. There's two ways you can accomplish that:
## Shut down the GitLab UI
If you don't mind shutting down the GitLab UI, then the easiest approach is to
-stop `sidekiq` and `puma`, and you'll effectively ensure that no
+stop `sidekiq` and `puma`, and you effectively ensure that no
changes can be made to GitLab:
```shell
@@ -63,7 +63,7 @@ sudo gitlab-ctl start puma
## Make the database read-only
-If you want to allow users to use the GitLab UI, then you'll need to ensure that
+If you want to allow users to use the GitLab UI, then you need to ensure that
the database is read-only:
1. Take a [GitLab backup](../raketasks/backup_restore.md)
@@ -113,7 +113,7 @@ the database is read-only:
sudo gitlab-ctl restart postgresql
```
-When you're ready to revert the read-only state, you'll need to remove the added
+When you're ready to revert the read-only state, you need to remove the added
lines in `/etc/gitlab/gitlab.rb`, and reconfigure GitLab and restart PostgreSQL:
```shell
diff --git a/doc/administration/redis/troubleshooting.md b/doc/administration/redis/troubleshooting.md
index 6ab3d55e06a..f4aab9d7b7f 100644
--- a/doc/administration/redis/troubleshooting.md
+++ b/doc/administration/redis/troubleshooting.md
@@ -20,6 +20,18 @@ Before proceeding with the troubleshooting below, check your firewall rules:
- Connect to other Sentinel machines via TCP in `26379`
- Connect to the Redis machines via TCP in `6379`
+## Basic Redis activity check
+
+Start Redis troubleshooting with a basic Redis activity check:
+
+1. Open a terminal on your GitLab server.
+1. Run `gitlab-redis-cli --stat` and observe the output while it runs.
+1. Go to your GitLab UI and browse to a handful of pages. Any page works, like
+ group or project overviews, issues, files in repositories, and so on.
+1. Check the `stat` output again and verify that the values for `keys`, `clients`,
+ `requests`, and `connections` increase as you browse. If the numbers go up,
+ basic Redis functionality is working and GitLab can connect to it.
+
## Troubleshooting Redis replication
You can check if everything is correct by connecting to each server using
diff --git a/doc/administration/reference_architectures/10k_users.md b/doc/administration/reference_architectures/10k_users.md
index 9c3c33e1fa8..fa8dfdf667b 100644
--- a/doc/administration/reference_architectures/10k_users.md
+++ b/doc/administration/reference_architectures/10k_users.md
@@ -12,6 +12,7 @@ full list of reference architectures, see
> - **Supported users (approximate):** 10,000
> - **High Availability:** Yes ([Praefect](#configure-praefect-postgresql) needs a third-party PostgreSQL solution for HA)
+> - **Estimated Costs:** [GCP](https://cloud.google.com/products/calculator#id=e77713f6-dc0b-4bb3-bcef-cea904ac8efd)
> - **Cloud Native Hybrid Alternative:** [Yes](#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative)
> - **Performance tested daily with the [GitLab Performance Tool](https://gitlab.com/gitlab-org/quality/performance)**:
> - **Test requests per second (RPS) rates:** API: 200 RPS, Web: 20 RPS, Git (Pull): 20 RPS, Git (Push): 4 RPS
@@ -37,7 +38,7 @@ full list of reference architectures, see
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
-1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. Google Cloud SQL and Amazon RDS are known to work, however Azure Database for PostgreSQL is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to performance issues. Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
+1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61). Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
@@ -49,6 +50,8 @@ For all PaaS solutions that involve configuring instances, it is strongly recomm
```plantuml
@startuml 10k
+skinparam linetype ortho
+
card "**External Load Balancer**" as elb #6a9be7
card "**Internal Load Balancer**" as ilb #9370DB
@@ -73,8 +76,8 @@ card "Gitaly Cluster" as gitaly_cluster {
card "Database" as database {
collections "**PGBouncer** x3" as pgbouncer #4EA7FF
- card "**PostgreSQL** (Primary)" as postgres_primary #4EA7FF
- collections "**PostgreSQL** (Secondary) x2" as postgres_secondary #4EA7FF
+ card "**PostgreSQL** //Primary//" as postgres_primary #4EA7FF
+ collections "**PostgreSQL** //Secondary// x2" as postgres_secondary #4EA7FF
pgbouncer -[#4EA7FF]-> postgres_primary
postgres_primary .[#4EA7FF]> postgres_secondary
@@ -83,31 +86,38 @@ card "Database" as database {
card "redis" as redis {
collections "**Redis Persistent** x3" as redis_persistent #FF6347
collections "**Redis Cache** x3" as redis_cache #FF6347
+
+ redis_cache -[hidden]-> redis_persistent
}
cloud "**Object Storage**" as object_storage #white
elb -[#6a9be7]-> gitlab
-elb -[#6a9be7]--> monitor
+elb -[#6a9be7,norank]--> monitor
-gitlab -[#32CD32]--> ilb
-gitlab -[#32CD32]-> object_storage
-gitlab -[#32CD32]---> redis
+gitlab -[#32CD32,norank]--> ilb
+gitlab -[#32CD32]r-> object_storage
+gitlab -[#32CD32]----> redis
+gitlab .[#32CD32]----> database
gitlab -[hidden]-> monitor
gitlab -[hidden]-> consul
-sidekiq -[#ff8dd1]--> ilb
-sidekiq -[#ff8dd1]-> object_storage
-sidekiq -[#ff8dd1]---> redis
+sidekiq -[#ff8dd1,norank]--> ilb
+sidekiq -[#ff8dd1]r-> object_storage
+sidekiq -[#ff8dd1]----> redis
+sidekiq .[#ff8dd1]----> database
sidekiq -[hidden]-> monitor
sidekiq -[hidden]-> consul
-ilb -[#9370DB]-> gitaly_cluster
-ilb -[#9370DB]-> database
+ilb -[#9370DB]--> gitaly_cluster
+ilb -[#9370DB]--> database
+ilb -[hidden]--> redis
+ilb -[hidden]u-> consul
+ilb -[hidden]u-> monitor
consul .[#e76a9b]u-> gitlab
consul .[#e76a9b]u-> sidekiq
-consul .[#e76a9b]> monitor
+consul .[#e76a9b]r-> monitor
consul .[#e76a9b]-> database
consul .[#e76a9b]-> gitaly_cluster
consul .[#e76a9b,norank]--> redis
@@ -124,21 +134,34 @@ monitor .[#7FFFD4,norank]u--> elb
@enduml
```
-The Google Cloud Platform (GCP) architectures were built and tested using the
+## Requirements
+
+Before starting, you should take note of the following requirements / guidance for this reference architecture.
+
+### Supported CPUs
+
+This reference architecture was built and tested on Google Cloud Platform (GCP) using the
[Intel Xeon E5 v3 (Haswell)](https://cloud.google.com/compute/docs/cpu-platforms)
CPU platform. On different hardware you may find that adjustments, either lower
or higher, are required for your CPU or node counts. For more information, see
our [Sysbench](https://github.com/akopytov/sysbench)-based
[CPU benchmarks](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Reference-Architectures/GCP-CPU-Benchmarks).
-Due to better performance and availability, for data objects (such as LFS,
-uploads, or artifacts), using an [object storage service](#configure-the-object-storage)
-is recommended.
+### Supported infrastructure
+
+As general guidance, GitLab should run on most infrastructure, such as reputable cloud providers (AWS, GCP, Azure) and their services, or self-managed infrastructure (ESXi), that meets both the specs detailed above and any requirements in this section. However, this does not constitute a guarantee for every potential permutation.
+
+Be aware of the following specific call outs:
-It's also worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
+- [Azure Database for PostgreSQL](https://docs.microsoft.com/en-us/azure/postgresql/#:~:text=Azure%20Database%20for%20PostgreSQL%20is,high%20availability%2C%20and%20dynamic%20scalability.) is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to known performance issues or missing features.
+- [Azure Blob Storage](https://docs.microsoft.com/en-us/azure/storage/blobs/) is recommended to be configured with [Premium accounts](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-block-blob-premium) to ensure consistent performance.
+
+### Praefect PostgreSQL
+
+It's worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
that to achieve full High Availability a third-party PostgreSQL database solution will be required.
We hope to offer a built in solutions for these restrictions in the future but in the meantime a non HA PostgreSQL server
-can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) & [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398)
+can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) & [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398).
## Setup components
@@ -471,14 +494,15 @@ run: node-exporter: (pid 30093) 76833s; run: log: (pid 29663) 76855s
## Configure PostgreSQL
-In this section, you'll be guided through configuring an external PostgreSQL database
-to be used with GitLab.
+In this section, you'll be guided through configuring a highly available PostgreSQL
+cluster to be used with GitLab.
### Provide your own PostgreSQL instance
If you're hosting GitLab on a cloud provider, you can optionally use a
-managed service for PostgreSQL. For example, AWS offers a managed Relational
-Database Service (RDS) that runs PostgreSQL.
+managed service for PostgreSQL.
+
+A reputable provider or solution should be used for this. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61).
If you use a cloud-managed service, or provide your own PostgreSQL:
@@ -488,12 +512,25 @@ If you use a cloud-managed service, or provide your own PostgreSQL:
needs privileges to create the `gitlabhq_production` database.
1. Configure the GitLab application servers with the appropriate details.
This step is covered in [Configuring the GitLab Rails application](#configure-gitlab-rails).
+1. For improved performance, configuring [Database Load Balancing](../postgresql/database_load_balancing.md)
+ with multiple read replicas is recommended.
See [Configure GitLab using an external PostgreSQL service](../postgresql/external.md) for
further configuration steps.
### Standalone PostgreSQL using Omnibus GitLab
+The recommended Omnibus GitLab configuration for a PostgreSQL cluster with
+replication and failover requires:
+
+- A minimum of three PostgreSQL nodes.
+- A minimum of three Consul server nodes.
+- A minimum of three PgBouncer nodes that track and handle primary database reads and writes.
+ - An [internal load balancer](#configure-the-internal-load-balancer) (TCP) to balance requests between the PgBouncer nodes.
+- [Database Load Balancing](../postgresql/database_load_balancing.md) enabled.
+
+ A local PgBouncer service to be configured on each PostgreSQL node. Note that this is separate from the main PgBouncer cluster that tracks the primary.
+
The following IPs will be used as an example:
- `10.6.0.21`: PostgreSQL primary
@@ -548,8 +585,8 @@ in the second step, do not supply the `EXTERNAL_URL` value.
1. On every database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
```ruby
- # Disable all components except Patroni and Consul
- roles(['patroni_role'])
+ # Disable all components except Patroni, PgBouncer and Consul
+ roles(['patroni_role', 'pgbouncer_role'])
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
@@ -594,6 +631,15 @@ in the second step, do not supply the `EXTERNAL_URL` value.
# Replace 10.6.0.0/24 with Network Address
postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24 127.0.0.1/32)
+ # Local PgBouncer service for Database Load Balancing
+ pgbouncer['databases'] = {
+ gitlabhq_production: {
+ host: "127.0.0.1",
+ user: "pgbouncer",
+ password: '<pgbouncer_password_hash>'
+ }
+ }
+
# Set the network addresses that the exporters will listen on for monitoring
node_exporter['listen_address'] = '0.0.0.0:9100'
postgres_exporter['listen_address'] = '0.0.0.0:9187'
@@ -654,9 +700,11 @@ If the 'State' column for any node doesn't say "running", check the
</a>
</div>
-## Configure PgBouncer
+### Configure PgBouncer
+
+Now that the PostgreSQL servers are all set up, let's configure PgBouncer
+for tracking and handling reads/writes to the primary database.
-Now that the PostgreSQL servers are all set up, let's configure PgBouncer.
The following IPs will be used as an example:
- `10.6.0.31`: PgBouncer 1
@@ -1216,6 +1264,15 @@ There are many third-party solutions for PostgreSQL HA. The solution selected mu
- A static IP for all connections that doesn't change on failover.
- [`LISTEN`](https://www.postgresql.org/docs/12/sql-listen.html) SQL functionality must be supported.
+NOTE:
+With a third-party setup, it's possible to colocate Praefect's database on the same server as
+the main [GitLab](#provide-your-own-postgresql-instance) database as a convenience unless
+you are using Geo, where separate database instances are required for handling replication correctly.
+In this setup, the specs of the main database setup shouldn't need to be changed as the impact should be
+minimal.
+
+A reputable provider or solution should be used for this. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61).
+
Examples of the above could include [Google's Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) or [Amazon RDS](https://aws.amazon.com/rds/).
Once the database is set up, follow the [post configuration](#praefect-postgresql-post-configuration).
@@ -1671,8 +1728,8 @@ To configure the Sidekiq nodes, on each one:
gitlab_rails['db_host'] = '10.6.0.40' # internal load balancer IP
gitlab_rails['db_port'] = 6432
gitlab_rails['db_password'] = '<postgresql_user_password>'
- gitlab_rails['db_adapter'] = 'postgresql'
- gitlab_rails['db_encoding'] = 'unicode'
+ gitlab_rails['db_load_balancing'] = { 'hosts' => ['10.6.0.21', '10.6.0.22', '10.6.0.23'] } # PostgreSQL IPs
+
## Prevent database migrations from running on upgrade automatically
gitlab_rails['auto_migrate'] = false
@@ -1797,6 +1854,8 @@ On each node perform the following:
gitlab_rails['db_host'] = '10.6.0.20' # internal load balancer IP
gitlab_rails['db_port'] = 6432
gitlab_rails['db_password'] = '<postgresql_user_password>'
+ gitlab_rails['db_load_balancing'] = { 'hosts' => ['10.6.0.21', '10.6.0.22', '10.6.0.23'] } # PostgreSQL IPs
+
# Prevent database migrations from running on upgrade automatically
gitlab_rails['auto_migrate'] = false
@@ -2120,8 +2179,7 @@ cluster alongside your instance, read how to
## Configure NFS
[Object storage](#configure-the-object-storage), along with [Gitaly](#configure-gitaly)
-are recommended over NFS wherever possible for improved performance. If you intend
-to use GitLab Pages, this currently [requires NFS](troubleshooting.md#gitlab-pages-requires-nfs).
+are recommended over NFS wherever possible for improved performance.
See how to [configure NFS](../nfs.md).
@@ -2200,7 +2258,7 @@ services where applicable):
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
-1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. Google Cloud SQL and Amazon RDS are known to work, however Azure Database for PostgreSQL is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to performance issues. Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
+1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61). Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
@@ -2212,6 +2270,7 @@ For all PaaS solutions that involve configuring instances, it is strongly recomm
```plantuml
@startuml 10k
+skinparam linetype ortho
card "Kubernetes via Helm Charts" as kubernetes {
card "**External Load Balancer**" as elb #6a9be7
@@ -2221,7 +2280,6 @@ card "Kubernetes via Helm Charts" as kubernetes {
collections "**Sidekiq** x4" as sidekiq #ff8dd1
}
- card "**Prometheus + Grafana**" as monitor #7FFFD4
card "**Supporting Services**" as support
}
@@ -2249,37 +2307,33 @@ card "Database" as database {
card "redis" as redis {
collections "**Redis Persistent** x3" as redis_persistent #FF6347
collections "**Redis Cache** x3" as redis_cache #FF6347
+
+ redis_cache -[hidden]-> redis_persistent
}
cloud "**Object Storage**" as object_storage #white
elb -[#6a9be7]-> gitlab
-elb -[#6a9be7]-> monitor
+elb -[hidden]-> sidekiq
elb -[hidden]-> support
gitlab -[#32CD32]--> ilb
-gitlab -[#32CD32]-> object_storage
-gitlab -[#32CD32]---> redis
-gitlab -[hidden]--> consul
+gitlab -[#32CD32]r--> object_storage
+gitlab -[#32CD32,norank]----> redis
+gitlab -[#32CD32]----> database
sidekiq -[#ff8dd1]--> ilb
-sidekiq -[#ff8dd1]-> object_storage
-sidekiq -[#ff8dd1]---> redis
-sidekiq -[hidden]--> consul
-
-ilb -[#9370DB]-> gitaly_cluster
-ilb -[#9370DB]-> database
+sidekiq -[#ff8dd1]r--> object_storage
+sidekiq -[#ff8dd1,norank]----> redis
+sidekiq .[#ff8dd1]----> database
-consul .[#e76a9b]-> database
-consul .[#e76a9b]-> gitaly_cluster
-consul .[#e76a9b,norank]--> redis
+ilb -[#9370DB]--> gitaly_cluster
+ilb -[#9370DB]--> database
+ilb -[hidden,norank]--> redis
-monitor .[#7FFFD4]> consul
-monitor .[#7FFFD4]-> database
-monitor .[#7FFFD4]-> gitaly_cluster
-monitor .[#7FFFD4,norank]--> redis
-monitor .[#7FFFD4]> ilb
-monitor .[#7FFFD4,norank]u--> elb
+consul .[#e76a9b]--> database
+consul .[#e76a9b,norank]--> gitaly_cluster
+consul .[#e76a9b]--> redis
@enduml
```
diff --git a/doc/administration/reference_architectures/1k_users.md b/doc/administration/reference_architectures/1k_users.md
index 5488d8d33a6..ed6fbe84a48 100644
--- a/doc/administration/reference_architectures/1k_users.md
+++ b/doc/administration/reference_architectures/1k_users.md
@@ -29,13 +29,64 @@ many organizations.
| Up to 500 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` | `c5.xlarge` | `F4s v2` |
| Up to 1,000 | 8 vCPU, 7.2 GB memory | `n1-highcpu-8` | `c5.2xlarge` | `F8s v2` |
-The Google Cloud Platform (GCP) architectures were built and tested using the
+```plantuml
+@startuml 1k
+card "**Prometheus + Grafana**" as monitor #7FFFD4
+package "GitLab Single Server" as gitlab-single-server {
+together {
+ card "**GitLab Rails**" as gitlab #32CD32
+ card "**Gitaly**" as gitaly #FF8C00
+ card "**PostgreSQL**" as postgres #4EA7FF
+ card "**Redis**" as redis #FF6347
+ card "**Sidekiq**" as sidekiq #ff8dd1
+}
+card "Local Storage" as local_storage #white
+}
+
+gitlab -[#32CD32]--> gitaly
+gitlab -[#32CD32]--> postgres
+gitlab -[#32CD32]--> redis
+gitlab -[#32CD32]--> sidekiq
+gitaly -[#32CD32]--> local_storage
+postgres -[#32CD32]--> local_storage
+sidekiq -[#32CD32]--> local_storage
+gitlab -[#32CD32]--> local_storage
+
+monitor .[#7FFFD4]u-> gitlab
+monitor .[#7FFFD4]u-> sidekiq
+monitor .[#7FFFD4]-> postgres
+monitor .[#7FFFD4]-> gitaly
+monitor .[#7FFFD4,norank]--> redis
+
+@enduml
+```
+
+The diagram above shows that while GitLab can be installed on a single server, it is internally composed of multiple services. As a GitLab instance is scaled, each of these services is broken out and independently scaled according to the demands placed on it. In some cases PaaS can be leveraged for some services (for example, Cloud Object Storage for some file systems). For the sake of redundancy, some of the services become clusters of nodes storing the same data. In a horizontal configuration of GitLab, various ancillary services are required to coordinate clusters or discover resources (for example, PgBouncer for PostgreSQL connection management, Consul for Prometheus endpoint discovery).
+
+## Requirements
+
+Before starting, you should take note of the following requirements / guidance for this reference architecture.
+
+### Supported CPUs
+
+This reference architecture was built and tested on Google Cloud Platform (GCP) using the
[Intel Xeon E5 v3 (Haswell)](https://cloud.google.com/compute/docs/cpu-platforms)
CPU platform. On different hardware you may find that adjustments, either lower
or higher, are required for your CPU or node counts. For more information, see
our [Sysbench](https://github.com/akopytov/sysbench)-based
[CPU benchmarks](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Reference-Architectures/GCP-CPU-Benchmarks).
+### Supported infrastructure
+
+As general guidance, GitLab should run on most infrastructure, such as reputable cloud providers (AWS, GCP, Azure) and their services, or self-managed infrastructure (ESXi), that meets both the specs detailed above and any requirements in this section. However, this does not constitute a guarantee for every potential permutation.
+
+Be aware of the following specific call outs:
+
+- [Azure Database for PostgreSQL](https://docs.microsoft.com/en-us/azure/postgresql/#:~:text=Azure%20Database%20for%20PostgreSQL%20is,high%20availability%2C%20and%20dynamic%20scalability.) is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to known performance issues or missing features.
+- [Azure Blob Storage](https://docs.microsoft.com/en-us/azure/storage/blobs/) is recommended to be configured with [Premium accounts](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-block-blob-premium) to ensure consistent performance.
+
+### Swap
+
In addition to the stated configurations, we recommend having at least 2 GB of
swap on your server, even if you currently have enough available memory. Having
swap helps to reduce the chance of errors occurring if your available memory
diff --git a/doc/administration/reference_architectures/25k_users.md b/doc/administration/reference_architectures/25k_users.md
index 25cafbe667b..24b3350bd75 100644
--- a/doc/administration/reference_architectures/25k_users.md
+++ b/doc/administration/reference_architectures/25k_users.md
@@ -12,6 +12,7 @@ full list of reference architectures, see
> - **Supported users (approximate):** 25,000
> - **High Availability:** Yes ([Praefect](#configure-praefect-postgresql) needs a third-party PostgreSQL solution for HA)
+> - **Estimated Costs:** [GCP](https://cloud.google.com/products/calculator#id=925386e1-c01c-4c0a-8d7d-ebde1824b7b0)
> - **Cloud Native Hybrid Alternative:** [Yes](#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative)
> - **Performance tested weekly with the [GitLab Performance Tool (GPT)](https://gitlab.com/gitlab-org/quality/performance)**:
> - **Test requests per second (RPS) rates:** API: 500 RPS, Web: 50 RPS, Git (Pull): 50 RPS, Git (Push): 10 RPS
@@ -37,7 +38,7 @@ full list of reference architectures, see
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
-1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. Google Cloud SQL and Amazon RDS are known to work, however Azure Database for PostgreSQL is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to performance issues. Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
+1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work. However, Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61). Consul is primarily used for PostgreSQL high availability, so it can be ignored when using a PostgreSQL PaaS setup. However, it is also used optionally by Prometheus for Omnibus auto host discovery.
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
@@ -49,6 +50,8 @@ For all PaaS solutions that involve configuring instances, it is strongly recomm
```plantuml
@startuml 25k
+skinparam linetype ortho
+
card "**External Load Balancer**" as elb #6a9be7
card "**Internal Load Balancer**" as ilb #9370DB
@@ -73,8 +76,8 @@ card "Gitaly Cluster" as gitaly_cluster {
card "Database" as database {
collections "**PGBouncer** x3" as pgbouncer #4EA7FF
- card "**PostgreSQL** (Primary)" as postgres_primary #4EA7FF
- collections "**PostgreSQL** (Secondary) x2" as postgres_secondary #4EA7FF
+ card "**PostgreSQL** //Primary//" as postgres_primary #4EA7FF
+ collections "**PostgreSQL** //Secondary// x2" as postgres_secondary #4EA7FF
pgbouncer -[#4EA7FF]-> postgres_primary
postgres_primary .[#4EA7FF]> postgres_secondary
@@ -83,31 +86,38 @@ card "Database" as database {
card "redis" as redis {
collections "**Redis Persistent** x3" as redis_persistent #FF6347
collections "**Redis Cache** x3" as redis_cache #FF6347
+
+ redis_cache -[hidden]-> redis_persistent
}
cloud "**Object Storage**" as object_storage #white
elb -[#6a9be7]-> gitlab
-elb -[#6a9be7]--> monitor
+elb -[#6a9be7,norank]--> monitor
-gitlab -[#32CD32]--> ilb
-gitlab -[#32CD32]-> object_storage
-gitlab -[#32CD32]---> redis
+gitlab -[#32CD32,norank]--> ilb
+gitlab -[#32CD32]r-> object_storage
+gitlab -[#32CD32]----> redis
+gitlab .[#32CD32]----> database
gitlab -[hidden]-> monitor
gitlab -[hidden]-> consul
-sidekiq -[#ff8dd1]--> ilb
-sidekiq -[#ff8dd1]-> object_storage
-sidekiq -[#ff8dd1]---> redis
+sidekiq -[#ff8dd1,norank]--> ilb
+sidekiq -[#ff8dd1]r-> object_storage
+sidekiq -[#ff8dd1]----> redis
+sidekiq .[#ff8dd1]----> database
sidekiq -[hidden]-> monitor
sidekiq -[hidden]-> consul
-ilb -[#9370DB]-> gitaly_cluster
-ilb -[#9370DB]-> database
+ilb -[#9370DB]--> gitaly_cluster
+ilb -[#9370DB]--> database
+ilb -[hidden]--> redis
+ilb -[hidden]u-> consul
+ilb -[hidden]u-> monitor
consul .[#e76a9b]u-> gitlab
consul .[#e76a9b]u-> sidekiq
-consul .[#e76a9b]> monitor
+consul .[#e76a9b]r-> monitor
consul .[#e76a9b]-> database
consul .[#e76a9b]-> gitaly_cluster
consul .[#e76a9b,norank]--> redis
@@ -124,21 +134,34 @@ monitor .[#7FFFD4,norank]u--> elb
@enduml
```
-The Google Cloud Platform (GCP) architectures were built and tested using the
+## Requirements
+
+Before starting, you should take note of the following requirements / guidance for this reference architecture.
+
+### Supported CPUs
+
+This reference architecture was built and tested on Google Cloud Platform (GCP) using the
[Intel Xeon E5 v3 (Haswell)](https://cloud.google.com/compute/docs/cpu-platforms)
CPU platform. On different hardware you may find that adjustments, either lower
or higher, are required for your CPU or node counts. For more information, see
our [Sysbench](https://github.com/akopytov/sysbench)-based
[CPU benchmarks](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Reference-Architectures/GCP-CPU-Benchmarks).
-Due to better performance and availability, for data objects (such as LFS,
-uploads, or artifacts), using an [object storage service](#configure-the-object-storage)
-is recommended.
+### Supported infrastructure
+
+As general guidance, GitLab should run on most infrastructure, such as reputable cloud providers (AWS, GCP, Azure) and their services, or self-managed environments (ESXi), that meets both the specs detailed above and any requirements in this section. However, this does not constitute a guarantee for every potential permutation.
-It's also worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
+Be aware of the following specific call outs:
+
+- [Azure Database for PostgreSQL](https://docs.microsoft.com/en-us/azure/postgresql/#:~:text=Azure%20Database%20for%20PostgreSQL%20is,high%20availability%2C%20and%20dynamic%20scalability.) is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to known performance issues or missing features.
+- [Azure Blob Storage](https://docs.microsoft.com/en-us/azure/storage/blobs/) should be configured with [Premium accounts](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-block-blob-premium) to ensure consistent performance.
+
+### Praefect PostgreSQL
+
+It's worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
that to achieve full High Availability a third-party PostgreSQL database solution will be required.
We hope to offer a built in solutions for these restrictions in the future but in the meantime a non HA PostgreSQL server
-can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) & [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398)
+can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) & [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398).
## Setup components
@@ -474,14 +497,15 @@ run: node-exporter: (pid 30093) 76833s; run: log: (pid 29663) 76855s
## Configure PostgreSQL
-In this section, you'll be guided through configuring an external PostgreSQL database
-to be used with GitLab.
+In this section, you'll be guided through configuring a highly available PostgreSQL
+cluster to be used with GitLab.
### Provide your own PostgreSQL instance
If you're hosting GitLab on a cloud provider, you can optionally use a
-managed service for PostgreSQL. For example, AWS offers a managed Relational
-Database Service (RDS) that runs PostgreSQL.
+managed service for PostgreSQL.
+
+A reputable provider or solution should be used for this. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work. However, Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61).
If you use a cloud-managed service, or provide your own PostgreSQL:
@@ -491,12 +515,25 @@ If you use a cloud-managed service, or provide your own PostgreSQL:
needs privileges to create the `gitlabhq_production` database.
1. Configure the GitLab application servers with the appropriate details.
This step is covered in [Configuring the GitLab Rails application](#configure-gitlab-rails).
+1. For improved performance, configuring [Database Load Balancing](../postgresql/database_load_balancing.md)
+ with multiple read replicas is recommended.
See [Configure GitLab using an external PostgreSQL service](../postgresql/external.md) for
further configuration steps.
### Standalone PostgreSQL using Omnibus GitLab
+The recommended Omnibus GitLab configuration for a PostgreSQL cluster with
+replication and failover requires:
+
+- A minimum of three PostgreSQL nodes.
+- A minimum of three Consul server nodes.
+- A minimum of three PgBouncer nodes that track and handle primary database reads and writes.
+ - An [internal load balancer](#configure-the-internal-load-balancer) (TCP) to balance requests between the PgBouncer nodes.
+- [Database Load Balancing](../postgresql/database_load_balancing.md) enabled.
+
+ A local PgBouncer service must be configured on each PostgreSQL node. Note that this is separate from the main PgBouncer cluster that tracks the primary.
+
The following IPs will be used as an example:
- `10.6.0.21`: PostgreSQL primary
@@ -551,8 +588,8 @@ in the second step, do not supply the `EXTERNAL_URL` value.
1. On every database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
```ruby
- # Disable all components except Patroni and Consul
- roles(['patroni_role'])
+ # Disable all components except Patroni, PgBouncer and Consul
+ roles(['patroni_role', 'pgbouncer_role'])
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
@@ -597,6 +634,15 @@ in the second step, do not supply the `EXTERNAL_URL` value.
# Replace 10.6.0.0/24 with Network Address
postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24 127.0.0.1/32)
+ # Local PgBouncer service for Database Load Balancing
+ pgbouncer['databases'] = {
+ gitlabhq_production: {
+ host: "127.0.0.1",
+ user: "pgbouncer",
+ password: '<pgbouncer_password_hash>'
+ }
+ }
+
# Set the network addresses that the exporters will listen on for monitoring
node_exporter['listen_address'] = '0.0.0.0:9100'
postgres_exporter['listen_address'] = '0.0.0.0:9187'
@@ -657,9 +703,11 @@ If the 'State' column for any node doesn't say "running", check the
</a>
</div>
-## Configure PgBouncer
+### Configure PgBouncer
+
+Now that the PostgreSQL servers are all set up, let's configure PgBouncer
+for tracking and handling reads/writes to the primary database.
-Now that the PostgreSQL servers are all set up, let's configure PgBouncer.
The following IPs will be used as an example:
- `10.6.0.31`: PgBouncer 1
@@ -1222,7 +1270,14 @@ There are many third-party solutions for PostgreSQL HA. The solution selected mu
- A static IP for all connections that doesn't change on failover.
- [`LISTEN`](https://www.postgresql.org/docs/12/sql-listen.html) SQL functionality must be supported.
-Examples of the above could include [Google's Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) or [Amazon RDS](https://aws.amazon.com/rds/).
+NOTE:
+With a third-party setup, it's possible to colocate Praefect's database on the same server as
+the main [GitLab](#provide-your-own-postgresql-instance) database as a convenience unless
+you are using Geo, where separate database instances are required for handling replication correctly.
+In this setup, the specs of the main database setup shouldn't need to be changed as the impact should be
+minimal.
+
+A reputable provider or solution should be used for this. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work. However, Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61).
Once the database is set up, follow the [post configuration](#praefect-postgresql-post-configuration).
@@ -1677,8 +1732,8 @@ To configure the Sidekiq nodes, on each one:
gitlab_rails['db_host'] = '10.6.0.20' # internal load balancer IP
gitlab_rails['db_port'] = 6432
gitlab_rails['db_password'] = '<postgresql_user_password>'
- gitlab_rails['db_adapter'] = 'postgresql'
- gitlab_rails['db_encoding'] = 'unicode'
+ gitlab_rails['db_load_balancing'] = { 'hosts' => ['10.6.0.21', '10.6.0.22', '10.6.0.23'] } # PostgreSQL IPs
+
## Prevent database migrations from running on upgrade automatically
gitlab_rails['auto_migrate'] = false
@@ -1805,6 +1860,8 @@ On each node perform the following:
gitlab_rails['db_host'] = '10.6.0.20' # internal load balancer IP
gitlab_rails['db_port'] = 6432
gitlab_rails['db_password'] = '<postgresql_user_password>'
+ gitlab_rails['db_load_balancing'] = { 'hosts' => ['10.6.0.21', '10.6.0.22', '10.6.0.23'] } # PostgreSQL IPs
+
# Prevent database migrations from running on upgrade automatically
gitlab_rails['auto_migrate'] = false
@@ -2126,8 +2183,7 @@ cluster alongside your instance, read how to
## Configure NFS
[Object storage](#configure-the-object-storage), along with [Gitaly](#configure-gitaly)
-are recommended over NFS wherever possible for improved performance. If you intend
-to use GitLab Pages, this currently [requires NFS](troubleshooting.md#gitlab-pages-requires-nfs).
+are recommended over NFS wherever possible for improved performance.
See how to [configure NFS](../nfs.md).
@@ -2200,7 +2256,7 @@ services where applicable):
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
-1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. Google Cloud SQL and Amazon RDS are known to work, however Azure Database for PostgreSQL is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to performance issues. Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
+1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work. However, Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61). Consul is primarily used for PostgreSQL high availability, so it can be ignored when using a PostgreSQL PaaS setup. However, it is also used optionally by Prometheus for Omnibus auto host discovery.
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
@@ -2212,16 +2268,16 @@ For all PaaS solutions that involve configuring instances, it is strongly recomm
```plantuml
@startuml 25k
+skinparam linetype ortho
card "Kubernetes via Helm Charts" as kubernetes {
card "**External Load Balancer**" as elb #6a9be7
together {
- collections "**Webservice** x7" as gitlab #32CD32
+ collections "**Webservice** x4" as gitlab #32CD32
collections "**Sidekiq** x4" as sidekiq #ff8dd1
}
- card "**Prometheus + Grafana**" as monitor #7FFFD4
card "**Supporting Services**" as support
}
@@ -2249,37 +2305,33 @@ card "Database" as database {
card "redis" as redis {
collections "**Redis Persistent** x3" as redis_persistent #FF6347
collections "**Redis Cache** x3" as redis_cache #FF6347
+
+ redis_cache -[hidden]-> redis_persistent
}
cloud "**Object Storage**" as object_storage #white
elb -[#6a9be7]-> gitlab
-elb -[#6a9be7]-> monitor
+elb -[hidden]-> sidekiq
elb -[hidden]-> support
gitlab -[#32CD32]--> ilb
-gitlab -[#32CD32]-> object_storage
-gitlab -[#32CD32]---> redis
-gitlab -[hidden]--> consul
+gitlab -[#32CD32]r--> object_storage
+gitlab -[#32CD32,norank]----> redis
+gitlab -[#32CD32]----> database
sidekiq -[#ff8dd1]--> ilb
-sidekiq -[#ff8dd1]-> object_storage
-sidekiq -[#ff8dd1]---> redis
-sidekiq -[hidden]--> consul
-
-ilb -[#9370DB]-> gitaly_cluster
-ilb -[#9370DB]-> database
+sidekiq -[#ff8dd1]r--> object_storage
+sidekiq -[#ff8dd1,norank]----> redis
+sidekiq .[#ff8dd1]----> database
-consul .[#e76a9b]-> database
-consul .[#e76a9b]-> gitaly_cluster
-consul .[#e76a9b,norank]--> redis
+ilb -[#9370DB]--> gitaly_cluster
+ilb -[#9370DB]--> database
+ilb -[hidden,norank]--> redis
-monitor .[#7FFFD4]> consul
-monitor .[#7FFFD4]-> database
-monitor .[#7FFFD4]-> gitaly_cluster
-monitor .[#7FFFD4,norank]--> redis
-monitor .[#7FFFD4]> ilb
-monitor .[#7FFFD4,norank]u--> elb
+consul .[#e76a9b]--> database
+consul .[#e76a9b,norank]--> gitaly_cluster
+consul .[#e76a9b]--> redis
@enduml
```
diff --git a/doc/administration/reference_architectures/2k_users.md b/doc/administration/reference_architectures/2k_users.md
index e619294704f..f72c0877ddb 100644
--- a/doc/administration/reference_architectures/2k_users.md
+++ b/doc/administration/reference_architectures/2k_users.md
@@ -13,6 +13,7 @@ For a full list of reference architectures, see
> - **Supported users (approximate):** 2,000
> - **High Availability:** No. For a highly-available environment, you can
> follow a modified [3K reference architecture](3k_users.md#supported-modifications-for-lower-user-counts-ha).
+> - **Estimated Costs:** [GCP](https://cloud.google.com/products/calculator#id=84d11491-d72a-493c-a16e-650931faa658)
> - **Cloud Native Hybrid:** [Yes](#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative)
> - **Performance tested daily with the [GitLab Performance Tool (GPT)](https://gitlab.com/gitlab-org/quality/performance)**:
> - **Test requests per second (RPS) rates:** API: 40 RPS, Web: 4 RPS, Git (Pull): 4 RPS, Git (Push): 1 RPS
@@ -27,10 +28,10 @@ For a full list of reference architectures, see
| GitLab Rails | 2 | 8 vCPU, 7.2 GB memory | `n1-highcpu-8` | `c5.2xlarge` | `F8s v2` |
| Monitoring node | 1 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` | `F2s v2` |
| Object storage<sup>4</sup> | n/a | n/a | n/a | n/a | n/a |
-| NFS server (optional, not recommended) | 1 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` | `c5.xlarge` | `F4s v2` |
+| NFS server (non-Gitaly) | 1 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` | `c5.xlarge` | `F4s v2` |
<!-- markdownlint-disable MD029 -->
-1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. Google Cloud SQL and Amazon RDS are known to work, however Azure Database for PostgreSQL is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to performance issues. Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
+1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work. However, Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61). Consul is primarily used for PostgreSQL high availability, so it can be ignored when using a PostgreSQL PaaS setup. However, it is also used optionally by Prometheus for Omnibus auto host discovery.
2. Can be optionally run as reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run as reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
@@ -41,6 +42,8 @@ For all PaaS solutions that involve configuring instances, it is strongly recomm
```plantuml
@startuml 2k
+skinparam linetype ortho
+
card "**External Load Balancer**" as elb #6a9be7
collections "**GitLab Rails** x3" as gitlab #32CD32
@@ -67,17 +70,27 @@ monitor .[#7FFFD4,norank]u--> elb
@enduml
```
-The Google Cloud Platform (GCP) architectures were built and tested using the
+## Requirements
+
+Before starting, you should take note of the following requirements / guidance for this reference architecture.
+
+### Supported CPUs
+
+This reference architecture was built and tested on Google Cloud Platform (GCP) using the
[Intel Xeon E5 v3 (Haswell)](https://cloud.google.com/compute/docs/cpu-platforms)
CPU platform. On different hardware you may find that adjustments, either lower
or higher, are required for your CPU or node counts. For more information, see
our [Sysbench](https://github.com/akopytov/sysbench)-based
[CPU benchmarks](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Reference-Architectures/GCP-CPU-Benchmarks).
-Due to better performance and availability, for data objects (such as LFS,
-uploads, or artifacts), using an [object storage service](#configure-the-object-storage)
-is recommended instead of using NFS. Using an object storage service also
-doesn't require you to provision and maintain a node.
+### Supported infrastructure
+
+As general guidance, GitLab should run on most infrastructure, such as reputable cloud providers (AWS, GCP, Azure) and their services, or self-managed environments (ESXi), that meets both the specs detailed above and any requirements in this section. However, this does not constitute a guarantee for every potential permutation.
+
+Be aware of the following specific call outs:
+
+- [Azure Database for PostgreSQL](https://docs.microsoft.com/en-us/azure/postgresql/#:~:text=Azure%20Database%20for%20PostgreSQL%20is,high%20availability%2C%20and%20dynamic%20scalability.) is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to known performance issues or missing features.
+- [Azure Blob Storage](https://docs.microsoft.com/en-us/azure/storage/blobs/) should be configured with [Premium accounts](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-block-blob-premium) to ensure consistent performance.
## Setup components
@@ -100,8 +113,7 @@ To set up GitLab and its components to accommodate up to 2,000 users:
more advanced code search across your entire GitLab instance.
1. [Configure NFS](#configure-nfs-optional) (optional, and not recommended)
to have shared disk storage service as an alternative to Gitaly or object
- storage. You can skip this step if you're not using GitLab Pages (which
- requires NFS).
+ storage.
## Configure the external load balancer
@@ -232,8 +244,9 @@ to be used with GitLab.
### Provide your own PostgreSQL instance
If you're hosting GitLab on a cloud provider, you can optionally use a
-managed service for PostgreSQL. For example, AWS offers a managed relational
-database service (RDS) that runs PostgreSQL.
+managed service for PostgreSQL.
+
+A reputable provider or solution should be used for this. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work. However, Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61).
If you use a cloud-managed service, or provide your own PostgreSQL:
@@ -958,8 +971,7 @@ cluster alongside your instance, read how to
For improved performance, [object storage](#configure-the-object-storage),
along with [Gitaly](#configure-gitaly), are recommended over using NFS whenever
-possible. However, if you intend to use GitLab Pages,
-[you must use NFS](troubleshooting.md#gitlab-pages-requires-nfs).
+possible.
See how to [configure NFS](../nfs.md).
@@ -1028,7 +1040,7 @@ services where applicable):
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
-1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. Google Cloud SQL and Amazon RDS are known to work, however Azure Database for PostgreSQL is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to performance issues. Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
+1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work. However, Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61). Consul is primarily used for PostgreSQL high availability, so it can be ignored when using a PostgreSQL PaaS setup. However, it is also used optionally by Prometheus for Omnibus auto host discovery.
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
<!-- markdownlint-enable MD029 -->
@@ -1038,6 +1050,7 @@ For all PaaS solutions that involve configuring instances, it is strongly recomm
```plantuml
@startuml 2k
+skinparam linetype ortho
card "Kubernetes via Helm Charts" as kubernetes {
card "**External Load Balancer**" as elb #6a9be7
@@ -1045,10 +1058,8 @@ card "Kubernetes via Helm Charts" as kubernetes {
together {
collections "**Webservice** x3" as gitlab #32CD32
collections "**Sidekiq** x2" as sidekiq #ff8dd1
+ card "**Supporting Services**" as support
}
-
- card "**Prometheus + Grafana**" as monitor #7FFFD4
- card "**Supporting Services**" as support
}
card "**Gitaly**" as gitaly #FF8C00
@@ -1057,7 +1068,6 @@ card "**Redis**" as redis #FF6347
cloud "**Object Storage**" as object_storage #white
elb -[#6a9be7]-> gitlab
-elb -[#6a9be7]--> monitor
gitlab -[#32CD32]--> gitaly
gitlab -[#32CD32]--> postgres
@@ -1066,14 +1076,8 @@ gitlab -[#32CD32]--> redis
sidekiq -[#ff8dd1]--> gitaly
sidekiq -[#ff8dd1]-> object_storage
-sidekiq -[#ff8dd1]---> postgres
-sidekiq -[#ff8dd1]---> redis
-
-monitor .[#7FFFD4]u-> gitlab
-monitor .[#7FFFD4]-> gitaly
-monitor .[#7FFFD4]-> postgres
-monitor .[#7FFFD4,norank]--> redis
-monitor .[#7FFFD4,norank]u--> elb
+sidekiq -[#ff8dd1]--> postgres
+sidekiq -[#ff8dd1]--> redis
@enduml
```
diff --git a/doc/administration/reference_architectures/3k_users.md b/doc/administration/reference_architectures/3k_users.md
index 9332ae8d271..c788a73753b 100644
--- a/doc/administration/reference_architectures/3k_users.md
+++ b/doc/administration/reference_architectures/3k_users.md
@@ -22,6 +22,7 @@ For a full list of reference architectures, see
> - **Supported users (approximate):** 3,000
> - **High Availability:** Yes, although [Praefect](#configure-praefect-postgresql) needs a third-party PostgreSQL solution
+> - **Estimated Costs:** [GCP](https://cloud.google.com/products/calculator/#id=ac4838e6-9c40-4a36-ac43-6d1bc1843e08)
> - **Cloud Native Hybrid Alternative:** [Yes](#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative)
> - **Performance tested weekly with the [GitLab Performance Tool (GPT)](https://gitlab.com/gitlab-org/quality/performance)**:
> - **Test requests per second (RPS) rates:** API: 60 RPS, Web: 6 RPS, Git (Pull): 6 RPS, Git (Push): 1 RPS
@@ -42,11 +43,11 @@ For a full list of reference architectures, see
| GitLab Rails | 3 | 8 vCPU, 7.2 GB memory | `n1-highcpu-8` | `c5.2xlarge` | `F8s v2` |
| Monitoring node | 1 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` | `F2s v2` |
| Object storage<sup>4</sup> | n/a | n/a | n/a | n/a | n/a |
-| NFS server (optional, not recommended) | 1 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` | `c5.xlarge` | `F4s v2` |
+| NFS server (non-Gitaly) | 1 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` | `c5.xlarge` | `F4s v2` |
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
-1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. Google Cloud SQL and Amazon RDS are known to work, however Azure Database for PostgreSQL is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to performance issues. Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
+1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work. However, Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61). Consul is primarily used for PostgreSQL high availability, so it can be ignored when using a PostgreSQL PaaS setup. However, it is also used optionally by Prometheus for Omnibus auto host discovery.
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
@@ -58,6 +59,8 @@ For all PaaS solutions that involve configuring instances, it is strongly recomm
```plantuml
@startuml 3k
+skinparam linetype ortho
+
card "**External Load Balancer**" as elb #6a9be7
card "**Internal Load Balancer**" as ilb #9370DB
@@ -66,7 +69,10 @@ together {
collections "**Sidekiq** x4" as sidekiq #ff8dd1
}
-card "**Prometheus + Grafana**" as monitor #7FFFD4
+together {
+ card "**Prometheus + Grafana**" as monitor #7FFFD4
+ collections "**Consul** x3" as consul #e76a9b
+}
card "Gitaly Cluster" as gitaly_cluster {
collections "**Praefect** x3" as praefect #FF8C00
@@ -79,47 +85,45 @@ card "Gitaly Cluster" as gitaly_cluster {
card "Database" as database {
collections "**PGBouncer** x3" as pgbouncer #4EA7FF
- card "**PostgreSQL** (Primary)" as postgres_primary #4EA7FF
- collections "**PostgreSQL** (Secondary) x2" as postgres_secondary #4EA7FF
+ card "**PostgreSQL** //Primary//" as postgres_primary #4EA7FF
+ collections "**PostgreSQL** //Secondary// x2" as postgres_secondary #4EA7FF
pgbouncer -[#4EA7FF]-> postgres_primary
postgres_primary .[#4EA7FF]> postgres_secondary
}
-card "**Consul + Sentinel**" as consul_sentinel {
- collections "**Consul** x3" as consul #e76a9b
- collections "**Redis Sentinel** x3" as sentinel #e6e727
-}
-
card "Redis" as redis {
collections "**Redis** x3" as redis_nodes #FF6347
-
- redis_nodes <.[#FF6347]- sentinel
}
cloud "**Object Storage**" as object_storage #white
elb -[#6a9be7]-> gitlab
-elb -[#6a9be7]--> monitor
+elb -[#6a9be7,norank]--> monitor
-gitlab -[#32CD32]--> ilb
-gitlab -[#32CD32]-> object_storage
-gitlab -[#32CD32]---> redis
+gitlab -[#32CD32,norank]--> ilb
+gitlab -[#32CD32]r-> object_storage
+gitlab -[#32CD32]----> redis
+gitlab .[#32CD32]----> database
gitlab -[hidden]-> monitor
gitlab -[hidden]-> consul
-sidekiq -[#ff8dd1]--> ilb
-sidekiq -[#ff8dd1]-> object_storage
-sidekiq -[#ff8dd1]---> redis
+sidekiq -[#ff8dd1,norank]--> ilb
+sidekiq -[#ff8dd1]r-> object_storage
+sidekiq -[#ff8dd1]----> redis
+sidekiq .[#ff8dd1]----> database
sidekiq -[hidden]-> monitor
sidekiq -[hidden]-> consul
-ilb -[#9370DB]-> gitaly_cluster
-ilb -[#9370DB]-> database
+ilb -[#9370DB]--> gitaly_cluster
+ilb -[#9370DB]--> database
+ilb -[hidden]--> redis
+ilb -[hidden]u-> consul
+ilb -[hidden]u-> monitor
consul .[#e76a9b]u-> gitlab
consul .[#e76a9b]u-> sidekiq
-consul .[#e76a9b]> monitor
+consul .[#e76a9b]r-> monitor
consul .[#e76a9b]-> database
consul .[#e76a9b]-> gitaly_cluster
consul .[#e76a9b,norank]--> redis
@@ -136,27 +140,34 @@ monitor .[#7FFFD4,norank]u--> elb
@enduml
```
-The Google Cloud Platform (GCP) architectures were built and tested using the
+## Requirements
+
+Before starting, take note of the following requirements and guidance for this reference architecture.
+
+### Supported CPUs
+
+This reference architecture was built and tested on Google Cloud Platform (GCP) using the
[Intel Xeon E5 v3 (Haswell)](https://cloud.google.com/compute/docs/cpu-platforms)
CPU platform. On different hardware you may find that adjustments, either lower
or higher, are required for your CPU or node counts. For more information, see
our [Sysbench](https://github.com/akopytov/sysbench)-based
[CPU benchmarks](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Reference-Architectures/GCP-CPU-Benchmarks).
-Due to better performance and availability, for data objects (such as LFS,
-uploads, or artifacts), using an [object storage service](#configure-the-object-storage)
-is recommended instead of using NFS. Using an object storage service also
-doesn't require you to provision and maintain a node.
+### Supported infrastructure
+
+As general guidance, GitLab should run on most infrastructure, such as reputable cloud providers (AWS, GCP, Azure) and their services, or self-managed environments (ESXi), provided it meets both the specs detailed above and any requirements in this section. However, this does not constitute a guarantee for every potential permutation.
-[Praefect requires its own database server](../gitaly/praefect.md#postgresql),
-and a third-party PostgreSQL database solution is required to achieve full
-high availability. Although we hope to offer a built-in solution for these
-restrictions in the future, you can set up a non-HA PostgreSQL server by using
-Omnibus GitLab (which the previous specifications reflect). Refer to the
-following issues for more information:
+Be aware of the following specific callouts:
-- [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919)
-- [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398)
+- [Azure Database for PostgreSQL](https://docs.microsoft.com/en-us/azure/postgresql/#:~:text=Azure%20Database%20for%20PostgreSQL%20is,high%20availability%2C%20and%20dynamic%20scalability.) is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to known performance issues or missing features.
+- [Azure Blob Storage](https://docs.microsoft.com/en-us/azure/storage/blobs/) is recommended to be configured with [Premium accounts](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-block-blob-premium) to ensure consistent performance.
+
+### Praefect PostgreSQL
+
+At this time, [Praefect requires its own database server](../gitaly/praefect.md#postgresql), and
+achieving full high availability requires a third-party PostgreSQL database solution.
+We hope to offer a built-in solution for these restrictions in the future. In the meantime, a non-HA PostgreSQL server
+can be set up by using Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) and [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398).
## Setup components
@@ -184,8 +195,7 @@ To set up GitLab and its components to accommodate up to 3,000 users:
more advanced code search across your entire GitLab instance.
1. [Configure NFS](#configure-nfs-optional) (optional, and not recommended)
to have shared disk storage service as an alternative to Gitaly or object
- storage. You can skip this step if you're not using GitLab Pages (which
- requires NFS).
+ storage.
The servers start on the same 10.6.0.0/24 private network range, and can
connect to each other freely on these addresses.
@@ -769,14 +779,15 @@ run: sentinel: (pid 30098) 76832s; run: log: (pid 29704) 76850s
## Configure PostgreSQL
-In this section, you'll be guided through configuring an external PostgreSQL database
-to be used with GitLab.
+In this section, you'll be guided through configuring a highly available PostgreSQL
+cluster to be used with GitLab.
### Provide your own PostgreSQL instance
If you're hosting GitLab on a cloud provider, you can optionally use a
-managed service for PostgreSQL. For example, AWS offers a managed Relational
-Database Service (RDS) that runs PostgreSQL.
+managed service for PostgreSQL.
+
+A reputable provider or solution should be used for this. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61).
If you use a cloud-managed service, or provide your own PostgreSQL:
@@ -786,12 +797,25 @@ If you use a cloud-managed service, or provide your own PostgreSQL:
needs privileges to create the `gitlabhq_production` database.
1. Configure the GitLab application servers with the appropriate details.
This step is covered in [Configuring the GitLab Rails application](#configure-gitlab-rails).
+1. For improved performance, configuring [Database Load Balancing](../postgresql/database_load_balancing.md)
+ with multiple read replicas is recommended.
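As a rough illustration of the Database Load Balancing step, the sketch below shows the shape of the setting as it appears in the Rails and Sidekiq configuration on this page. The replica addresses and the `gitlab_rails = {}` stand-in are assumptions made only so the fragment runs on its own. In a real `/etc/gitlab/gitlab.rb`, Omnibus GitLab already provides `gitlab_rails`, and the hosts would be your provider's read replicas.

```ruby
# Stand-in for the hash Omnibus GitLab provides inside /etc/gitlab/gitlab.rb.
# Assumption for this sketch only: do NOT add this line to a real gitlab.rb.
gitlab_rails = {}

# Hypothetical read replica addresses; replace with your own replica hosts.
replica_hosts = ['10.6.0.31', '10.6.0.32', '10.6.0.33']

# Connection used for writes (the primary, typically behind a load balancer).
gitlab_rails['db_host'] = '10.6.0.40'
gitlab_rails['db_port'] = 6432

# Hosts that read-only queries can be balanced across.
gitlab_rails['db_load_balancing'] = { 'hosts' => replica_hosts }

puts gitlab_rails['db_load_balancing'].inspect
```

Keeping the replica list in one variable makes it easier to reuse the same hosts on every Rails and Sidekiq node.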
See [Configure GitLab using an external PostgreSQL service](../postgresql/external.md) for
further configuration steps.
### Standalone PostgreSQL using Omnibus GitLab
+The recommended Omnibus GitLab configuration for a PostgreSQL cluster with
+replication and failover requires:
+
+- A minimum of three PostgreSQL nodes.
+- A minimum of three Consul server nodes.
+- A minimum of three PgBouncer nodes that track and handle primary database reads and writes.
+ - An [internal load balancer](#configure-the-internal-load-balancer) (TCP) to balance requests between the PgBouncer nodes.
+- [Database Load Balancing](../postgresql/database_load_balancing.md) enabled.
+
+ A local PgBouncer service must be configured on each PostgreSQL node. This is separate from the main PgBouncer cluster that tracks the primary.
+
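The minimum node counts listed above can be checked against a planned inventory before provisioning. The following is a minimal sketch and not part of GitLab or Omnibus. The `MINIMUM_NODES` table, the `undersized_roles` helper, and the Consul addresses are hypothetical, while the PostgreSQL and PgBouncer addresses follow the example IPs used on this page.

```ruby
# Minimum node counts taken from the requirements list above.
MINIMUM_NODES = {
  postgresql: 3,
  consul: 3,
  pgbouncer: 3
}.freeze

# Hypothetical inventory. The Consul addresses are placeholders.
inventory = {
  postgresql: %w[10.6.0.31 10.6.0.32 10.6.0.33],
  consul: %w[10.6.0.11 10.6.0.12 10.6.0.13],
  pgbouncer: %w[10.6.0.21 10.6.0.22 10.6.0.23]
}

# Returns the roles that have fewer nodes than the required minimum.
def undersized_roles(inventory, minimums)
  minimums.select { |role, min| inventory.fetch(role, []).length < min }.keys
end

missing = undersized_roles(inventory, MINIMUM_NODES)
if missing.empty?
  puts 'Topology meets the minimum node counts.'
else
  puts "Too few nodes for: #{missing.join(', ')}"
end
```

A check like this only counts nodes. It does not verify that the internal load balancer or Database Load Balancing is configured.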
The following IPs will be used as an example:
- `10.6.0.31`: PostgreSQL primary
@@ -846,8 +870,8 @@ in the second step, do not supply the `EXTERNAL_URL` value.
1. On every database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
```ruby
- # Disable all components except Patroni and Consul
- roles(['patroni_role'])
+ # Disable all components except Patroni, PgBouncer and Consul
+ roles(['patroni_role', 'pgbouncer_role'])
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
@@ -892,6 +916,15 @@ in the second step, do not supply the `EXTERNAL_URL` value.
# Replace 10.6.0.0/24 with Network Address
postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24 127.0.0.1/32)
+ # Local PgBouncer service for Database Load Balancing
+ pgbouncer['databases'] = {
+ gitlabhq_production: {
+ host: "127.0.0.1",
+ user: "pgbouncer",
+ password: '<pgbouncer_password_hash>'
+ }
+ }
+
# Set the network addresses that the exporters will listen on for monitoring
node_exporter['listen_address'] = '0.0.0.0:9100'
postgres_exporter['listen_address'] = '0.0.0.0:9187'
@@ -952,9 +985,11 @@ If the 'State' column for any node doesn't say "running", check the
</a>
</div>
-## Configure PgBouncer
+### Configure PgBouncer
+
+Now that the PostgreSQL servers are all set up, let's configure PgBouncer
+for tracking and handling reads/writes to the primary database.
-Now that the PostgreSQL servers are all set up, let's configure PgBouncer.
The following IPs will be used as an example:
- `10.6.0.21`: PgBouncer 1
@@ -1175,7 +1210,14 @@ There are many third-party solutions for PostgreSQL HA. The solution selected mu
- A static IP for all connections that doesn't change on failover.
- [`LISTEN`](https://www.postgresql.org/docs/12/sql-listen.html) SQL functionality must be supported.
-Examples of the above could include [Google's Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) or [Amazon RDS](https://aws.amazon.com/rds/).
+NOTE:
+With a third-party setup, it's possible to colocate Praefect's database on the same server as
+the main [GitLab](#provide-your-own-postgresql-instance) database as a convenience unless
+you are using Geo, where separate database instances are required for handling replication correctly.
+In this setup, the specs of the main database setup shouldn't need to be changed as the impact should be
+minimal.
+
+A reputable provider or solution should be used for this. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61).
Once the database is set up, follow the [post configuration](#praefect-postgresql-post-configuration).
@@ -1613,8 +1655,8 @@ To configure the Sidekiq nodes, one each one:
gitlab_rails['db_host'] = '10.6.0.40' # internal load balancer IP
gitlab_rails['db_port'] = 6432
gitlab_rails['db_password'] = '<postgresql_user_password>'
- gitlab_rails['db_adapter'] = 'postgresql'
- gitlab_rails['db_encoding'] = 'unicode'
+ gitlab_rails['db_load_balancing'] = { 'hosts' => ['10.6.0.31', '10.6.0.32', '10.6.0.33'] } # PostgreSQL IPs
+
## Prevent database migrations from running on upgrade automatically
gitlab_rails['auto_migrate'] = false
@@ -1773,6 +1815,8 @@ On each node perform the following:
gitlab_rails['db_host'] = '10.6.0.20' # internal load balancer IP
gitlab_rails['db_port'] = 6432
gitlab_rails['db_password'] = '<postgresql_user_password>'
+ gitlab_rails['db_load_balancing'] = { 'hosts' => ['10.6.0.31', '10.6.0.32', '10.6.0.33'] } # PostgreSQL IPs
+
# Prevent database migrations from running on upgrade automatically
gitlab_rails['auto_migrate'] = false
@@ -2074,8 +2118,7 @@ cluster alongside your instance, read how to
## Configure NFS (optional)
[Object storage](#configure-the-object-storage), along with [Gitaly](#configure-gitaly)
-are recommended over NFS wherever possible for improved performance. If you intend
-to use GitLab Pages, this currently [requires NFS](troubleshooting.md#gitlab-pages-requires-nfs).
+are recommended over NFS wherever possible for improved performance.
See how to [configure NFS](../nfs.md).
@@ -2102,7 +2145,7 @@ but with smaller performance requirements, several modifications can be consider
- Lowering node specs: Depending on your user count, you can lower all suggested node specs as desired. However, it's recommended that you don't go lower than the [general requirements](../../install/requirements.md).
- Combining select nodes: Some nodes can be combined to reduce complexity at the cost of some performance:
- GitLab Rails and Sidekiq: Sidekiq nodes can be removed and the component instead enabled on the GitLab Rails nodes.
- - PostgreSQL and PgBouncer: PgBouncer nodes can be removed and the component instead enabled on PostgreSQL with the Internal Load Balancer pointing to them instead.
+ - PostgreSQL and PgBouncer: PgBouncer nodes could be removed and instead be enabled on PostgreSQL nodes with the Internal Load Balancer pointing to them. However, to enable [Database Load Balancing](../postgresql/database_load_balancing.md), a separate PgBouncer array is still required.
- Reducing the node counts: Some node types do not need consensus and can run with fewer nodes (but more than one for redundancy). This will also lead to reduced performance.
- GitLab Rails and Sidekiq: Stateless services don't have a minimum node count. Two are enough for redundancy.
- Gitaly and Praefect: A quorum is not strictly necessary. Two Gitaly nodes and two Praefect nodes are enough for redundancy.
@@ -2171,7 +2214,7 @@ services where applicable):
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
-1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. Google Cloud SQL and Amazon RDS are known to work, however Azure Database for PostgreSQL is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to performance issues. Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
+1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61). Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
@@ -2183,25 +2226,21 @@ For all PaaS solutions that involve configuring instances, it is strongly recomm
```plantuml
@startuml 3k
+skinparam linetype ortho
card "Kubernetes via Helm Charts" as kubernetes {
card "**External Load Balancer**" as elb #6a9be7
together {
- collections "**Webservice** x2" as gitlab #32CD32
- collections "**Sidekiq** x3" as sidekiq #ff8dd1
+ collections "**Webservice** x4" as gitlab #32CD32
+ collections "**Sidekiq** x4" as sidekiq #ff8dd1
}
- card "**Prometheus + Grafana**" as monitor #7FFFD4
card "**Supporting Services**" as support
}
card "**Internal Load Balancer**" as ilb #9370DB
-
-card "**Consul + Sentinel**" as consul_sentinel {
- collections "**Consul** x3" as consul #e76a9b
- collections "**Redis Sentinel** x3" as sentinel #e6e727
-}
+collections "**Consul** x3" as consul #e76a9b
card "Gitaly Cluster" as gitaly_cluster {
collections "**Praefect** x3" as praefect #FF8C00
@@ -2221,41 +2260,33 @@ card "Database" as database {
postgres_primary .[#4EA7FF]> postgres_secondary
}
-card "Redis" as redis {
+card "redis" as redis {
collections "**Redis** x3" as redis_nodes #FF6347
-
- redis_nodes <.[#FF6347]- sentinel
}
cloud "**Object Storage**" as object_storage #white
elb -[#6a9be7]-> gitlab
-elb -[#6a9be7]-> monitor
+elb -[hidden]-> sidekiq
elb -[hidden]-> support
gitlab -[#32CD32]--> ilb
-gitlab -[#32CD32]-> object_storage
-gitlab -[#32CD32]---> redis
-gitlab -[hidden]--> consul
+gitlab -[#32CD32]r--> object_storage
+gitlab -[#32CD32,norank]----> redis
+gitlab -[#32CD32]----> database
sidekiq -[#ff8dd1]--> ilb
-sidekiq -[#ff8dd1]-> object_storage
-sidekiq -[#ff8dd1]---> redis
-sidekiq -[hidden]--> consul
-
-ilb -[#9370DB]-> gitaly_cluster
-ilb -[#9370DB]-> database
+sidekiq -[#ff8dd1]r--> object_storage
+sidekiq -[#ff8dd1,norank]----> redis
+sidekiq .[#ff8dd1]----> database
-consul .[#e76a9b]-> database
-consul .[#e76a9b]-> gitaly_cluster
-consul .[#e76a9b,norank]--> redis
+ilb -[#9370DB]--> gitaly_cluster
+ilb -[#9370DB]--> database
+ilb -[hidden,norank]--> redis
-monitor .[#7FFFD4]> consul
-monitor .[#7FFFD4]-> database
-monitor .[#7FFFD4]-> gitaly_cluster
-monitor .[#7FFFD4,norank]--> redis
-monitor .[#7FFFD4]> ilb
-monitor .[#7FFFD4,norank]u--> elb
+consul .[#e76a9b]--> database
+consul .[#e76a9b,norank]--> gitaly_cluster
+consul .[#e76a9b]--> redis
@enduml
```
diff --git a/doc/administration/reference_architectures/50k_users.md b/doc/administration/reference_architectures/50k_users.md
index bbdf798d9ad..4f576fc1c19 100644
--- a/doc/administration/reference_architectures/50k_users.md
+++ b/doc/administration/reference_architectures/50k_users.md
@@ -12,6 +12,7 @@ full list of reference architectures, see
> - **Supported users (approximate):** 50,000
> - **High Availability:** Yes ([Praefect](#configure-praefect-postgresql) needs a third-party PostgreSQL solution for HA)
+> - **Estimated Costs:** [GCP](https://cloud.google.com/products/calculator/#id=8006396b-88ee-40cd-a1c8-77cdefa4d3c8)
> - **Cloud Native Hybrid Alternative:** [Yes](#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative)
> - **Performance tested weekly with the [GitLab Performance Tool (GPT)](https://gitlab.com/gitlab-org/quality/performance)**:
> - **Test requests per second (RPS) rates:** API: 1000 RPS, Web: 100 RPS, Git (Pull): 100 RPS, Git (Push): 20 RPS
@@ -37,7 +38,7 @@ full list of reference architectures, see
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
-1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. Google Cloud SQL and Amazon RDS are known to work, however Azure Database for PostgreSQL is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to performance issues. Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
+1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61). Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
@@ -49,6 +50,8 @@ For all PaaS solutions that involve configuring instances, it is strongly recomm
```plantuml
@startuml 50k
+skinparam linetype ortho
+
card "**External Load Balancer**" as elb #6a9be7
card "**Internal Load Balancer**" as ilb #9370DB
@@ -73,8 +76,8 @@ card "Gitaly Cluster" as gitaly_cluster {
card "Database" as database {
collections "**PGBouncer** x3" as pgbouncer #4EA7FF
- card "**PostgreSQL** (Primary)" as postgres_primary #4EA7FF
- collections "**PostgreSQL** (Secondary) x2" as postgres_secondary #4EA7FF
+ card "**PostgreSQL** //Primary//" as postgres_primary #4EA7FF
+ collections "**PostgreSQL** //Secondary// x2" as postgres_secondary #4EA7FF
pgbouncer -[#4EA7FF]-> postgres_primary
postgres_primary .[#4EA7FF]> postgres_secondary
@@ -83,31 +86,38 @@ card "Database" as database {
card "redis" as redis {
collections "**Redis Persistent** x3" as redis_persistent #FF6347
collections "**Redis Cache** x3" as redis_cache #FF6347
+
+ redis_cache -[hidden]-> redis_persistent
}
cloud "**Object Storage**" as object_storage #white
elb -[#6a9be7]-> gitlab
-elb -[#6a9be7]--> monitor
+elb -[#6a9be7,norank]--> monitor
-gitlab -[#32CD32]--> ilb
-gitlab -[#32CD32]-> object_storage
-gitlab -[#32CD32]---> redis
+gitlab -[#32CD32,norank]--> ilb
+gitlab -[#32CD32]r-> object_storage
+gitlab -[#32CD32]----> redis
+gitlab .[#32CD32]----> database
gitlab -[hidden]-> monitor
gitlab -[hidden]-> consul
-sidekiq -[#ff8dd1]--> ilb
-sidekiq -[#ff8dd1]-> object_storage
-sidekiq -[#ff8dd1]---> redis
+sidekiq -[#ff8dd1,norank]--> ilb
+sidekiq -[#ff8dd1]r-> object_storage
+sidekiq -[#ff8dd1]----> redis
+sidekiq .[#ff8dd1]----> database
sidekiq -[hidden]-> monitor
sidekiq -[hidden]-> consul
-ilb -[#9370DB]-> gitaly_cluster
-ilb -[#9370DB]-> database
+ilb -[#9370DB]--> gitaly_cluster
+ilb -[#9370DB]--> database
+ilb -[hidden]--> redis
+ilb -[hidden]u-> consul
+ilb -[hidden]u-> monitor
consul .[#e76a9b]u-> gitlab
consul .[#e76a9b]u-> sidekiq
-consul .[#e76a9b]> monitor
+consul .[#e76a9b]r-> monitor
consul .[#e76a9b]-> database
consul .[#e76a9b]-> gitaly_cluster
consul .[#e76a9b,norank]--> redis
@@ -124,21 +134,34 @@ monitor .[#7FFFD4,norank]u--> elb
@enduml
```
-The Google Cloud Platform (GCP) architectures were built and tested using the
+## Requirements
+
+Before starting, take note of the following requirements and guidance for this reference architecture.
+
+### Supported CPUs
+
+This reference architecture was built and tested on Google Cloud Platform (GCP) using the
[Intel Xeon E5 v3 (Haswell)](https://cloud.google.com/compute/docs/cpu-platforms)
CPU platform. On different hardware you may find that adjustments, either lower
or higher, are required for your CPU or node counts. For more information, see
our [Sysbench](https://github.com/akopytov/sysbench)-based
[CPU benchmarks](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Reference-Architectures/GCP-CPU-Benchmarks).
-Due to better performance and availability, for data objects (such as LFS,
-uploads, or artifacts), using an [object storage service](#configure-the-object-storage)
-is recommended.
+### Supported infrastructure
+
+As general guidance, GitLab should run on most infrastructure, such as reputable cloud providers (AWS, GCP, Azure) and their services, or self-managed environments (ESXi), provided it meets both the specs detailed above and any requirements in this section. However, this does not constitute a guarantee for every potential permutation.
+
+Be aware of the following specific callouts:
-It's also worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
+- [Azure Database for PostgreSQL](https://docs.microsoft.com/en-us/azure/postgresql/#:~:text=Azure%20Database%20for%20PostgreSQL%20is,high%20availability%2C%20and%20dynamic%20scalability.) is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to known performance issues or missing features.
+- [Azure Blob Storage](https://docs.microsoft.com/en-us/azure/storage/blobs/) is recommended to be configured with [Premium accounts](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-block-blob-premium) to ensure consistent performance.
+
+### Praefect PostgreSQL
+
+It's worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
that to achieve full High Availability a third-party PostgreSQL database solution will be required.
We hope to offer a built in solutions for these restrictions in the future but in the meantime a non HA PostgreSQL server
-can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) & [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398)
+can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) & [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398).
## Setup components
@@ -480,14 +503,15 @@ run: node-exporter: (pid 30093) 76833s; run: log: (pid 29663) 76855s
## Configure PostgreSQL
-In this section, you'll be guided through configuring an external PostgreSQL database
-to be used with GitLab.
+In this section, you'll be guided through configuring a highly available PostgreSQL
+cluster to be used with GitLab.
### Provide your own PostgreSQL instance
If you're hosting GitLab on a cloud provider, you can optionally use a
-managed service for PostgreSQL. For example, AWS offers a managed Relational
-Database Service (RDS) that runs PostgreSQL.
+managed service for PostgreSQL.
+
+A reputable provider or solution should be used for this. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61).
If you use a cloud-managed service, or provide your own PostgreSQL:
@@ -497,12 +521,25 @@ If you use a cloud-managed service, or provide your own PostgreSQL:
needs privileges to create the `gitlabhq_production` database.
1. Configure the GitLab application servers with the appropriate details.
This step is covered in [Configuring the GitLab Rails application](#configure-gitlab-rails).
+1. For improved performance, configuring [Database Load Balancing](../postgresql/database_load_balancing.md)
+ with multiple read replicas is recommended.
See [Configure GitLab using an external PostgreSQL service](../postgresql/external.md) for
further configuration steps.
### Standalone PostgreSQL using Omnibus GitLab
+The recommended Omnibus GitLab configuration for a PostgreSQL cluster with
+replication and failover requires:
+
+- A minimum of three PostgreSQL nodes.
+- A minimum of three Consul server nodes.
+- A minimum of three PgBouncer nodes that track and handle primary database reads and writes.
+ - An [internal load balancer](#configure-the-internal-load-balancer) (TCP) to balance requests between the PgBouncer nodes.
+- [Database Load Balancing](../postgresql/database_load_balancing.md) enabled.
+
+ A local PgBouncer service must be configured on each PostgreSQL node. This is separate from the main PgBouncer cluster that tracks the primary.
+
The following IPs will be used as an example:
- `10.6.0.21`: PostgreSQL primary
@@ -557,8 +594,8 @@ in the second step, do not supply the `EXTERNAL_URL` value.
1. On every database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
```ruby
- # Disable all components except Patroni and Consul
- roles(['patroni_role'])
+ # Disable all components except Patroni, PgBouncer and Consul
+ roles(['patroni_role', 'pgbouncer_role'])
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
@@ -604,6 +641,15 @@ in the second step, do not supply the `EXTERNAL_URL` value.
# Replace 10.6.0.0/24 with Network Address
postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24 127.0.0.1/32)
+ # Local PgBouncer service for Database Load Balancing
+ pgbouncer['databases'] = {
+ gitlabhq_production: {
+ host: "127.0.0.1",
+ user: "pgbouncer",
+ password: '<pgbouncer_password_hash>'
+ }
+ }
+
# Set the network addresses that the exporters will listen on for monitoring
node_exporter['listen_address'] = '0.0.0.0:9100'
postgres_exporter['listen_address'] = '0.0.0.0:9187'
@@ -664,9 +710,11 @@ If the 'State' column for any node doesn't say "running", check the
</a>
</div>
-## Configure PgBouncer
+### Configure PgBouncer
+
+Now that the PostgreSQL servers are all set up, let's configure PgBouncer
+for tracking and handling reads/writes to the primary database.
-Now that the PostgreSQL servers are all set up, let's configure PgBouncer.
The following IPs will be used as an example:
- `10.6.0.31`: PgBouncer 1
@@ -891,7 +939,7 @@ a node and change its status from primary to replica (and vice versa).
package of your choice. Be sure to both follow _only_ installation steps 1 and 2
on the page, and to select the correct Omnibus GitLab package, with the same version
and type (Community or Enterprise editions) as your current install.
-1. Edit `/etc/gitlab/gitlab.rb` and add the same contents as the priimary node in the previous section by replacing `redis_master_node` with `redis_replica_node`:
+1. Edit `/etc/gitlab/gitlab.rb` and add the same contents as the primary node in the previous section by replacing `redis_master_node` with `redis_replica_node`:
```ruby
# Specify server role as 'redis_replica_role' with Sentinel and enable Consul agent
@@ -1229,6 +1277,15 @@ There are many third-party solutions for PostgreSQL HA. The solution selected mu
- A static IP for all connections that doesn't change on failover.
- [`LISTEN`](https://www.postgresql.org/docs/12/sql-listen.html) SQL functionality must be supported.
+NOTE:
+With a third-party setup, it's possible to colocate Praefect's database on the same server as
+the main [GitLab](#provide-your-own-postgresql-instance) database as a convenience unless
+you are using Geo, where separate database instances are required for handling replication correctly.
+In this setup, the specs of the main database setup shouldn't need to be changed as the impact should be
+minimal.
+
+A reputable provider or solution should be used for this. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61).
+
Examples of the above could include [Google's Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) or [Amazon RDS](https://aws.amazon.com/rds/).
Once the database is set up, follow the [post configuration](#praefect-postgresql-post-configuration).
@@ -1684,8 +1741,8 @@ To configure the Sidekiq nodes, on each one:
gitlab_rails['db_host'] = '10.6.0.20' # internal load balancer IP
gitlab_rails['db_port'] = 6432
gitlab_rails['db_password'] = '<postgresql_user_password>'
- gitlab_rails['db_adapter'] = 'postgresql'
- gitlab_rails['db_encoding'] = 'unicode'
+ gitlab_rails['db_load_balancing'] = { 'hosts' => ['10.6.0.21', '10.6.0.22', '10.6.0.23'] } # PostgreSQL IPs
+
## Prevent database migrations from running on upgrade automatically
gitlab_rails['auto_migrate'] = false
@@ -1819,6 +1876,8 @@ On each node perform the following:
gitlab_rails['db_host'] = '10.6.0.20' # internal load balancer IP
gitlab_rails['db_port'] = 6432
gitlab_rails['db_password'] = '<postgresql_user_password>'
+ gitlab_rails['db_load_balancing'] = { 'hosts' => ['10.6.0.21', '10.6.0.22', '10.6.0.23'] } # PostgreSQL IPs
+
# Prevent database migrations from running on upgrade automatically
gitlab_rails['auto_migrate'] = false
@@ -2140,8 +2199,7 @@ cluster alongside your instance, read how to
## Configure NFS
[Object storage](#configure-the-object-storage), along with [Gitaly](#configure-gitaly)
-are recommended over NFS wherever possible for improved performance. If you intend
-to use GitLab Pages, this currently [requires NFS](troubleshooting.md#gitlab-pages-requires-nfs).
+are recommended over NFS wherever possible for improved performance.
See how to [configure NFS](../nfs.md).
@@ -2214,7 +2272,7 @@ services where applicable):
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
-1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. Google Cloud SQL and Amazon RDS are known to work, however Azure Database for PostgreSQL is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to performance issues. Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
+1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61). Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
@@ -2226,16 +2284,16 @@ For all PaaS solutions that involve configuring instances, it is strongly recomm
```plantuml
@startuml 50k
+skinparam linetype ortho
card "Kubernetes via Helm Charts" as kubernetes {
card "**External Load Balancer**" as elb #6a9be7
together {
- collections "**Webservice** x16" as gitlab #32CD32
+ collections "**Webservice** x4" as gitlab #32CD32
collections "**Sidekiq** x4" as sidekiq #ff8dd1
}
- card "**Prometheus + Grafana**" as monitor #7FFFD4
card "**Supporting Services**" as support
}
@@ -2263,37 +2321,33 @@ card "Database" as database {
card "redis" as redis {
collections "**Redis Persistent** x3" as redis_persistent #FF6347
collections "**Redis Cache** x3" as redis_cache #FF6347
+
+ redis_cache -[hidden]-> redis_persistent
}
cloud "**Object Storage**" as object_storage #white
elb -[#6a9be7]-> gitlab
-elb -[#6a9be7]-> monitor
+elb -[hidden]-> sidekiq
elb -[hidden]-> support
gitlab -[#32CD32]--> ilb
-gitlab -[#32CD32]-> object_storage
-gitlab -[#32CD32]---> redis
-gitlab -[hidden]--> consul
+gitlab -[#32CD32]r--> object_storage
+gitlab -[#32CD32,norank]----> redis
+gitlab -[#32CD32]----> database
sidekiq -[#ff8dd1]--> ilb
-sidekiq -[#ff8dd1]-> object_storage
-sidekiq -[#ff8dd1]---> redis
-sidekiq -[hidden]--> consul
-
-ilb -[#9370DB]-> gitaly_cluster
-ilb -[#9370DB]-> database
+sidekiq -[#ff8dd1]r--> object_storage
+sidekiq -[#ff8dd1,norank]----> redis
+sidekiq .[#ff8dd1]----> database
-consul .[#e76a9b]-> database
-consul .[#e76a9b]-> gitaly_cluster
-consul .[#e76a9b,norank]--> redis
+ilb -[#9370DB]--> gitaly_cluster
+ilb -[#9370DB]--> database
+ilb -[hidden,norank]--> redis
-monitor .[#7FFFD4]> consul
-monitor .[#7FFFD4]-> database
-monitor .[#7FFFD4]-> gitaly_cluster
-monitor .[#7FFFD4,norank]--> redis
-monitor .[#7FFFD4]> ilb
-monitor .[#7FFFD4,norank]u--> elb
+consul .[#e76a9b]--> database
+consul .[#e76a9b,norank]--> gitaly_cluster
+consul .[#e76a9b]--> redis
@enduml
```
diff --git a/doc/administration/reference_architectures/5k_users.md b/doc/administration/reference_architectures/5k_users.md
index a1921f50e4e..92950806cfb 100644
--- a/doc/administration/reference_architectures/5k_users.md
+++ b/doc/administration/reference_architectures/5k_users.md
@@ -19,6 +19,7 @@ costly-to-operate environment by using the
> - **Supported users (approximate):** 5,000
> - **High Availability:** Yes ([Praefect](#configure-praefect-postgresql) needs a third-party PostgreSQL solution for HA)
+> - **Estimated Costs:** [GCP](https://cloud.google.com/products/calculator/#id=8742e8ea-c08f-4e0a-b058-02f3a1c38a2f)
> - **Cloud Native Hybrid Alternative:** [Yes](#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative)
> - **Performance tested weekly with the [GitLab Performance Tool (GPT)](https://gitlab.com/gitlab-org/quality/performance)**:
> - **Test requests per second (RPS) rates:** API: 100 RPS, Web: 10 RPS, Git (Pull): 10 RPS, Git (Push): 2 RPS
@@ -39,11 +40,11 @@ costly-to-operate environment by using the
| GitLab Rails | 3 | 16 vCPU, 14.4 GB memory | `n1-highcpu-16` | `c5.4xlarge` | `F16s v2`|
| Monitoring node | 1 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` | `F2s v2` |
| Object storage<sup>4</sup> | n/a | n/a | n/a | n/a | n/a |
-| NFS server (optional, not recommended) | 1 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` | `c5.xlarge` | `F4s v2` |
+| NFS server (non-Gitaly) | 1 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` | `c5.xlarge` | `F4s v2` |
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
-1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. Google Cloud SQL and AWS RDS are known to work, however Azure Database for PostgreSQL is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to performance issues. Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
+1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61). Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
@@ -55,6 +56,8 @@ For all PaaS solutions that involve configuring instances, it is strongly recomm
```plantuml
@startuml 5k
+skinparam linetype ortho
+
card "**External Load Balancer**" as elb #6a9be7
card "**Internal Load Balancer**" as ilb #9370DB
@@ -63,7 +66,10 @@ together {
collections "**Sidekiq** x4" as sidekiq #ff8dd1
}
-card "**Prometheus + Grafana**" as monitor #7FFFD4
+together {
+ card "**Prometheus + Grafana**" as monitor #7FFFD4
+ collections "**Consul** x3" as consul #e76a9b
+}
card "Gitaly Cluster" as gitaly_cluster {
collections "**Praefect** x3" as praefect #FF8C00
@@ -76,47 +82,45 @@ card "Gitaly Cluster" as gitaly_cluster {
card "Database" as database {
collections "**PGBouncer** x3" as pgbouncer #4EA7FF
- card "**PostgreSQL** (Primary)" as postgres_primary #4EA7FF
- collections "**PostgreSQL** (Secondary) x2" as postgres_secondary #4EA7FF
+ card "**PostgreSQL** //Primary//" as postgres_primary #4EA7FF
+ collections "**PostgreSQL** //Secondary// x2" as postgres_secondary #4EA7FF
pgbouncer -[#4EA7FF]-> postgres_primary
postgres_primary .[#4EA7FF]> postgres_secondary
}
-card "**Consul + Sentinel**" as consul_sentinel {
- collections "**Consul** x3" as consul #e76a9b
- collections "**Redis Sentinel** x3" as sentinel #e6e727
-}
-
card "Redis" as redis {
collections "**Redis** x3" as redis_nodes #FF6347
-
- redis_nodes <.[#FF6347]- sentinel
}
cloud "**Object Storage**" as object_storage #white
elb -[#6a9be7]-> gitlab
-elb -[#6a9be7]--> monitor
+elb -[#6a9be7,norank]--> monitor
-gitlab -[#32CD32]--> ilb
-gitlab -[#32CD32]-> object_storage
-gitlab -[#32CD32]---> redis
+gitlab -[#32CD32,norank]--> ilb
+gitlab -[#32CD32]r-> object_storage
+gitlab -[#32CD32]----> redis
+gitlab .[#32CD32]----> database
gitlab -[hidden]-> monitor
gitlab -[hidden]-> consul
-sidekiq -[#ff8dd1]--> ilb
-sidekiq -[#ff8dd1]-> object_storage
-sidekiq -[#ff8dd1]---> redis
+sidekiq -[#ff8dd1,norank]--> ilb
+sidekiq -[#ff8dd1]r-> object_storage
+sidekiq -[#ff8dd1]----> redis
+sidekiq .[#ff8dd1]----> database
sidekiq -[hidden]-> monitor
sidekiq -[hidden]-> consul
-ilb -[#9370DB]-> gitaly_cluster
-ilb -[#9370DB]-> database
+ilb -[#9370DB]--> gitaly_cluster
+ilb -[#9370DB]--> database
+ilb -[hidden]--> redis
+ilb -[hidden]u-> consul
+ilb -[hidden]u-> monitor
consul .[#e76a9b]u-> gitlab
consul .[#e76a9b]u-> sidekiq
-consul .[#e76a9b]> monitor
+consul .[#e76a9b]r-> monitor
consul .[#e76a9b]-> database
consul .[#e76a9b]-> gitaly_cluster
consul .[#e76a9b,norank]--> redis
@@ -133,22 +137,34 @@ monitor .[#7FFFD4,norank]u--> elb
@enduml
```
-The Google Cloud Platform (GCP) architectures were built and tested using the
+## Requirements
+
+Before starting, you should take note of the following requirements / guidance for this reference architecture.
+
+### Supported CPUs
+
+This reference architecture was built and tested on Google Cloud Platform (GCP) using the
[Intel Xeon E5 v3 (Haswell)](https://cloud.google.com/compute/docs/cpu-platforms)
CPU platform. On different hardware you may find that adjustments, either lower
or higher, are required for your CPU or node counts. For more information, see
our [Sysbench](https://github.com/akopytov/sysbench)-based
[CPU benchmarks](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Reference-Architectures/GCP-CPU-Benchmarks).
-Due to better performance and availability, for data objects (such as LFS,
-uploads, or artifacts), using an [object storage service](#configure-the-object-storage)
-is recommended instead of using NFS. Using an object storage service also
-doesn't require you to provision and maintain a node.
+### Supported infrastructure
+
+As general guidance, GitLab should run on most infrastructure, such as reputable cloud providers (AWS, GCP, Azure) and their services, or self-managed environments (ESXi), provided it meets both the specs detailed above and any requirements in this section. However, this does not constitute a guarantee for every potential permutation.
+
+Be aware of the following specific callouts:
+
+- [Azure Database for PostgreSQL](https://docs.microsoft.com/en-us/azure/postgresql/#:~:text=Azure%20Database%20for%20PostgreSQL%20is,high%20availability%2C%20and%20dynamic%20scalability.) is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to known performance issues or missing features.
+- [Azure Blob Storage](https://docs.microsoft.com/en-us/azure/storage/blobs/) is recommended to be configured with [Premium accounts](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-block-blob-premium) to ensure consistent performance.
+
+### Praefect PostgreSQL
-It's also worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
+It's worth noting that at this time [Praefect requires its own database server](../gitaly/praefect.md#postgresql) and
that to achieve full High Availability a third-party PostgreSQL database solution will be required.
We hope to offer a built-in solution for these restrictions in the future, but in the meantime a non-HA PostgreSQL server
-can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) & [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398)
+can be set up via Omnibus GitLab, which the above specs reflect. Refer to the following issues for more information: [`omnibus-gitlab#5919`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5919) & [`gitaly#3398`](https://gitlab.com/gitlab-org/gitaly/-/issues/3398).
## Setup components
@@ -760,14 +776,15 @@ run: sentinel: (pid 30098) 76832s; run: log: (pid 29704) 76850s
## Configure PostgreSQL
-In this section, you'll be guided through configuring an external PostgreSQL database
-to be used with GitLab.
+In this section, you'll be guided through configuring a highly available PostgreSQL
+cluster to be used with GitLab.
### Provide your own PostgreSQL instance
If you're hosting GitLab on a cloud provider, you can optionally use a
-managed service for PostgreSQL. For example, AWS offers a managed Relational
-Database Service (RDS) that runs PostgreSQL.
+managed service for PostgreSQL.
+
+A reputable provider or solution should be used for this. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61).
If you use a cloud-managed service, or provide your own PostgreSQL:
@@ -777,12 +794,25 @@ If you use a cloud-managed service, or provide your own PostgreSQL:
needs privileges to create the `gitlabhq_production` database.
1. Configure the GitLab application servers with the appropriate details.
This step is covered in [Configuring the GitLab Rails application](#configure-gitlab-rails).
+1. For improved performance, configuring [Database Load Balancing](../postgresql/database_load_balancing.md)
+ with multiple read replicas is recommended.
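
As a minimal sketch of what the Database Load Balancing setting could look like, the following Ruby builds the hash assigned to `gitlab_rails['db_load_balancing']` elsewhere in this document. It assumes the replicas are addressed by IP, as in the example IPs used here; the helper name `db_load_balancing_setting` is hypothetical and not part of Omnibus GitLab.

```ruby
require 'ipaddr'

# Hypothetical helper, not part of Omnibus GitLab: returns the hash that would be
# assigned to gitlab_rails['db_load_balancing'] in /etc/gitlab/gitlab.rb.
def db_load_balancing_setting(replica_hosts)
  raise ArgumentError, 'at least one replica host is required' if replica_hosts.empty?

  # IPAddr.new raises IPAddr::InvalidAddressError for strings that are not IPs.
  replica_hosts.each { |host| IPAddr.new(host) }
  { 'hosts' => replica_hosts }
end

# Example IPs taken from this document; replace them with your own replicas.
setting = db_load_balancing_setting(%w[10.6.0.31 10.6.0.32 10.6.0.33])
puts setting['hosts'].join(', ')
```

Validating the addresses before writing them to `gitlab.rb` is only a convenience; the resulting hash is what matters.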
See [Configure GitLab using an external PostgreSQL service](../postgresql/external.md) for
further configuration steps.
### Standalone PostgreSQL using Omnibus GitLab
+The recommended Omnibus GitLab configuration for a PostgreSQL cluster with
+replication and failover requires:
+
+- A minimum of three PostgreSQL nodes.
+- A minimum of three Consul server nodes.
+- A minimum of three PgBouncer nodes that track and handle primary database reads and writes.
+ - An [internal load balancer](#configure-the-internal-load-balancer) (TCP) to balance requests between the PgBouncer nodes.
+- [Database Load Balancing](../postgresql/database_load_balancing.md) enabled.
+
 A local PgBouncer service must be configured on each PostgreSQL node. Note that this is separate from the main PgBouncer cluster that tracks the primary.
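
The minimum node counts listed above can be expressed as a quick sanity check on a planned topology. This is only an illustrative sketch: the component keys and the helper name `shortfalls` are hypothetical, and the check covers node counts only, not the load balancer or Database Load Balancing requirements.

```ruby
# Minimum node counts taken from the list above; the component keys and the
# helper name are hypothetical, chosen only for this illustration.
MINIMUM_NODES = { postgresql: 3, consul: 3, pgbouncer: 3 }.freeze

# Returns the components whose planned node count is below the minimum.
def shortfalls(planned)
  MINIMUM_NODES.select { |component, minimum| planned.fetch(component, 0) < minimum }.keys
end

p shortfalls({ postgresql: 3, consul: 3, pgbouncer: 2 }) # prints [:pgbouncer]
```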
+
The following IPs will be used as an example:
- `10.6.0.31`: PostgreSQL primary
@@ -837,8 +867,8 @@ in the second step, do not supply the `EXTERNAL_URL` value.
1. On every database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
```ruby
- # Disable all components except Patroni and Consul
- roles(['patroni_role'])
+ # Disable all components except Patroni, PgBouncer and Consul
+ roles(['patroni_role', 'pgbouncer_role'])
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
@@ -883,6 +913,15 @@ in the second step, do not supply the `EXTERNAL_URL` value.
# Replace 10.6.0.0/24 with Network Address
postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24 127.0.0.1/32)
+ # Local PgBouncer service for Database Load Balancing
+ pgbouncer['databases'] = {
+ gitlabhq_production: {
+ host: "127.0.0.1",
+ user: "pgbouncer",
+ password: '<pgbouncer_password_hash>'
+ }
+ }
+
# Set the network addresses that the exporters will listen on for monitoring
node_exporter['listen_address'] = '0.0.0.0:9100'
postgres_exporter['listen_address'] = '0.0.0.0:9187'
@@ -943,9 +982,11 @@ If the 'State' column for any node doesn't say "running", check the
</a>
</div>
-## Configure PgBouncer
+### Configure PgBouncer
+
+Now that the PostgreSQL servers are all set up, let's configure PgBouncer
+for tracking and handling reads/writes to the primary database.
-Now that the PostgreSQL servers are all set up, let's configure PgBouncer.
The following IPs will be used as an example:
- `10.6.0.21`: PgBouncer 1
@@ -1167,7 +1208,14 @@ There are many third-party solutions for PostgreSQL HA. The solution selected mu
- A static IP for all connections that doesn't change on failover.
- [`LISTEN`](https://www.postgresql.org/docs/12/sql-listen.html) SQL functionality must be supported.
-Examples of the above could include [Google's Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) or [Amazon RDS](https://aws.amazon.com/rds/).
+NOTE:
+With a third-party setup, it's possible to colocate Praefect's database on the same server as
+the main [GitLab](#provide-your-own-postgresql-instance) database as a convenience unless
+you are using Geo, where separate database instances are required for handling replication correctly.
+In this setup, the specs of the main database setup shouldn't need to be changed as the impact should be
+minimal.
+
+A reputable provider or solution should be used for this. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61).
Once the database is set up, follow the [post configuration](#praefect-postgresql-post-configuration).
@@ -1604,8 +1652,8 @@ To configure the Sidekiq nodes, one each one:
gitlab_rails['db_host'] = '10.6.0.40' # internal load balancer IP
gitlab_rails['db_port'] = 6432
gitlab_rails['db_password'] = '<postgresql_user_password>'
- gitlab_rails['db_adapter'] = 'postgresql'
- gitlab_rails['db_encoding'] = 'unicode'
+ gitlab_rails['db_load_balancing'] = { 'hosts' => ['10.6.0.31', '10.6.0.32', '10.6.0.33'] } # PostgreSQL IPs
+
## Prevent database migrations from running on upgrade automatically
gitlab_rails['auto_migrate'] = false
@@ -1764,6 +1812,8 @@ On each node perform the following:
gitlab_rails['db_host'] = '10.6.0.20' # internal load balancer IP
gitlab_rails['db_port'] = 6432
gitlab_rails['db_password'] = '<postgresql_user_password>'
+ gitlab_rails['db_load_balancing'] = { 'hosts' => ['10.6.0.31', '10.6.0.32', '10.6.0.33'] } # PostgreSQL IPs
+
# Prevent database migrations from running on upgrade automatically
gitlab_rails['auto_migrate'] = false
@@ -2068,8 +2118,7 @@ cluster alongside your instance, read how to
## Configure NFS (optional)
[Object storage](#configure-the-object-storage), along with [Gitaly](#configure-gitaly)
-are recommended over NFS wherever possible for improved performance. If you intend
-to use GitLab Pages, this currently [requires NFS](troubleshooting.md#gitlab-pages-requires-nfs).
+are recommended over NFS wherever possible for improved performance.
See how to [configure NFS](../nfs.md).
@@ -2141,7 +2190,7 @@ services where applicable):
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
-1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. Google Cloud SQL and AWS RDS are known to work, however Azure Database for PostgreSQL is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to performance issues. Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
+1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61). Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
@@ -2153,25 +2202,21 @@ For all PaaS solutions that involve configuring instances, it is strongly recomm
```plantuml
@startuml 5k
+skinparam linetype ortho
card "Kubernetes via Helm Charts" as kubernetes {
card "**External Load Balancer**" as elb #6a9be7
together {
- collections "**Webservice** x5" as gitlab #32CD32
- collections "**Sidekiq** x3" as sidekiq #ff8dd1
+ collections "**Webservice** x4" as gitlab #32CD32
+ collections "**Sidekiq** x4" as sidekiq #ff8dd1
}
- card "**Prometheus + Grafana**" as monitor #7FFFD4
card "**Supporting Services**" as support
}
card "**Internal Load Balancer**" as ilb #9370DB
-
-card "**Consul + Sentinel**" as consul_sentinel {
- collections "**Consul** x3" as consul #e76a9b
- collections "**Redis Sentinel** x3" as sentinel #e6e727
-}
+collections "**Consul** x3" as consul #e76a9b
card "Gitaly Cluster" as gitaly_cluster {
collections "**Praefect** x3" as praefect #FF8C00
@@ -2191,41 +2236,33 @@ card "Database" as database {
postgres_primary .[#4EA7FF]> postgres_secondary
}
-card "Redis" as redis {
+card "redis" as redis {
collections "**Redis** x3" as redis_nodes #FF6347
-
- redis_nodes <.[#FF6347]- sentinel
}
cloud "**Object Storage**" as object_storage #white
elb -[#6a9be7]-> gitlab
-elb -[#6a9be7]-> monitor
+elb -[hidden]-> sidekiq
elb -[hidden]-> support
gitlab -[#32CD32]--> ilb
-gitlab -[#32CD32]-> object_storage
-gitlab -[#32CD32]---> redis
-gitlab -[hidden]--> consul
+gitlab -[#32CD32]r--> object_storage
+gitlab -[#32CD32,norank]----> redis
+gitlab -[#32CD32]----> database
sidekiq -[#ff8dd1]--> ilb
-sidekiq -[#ff8dd1]-> object_storage
-sidekiq -[#ff8dd1]---> redis
-sidekiq -[hidden]--> consul
-
-ilb -[#9370DB]-> gitaly_cluster
-ilb -[#9370DB]-> database
+sidekiq -[#ff8dd1]r--> object_storage
+sidekiq -[#ff8dd1,norank]----> redis
+sidekiq .[#ff8dd1]----> database
-consul .[#e76a9b]-> database
-consul .[#e76a9b]-> gitaly_cluster
-consul .[#e76a9b,norank]--> redis
+ilb -[#9370DB]--> gitaly_cluster
+ilb -[#9370DB]--> database
+ilb -[hidden,norank]--> redis
-monitor .[#7FFFD4]> consul
-monitor .[#7FFFD4]-> database
-monitor .[#7FFFD4]-> gitaly_cluster
-monitor .[#7FFFD4,norank]--> redis
-monitor .[#7FFFD4]> ilb
-monitor .[#7FFFD4,norank]u--> elb
+consul .[#e76a9b]--> database
+consul .[#e76a9b,norank]--> gitaly_cluster
+consul .[#e76a9b]--> redis
@enduml
```
diff --git a/doc/administration/reference_architectures/index.md b/doc/administration/reference_architectures/index.md
index 4d95a61176b..6bf35ba6e22 100644
--- a/doc/administration/reference_architectures/index.md
+++ b/doc/administration/reference_architectures/index.md
@@ -166,7 +166,7 @@ that can also be promoted in case of disaster.
## Deviating from the suggested reference architectures
-As a general rule of thumb, the further away you move from the Reference Architectures,
+As a general guideline, the further away you move from the Reference Architectures,
the harder it will be to get support for it. With any deviation, you're introducing
a layer of complexity that will add challenges to finding out where potential
issues might lie.
@@ -191,3 +191,36 @@ The reference architectures for user counts [3,000](3k_users.md) and up support
In the specific case you have the requirement to achieve HA but have a lower user count, select modifications to the [3,000 user](3k_users.md) architecture are supported.
For more details, [refer to this section in the architecture's documentation](3k_users.md#supported-modifications-for-lower-user-counts-ha).
+
+## Testing process and results
+
+The [Quality Engineering - Enablement team](https://about.gitlab.com/handbook/engineering/quality/quality-engineering/) does regular smoke and performance tests for the reference architectures to ensure they remain compliant.
+
+In this section, we detail some of the process as well as the results.
+
+Note the following about the testing process:
+
+- Testing occurs against all main reference architectures and cloud providers in an automated and ad-hoc fashion.
+ This is achieved through two tools built by the team:
+ - The [GitLab Environment Toolkit](https://gitlab.com/gitlab-org/quality/gitlab-environment-toolkit) for building the environments.
+ - The [GitLab Performance Tool](https://gitlab.com/gitlab-org/quality/performance) for performance testing.
+- Network latency on the test environments between components on all Cloud Providers was measured at <5ms. Note that this is shared as an observation and not as an implicit recommendation.
+- We aim for a "test smart" approach, where the architectures tested cover a good range whose results also apply to others. Testing focuses on 10k Omnibus on GCP, as testing has shown this is a good bellwether for the other architectures and cloud providers, as well as Cloud Native Hybrids.
+- Testing is done publicly and all results are shared.
+
+The following table details the testing done against the reference architectures along with the frequency and results.
+
+| Reference Architecture | Tests Run<sup>1</sup> |
+|------------------------|----------------------------------------------------------------------------------------------------------------------|
+| 1k | [Omnibus - Daily (GCP)](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Benchmarks/Latest/1k)<sup>2</sup> |
+| 2k | [Omnibus - Daily (GCP)](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Benchmarks/Latest/2k)<sup>2</sup> |
+| 3k | [Omnibus - Weekly (GCP)](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Benchmarks/Latest/3k)<sup>2</sup> |
+| 5k | [Omnibus - Weekly (GCP)](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Benchmarks/Latest/5k)<sup>2</sup> |
+| 10k | [Omnibus - Daily (GCP)](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Benchmarks/Latest/10k)<sup>2</sup><br/>[Omnibus - Ad-Hoc (GCP, AWS, Azure)](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Past-Results/10k)<br/><br/>[Cloud Native Hybrid - Ad-Hoc (GCP, AWS)](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Past-Results/10k-Cloud-Native-Hybrid) |
+| 25k | [Omnibus - Weekly (GCP)](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Benchmarks/Latest/25k)<sup>2</sup><br/>[Omnibus - Ad-Hoc (Azure)](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Past-Results/25k) |
+| 50k | [Omnibus - Weekly (GCP)](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Benchmarks/Latest/50k)<sup>2</sup><br/>[Omnibus - Ad-Hoc (AWS)](https://gitlab.com/gitlab-org/quality/performance/-/wikis/Past-Results/50k) |
+
+Note that:
+
+1. The list above is non-exhaustive. Additional testing is continuously evaluated and iterated on, and the table is updated regularly.
+1. The Omnibus reference architectures are VM-based only and testing has shown that they perform similarly on equivalently specced hardware regardless of Cloud Provider or if run on premises.
diff --git a/doc/administration/reference_architectures/troubleshooting.md b/doc/administration/reference_architectures/troubleshooting.md
index aabf4809b4a..c8c13fca59d 100644
--- a/doc/administration/reference_architectures/troubleshooting.md
+++ b/doc/administration/reference_architectures/troubleshooting.md
@@ -20,15 +20,14 @@ with the Fog library that GitLab uses. Symptoms include:
411 Length Required
```
-### GitLab Pages requires NFS
+### GitLab Pages can use object storage
-If you intend to use [GitLab Pages](../../user/project/pages/index.md), this currently requires
-[NFS](../nfs.md). There is [work in progress](https://gitlab.com/groups/gitlab-org/-/epics/3901)
-to remove this dependency. In the future, GitLab Pages will use
-object storage.
+If you intend to use [GitLab Pages](../../user/project/pages/index.md), you can
+[configure object storage](../pages/index.md#using-object-storage).
+NFS is still available if you prefer.
-The dependency on disk storage also prevents Pages being deployed using the
-[GitLab Helm chart](https://gitlab.com/groups/gitlab-org/-/epics/4283).
+The [GitLab Pages Helm chart](https://docs.gitlab.com/charts/charts/gitlab/gitlab-pages/) is also available
+for Kubernetes deployments.
### Incremental logging is required for CI to use object storage
diff --git a/doc/administration/repository_storage_types.md b/doc/administration/repository_storage_types.md
index a85f678fe95..f33d494f638 100644
--- a/doc/administration/repository_storage_types.md
+++ b/doc/administration/repository_storage_types.md
@@ -101,10 +101,10 @@ To look up a project's hash path using a Rails console:
#### From hashed path to project name
-Administrators can look up a project's name from its hashed storage path using:
+Administrators can look up a project's name from its hashed storage path using:
- A Rails console.
-- The `config` file in the `*.git` directory.
+- The `config` file in the `*.git` directory.
To look up a project's name using the Rails console:
diff --git a/doc/administration/terraform_state.md b/doc/administration/terraform_state.md
index 388ae74f207..582ffc9dc9c 100644
--- a/doc/administration/terraform_state.md
+++ b/doc/administration/terraform_state.md
@@ -130,6 +130,28 @@ For GitLab 13.8 and earlier versions, you can use a workaround for the Rake task
end
```
+You can optionally track progress and verify that all states migrated successfully using the
+[PostgreSQL console](https://docs.gitlab.com/omnibus/settings/database.html#connecting-to-the-bundled-postgresql-database):
+
+- `sudo gitlab-rails dbconsole` for Omnibus GitLab instances.
+- `sudo -u git -H psql -d gitlabhq_production` for source-installed instances.
+
+Verify that `objectstg` below (where `store=2`) has the count of all states:
+
+```shell
+gitlabhq_production=# SELECT count(*) AS total, sum(case when store = '1' then 1 else 0 end) AS filesystem, sum(case when store = '2' then 1 else 0 end) AS objectstg FROM terraform_states;
+
+total | filesystem | objectstg
+------+------------+-----------
+ 15 | 0 | 15
+```
+
+Verify that there are no files on disk in the `terraform_state` folder:
+
+```shell
+sudo find /var/opt/gitlab/gitlab-rails/shared/terraform_state -type f | wc -l
+```
+
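The pass condition for the count query above can be sketched in plain Ruby. This is only an illustration: `tally_stores` and the sample `store` values are hypothetical, and nothing here reads from the GitLab database. It assumes, as in the query, that `1` means filesystem and `2` means object storage.

```ruby
# Hypothetical helper: tallies `store` values the same way the SQL query does.
# 1 = local filesystem, 2 = object storage.
def tally_stores(stores)
  filesystem = stores.count(1)
  objectstg  = stores.count(2)

  {
    total: stores.size,
    filesystem: filesystem,
    objectstg: objectstg,
    # The migration is complete when nothing remains on the filesystem
    # and every row is accounted for in object storage.
    complete: filesystem.zero? && objectstg == stores.size
  }
end

# Sample data only: one state is still on local disk.
puts tally_stores([2, 2, 1, 2]).inspect
```

A result with `complete: false` corresponds to a non-zero `filesystem` column in the query output.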
### S3-compatible connection settings
See [the available connection settings for different providers](object_storage.md#connection-settings).
@@ -162,11 +184,7 @@ See [the available connection settings for different providers](object_storage.m
```
1. Save the file and [reconfigure GitLab](restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
-1. Migrate any existing local states to the object storage (GitLab 13.9 and later):
-
- ```shell
- gitlab-rake gitlab:terraform_states:migrate
- ```
+1. [Migrate any existing local states to the object storage](#migrate-to-object-storage).
**In installations from source:**
@@ -187,8 +205,4 @@ See [the available connection settings for different providers](object_storage.m
```
1. Save the file and [restart GitLab](restart_gitlab.md#installations-from-source) for the changes to take effect.
-1. Migrate any existing local states to the object storage (GitLab 13.9 and later):
-
- ```shell
- sudo -u git -H bundle exec rake gitlab:terraform_states:migrate RAILS_ENV=production
- ```
+1. [Migrate any existing local states to the object storage](#migrate-to-object-storage).
diff --git a/doc/administration/troubleshooting/elasticsearch.md b/doc/administration/troubleshooting/elasticsearch.md
index cfce3b94554..c45938ecd3f 100644
--- a/doc/administration/troubleshooting/elasticsearch.md
+++ b/doc/administration/troubleshooting/elasticsearch.md
@@ -4,7 +4,7 @@ group: Global Search
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---
-# Troubleshooting Elasticsearch
+# Troubleshooting Elasticsearch **(PREMIUM SELF)**
To install and configure Elasticsearch, and for common and known issues,
visit the [administrator documentation](../../integration/elasticsearch.md).
diff --git a/doc/administration/troubleshooting/gitlab_rails_cheat_sheet.md b/doc/administration/troubleshooting/gitlab_rails_cheat_sheet.md
index 87f514a2fdd..ccfa93d9bc8 100644
--- a/doc/administration/troubleshooting/gitlab_rails_cheat_sheet.md
+++ b/doc/administration/troubleshooting/gitlab_rails_cheat_sheet.md
@@ -97,14 +97,15 @@ Rails.cache.instance_variable_get(:@data).keys
## Profile a page
```ruby
+url = '<url/of/the/page>'
+
# Before 11.6.0
logger = Logger.new($stdout)
-admin_token = User.find_by_username('ADMIN_USERNAME').personal_access_tokens.first.token
-app.get("URL/?private_token=#{admin_token}")
+admin_token = User.find_by_username('<admin-username>').personal_access_tokens.first.token
+app.get("#{url}/?private_token=#{admin_token}")
# From 11.6.0
-admin = User.find_by_username('ADMIN_USERNAME')
-url = "/url/goes/here"
+admin = User.find_by_username('<admin-username>')
Gitlab::Profiler.with_user(admin) { app.get(url) }
```
@@ -112,8 +113,8 @@ Gitlab::Profiler.with_user(admin) { app.get(url) }
```ruby
logger = Logger.new($stdout)
-admin = User.find_by_username('ADMIN_USERNAME')
-Gitlab::Profiler.profile('URL', logger: logger, user: admin)
+admin = User.find_by_username('<admin-username>')
+Gitlab::Profiler.profile('<url/of/the/page>', logger: logger, user: admin)
```
## Time an operation
@@ -414,12 +415,14 @@ p.create_wiki ### creates the wiki project on the filesystem
### If issue boards are not loading properly and time out, call the Issue Rebalancing service to fix this
```ruby
-p = Project.find_by_full_path('PROJECT PATH')
+p = Project.find_by_full_path('<username-or-group>/<project-name>')
Issues::RelativePositionRebalancingService.new(p.root_namespace.all_projects).execute
```
-## Imports / Exports
+## Imports and exports
+
+### Import a project
```ruby
# Find the project and get the error
@@ -462,18 +465,19 @@ Clear the cache:
sudo gitlab-rake cache:clear
```
-### Export a repository
+### Export a project
It's typically recommended to export a project through [the web interface](../../user/project/settings/import_export.md#export-a-project-and-its-data) or through [the API](../../api/project_import_export.md). In situations where this is not working as expected, it may be preferable to export a project directly via the Rails console:
```ruby
-user = User.find_by_username('USERNAME')
-project = Project.find_by_full_path('PROJECT_PATH')
+user = User.find_by_username('<username>')
+# Sufficient permissions needed
+# Read https://docs.gitlab.com/ee/user/permissions.html#project-members-permissions
+
+project = Project.find_by_full_path('<username-or-group>/<project-name>')
Projects::ImportExport::ExportService.new(project, user).execute
```
-If the project you wish to export is available at `https://gitlab.example.com/baltig/pipeline-templates`, the value to use for `PROJECT_PATH` would be `baltig/pipeline-templates`.
-
If this all runs successfully, you see an output like the following before being returned to the Rails console prompt:
```ruby
@@ -482,6 +486,11 @@ If this all runs successfully, you see an output like the following before being
The exported project is located within a `.tar.gz` file in `/var/opt/gitlab/gitlab-rails/uploads/-/system/import_export_upload/export_file/`.
+If this fails, [enable verbose logging](navigating_gitlab_via_rails_console.md#looking-up-database-persisted-objects),
+repeat the above procedure,
+and report the output to
+[GitLab Support](https://about.gitlab.com/support/).
+
## Repository
### Search sequence of pushes to a repository
@@ -586,7 +595,7 @@ User.active.count
User.billable.count
# The historical max on the instance as of the past year
-::HistoricalData.max_historical_user_count
+::HistoricalData.max_historical_user_count(from: 1.year.ago.beginning_of_day, to: Time.current.end_of_day)
```
Using cURL and jq (up to a max 100, see the [pagination docs](../../api/index.md#pagination)):
@@ -618,7 +627,7 @@ users.count
# If that count looks sane:
# You can either block the users:
-users.each { |user| user.block! }
+users.each { |user| user.block! unless user.blocked? }
# Or you can delete them:
# need 'current user' (your user) for auditing purposes
@@ -782,7 +791,7 @@ end
emails = [email1, email2]
emails.each do |e|
- delete_bad_scim(e,'GROUPPATH')
+ delete_bad_scim(e,'<group-path>')
end
```
@@ -815,28 +824,28 @@ conflicting_permanent_redirects.destroy_all
### Close a merge request properly (if merged but still marked as open)
```ruby
-p = Project.find_by_full_path('<full/path/to/project>')
-m = p.merge_requests.find_by(iid: <iid>)
u = User.find_by_username('<username>')
-MergeRequests::PostMergeService.new(p, u).execute(m)
+p = Project.find_by_full_path('<namespace/project>')
+m = p.merge_requests.find_by(iid: <iid>)
+MergeRequests::PostMergeService.new(project: p, current_user: u).execute(m)
```
### Delete a merge request
```ruby
u = User.find_by_username('<username>')
-p = Project.find_by_full_path('<group>/<project>')
-m = p.merge_requests.find_by(iid: <IID>)
-Issuable::DestroyService.new(m.project, u).execute(m)
+p = Project.find_by_full_path('<namespace/project>')
+m = p.merge_requests.find_by(iid: <iid>)
+Issuable::DestroyService.new(project: m.project, current_user: u).execute(m)
```
### Rebase manually
```ruby
-p = Project.find_by_full_path('<project_path>')
-m = project.merge_requests.find_by(iid: )
u = User.find_by_username('<username>')
-MergeRequests::RebaseService.new(m.target_project, u).execute(m)
+p = Project.find_by_full_path('<namespace/project>')
+m = p.merge_requests.find_by(iid: <iid>)
+MergeRequests::RebaseService.new(project: m.target_project, current_user: u).execute(m)
```
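The updated examples above pass `project:` and `current_user:` as keyword arguments. A minimal plain-Ruby sketch of why the older positional form stops working, assuming the service initializers declare required keywords (`DemoService` and `positional_call_error` are hypothetical stand-ins, not GitLab classes or helpers):

```ruby
# Hypothetical stand-in for a service whose initializer takes keyword arguments.
class DemoService
  attr_reader :project, :current_user

  def initialize(project:, current_user:)
    @project = project
    @current_user = current_user
  end
end

# Keyword form, matching the updated examples.
service = DemoService.new(project: 'demo/project', current_user: 'root')
puts service.project

# Positional form, matching the older examples: Ruby raises ArgumentError
# because the required keywords are missing.
def positional_call_error
  DemoService.new('demo/project', 'root')
  nil
rescue ArgumentError => e
  e.message
end

puts positional_call_error
```

If a console session raises `ArgumentError` on one of these service calls, checking whether the initializer expects keywords is a reasonable first step.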
## CI
@@ -1255,6 +1264,9 @@ registry.replicator.send(:sync_repository)
## Generate Service Ping
+The [Service Ping Guide](../../development/service_ping/index.md) in our developer documentation
+has more information about Service Ping.
+
### Generate or get the cached Service Ping
```ruby
@@ -1277,6 +1289,12 @@ Generates Service Ping data in JSON format.
rake gitlab:usage_data:generate
```
+Generates Service Ping data in YAML format:
+
+```shell
+rake gitlab:usage_data:dump_sql_in_yaml
+```
+
### Generate and send Service Ping
Prints the metrics saved in `conversational_development_index_metrics`.
diff --git a/doc/administration/troubleshooting/group_saml_scim.md b/doc/administration/troubleshooting/group_saml_scim.md
index 9e9ef492ebd..d052688363c 100644
--- a/doc/administration/troubleshooting/group_saml_scim.md
+++ b/doc/administration/troubleshooting/group_saml_scim.md
@@ -72,6 +72,10 @@ Self-managed instance example:
![Okta admin panel view](img/okta_admin_panel_v13_9.png)
+Setting the username for the newly provisioned users when assigning them the SCIM app:
+
+![Assigning SCIM app to users on Okta](img/okta_setting_username.png)
+
## OneLogin
Application details:
diff --git a/doc/administration/troubleshooting/img/okta_setting_username.png b/doc/administration/troubleshooting/img/okta_setting_username.png
new file mode 100644
index 00000000000..c413b9d3a27
--- /dev/null
+++ b/doc/administration/troubleshooting/img/okta_setting_username.png
Binary files differ
diff --git a/doc/administration/troubleshooting/img/sidekiq_flamegraph.png b/doc/administration/troubleshooting/img/sidekiq_flamegraph.png
new file mode 100644
index 00000000000..89d6e8da3ce
--- /dev/null
+++ b/doc/administration/troubleshooting/img/sidekiq_flamegraph.png
Binary files differ
diff --git a/doc/administration/troubleshooting/navigating_gitlab_via_rails_console.md b/doc/administration/troubleshooting/navigating_gitlab_via_rails_console.md
index 57d64a2323e..91db321295d 100644
--- a/doc/administration/troubleshooting/navigating_gitlab_via_rails_console.md
+++ b/doc/administration/troubleshooting/navigating_gitlab_via_rails_console.md
@@ -9,7 +9,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
At the heart of GitLab is a web application [built using the Ruby on Rails
framework](https://about.gitlab.com/blog/2018/10/29/why-we-use-rails-to-build-gitlab/).
Thanks to this, we also get access to the amazing tools built right into Rails.
-In this guide, we'll introduce the [Rails console](../operations/rails_console.md#starting-a-rails-console-session)
+This guide introduces the [Rails console](../operations/rails_console.md#starting-a-rails-console-session)
and the basics of interacting with your GitLab instance from the command line.
WARNING:
@@ -19,7 +19,7 @@ or destroying production data. If you would like to explore the Rails console
with no consequences, you are strongly advised to do so in a test environment.
This guide is targeted at GitLab system administrators who are troubleshooting
-a problem or need to retrieve some data that can only be done through direct
+a problem or must retrieve some data that can only be done through direct
access of the GitLab application. Basic knowledge of Ruby is needed (try [this
30-minute tutorial](https://try.ruby-lang.org/) for a quick introduction).
Rails experience is helpful to have but not a must.
@@ -29,7 +29,7 @@ Rails experience is helpful to have but not a must.
Your type of GitLab installation determines how
[to start a rails console](../operations/rails_console.md).
-The following code examples will all take place inside the Rails console and also
+The following code examples take place inside the Rails console and also
assume an Omnibus GitLab installation.
## Active Record objects
@@ -37,7 +37,7 @@ assume an Omnibus GitLab installation.
### Looking up database-persisted objects
Under the hood, Rails uses [Active Record](https://guides.rubyonrails.org/active_record_basics.html),
-an object-relational mapping system, to read, write and map application objects
+an object-relational mapping system, to read, write, and map application objects
to the PostgreSQL database. These mappings are handled by Active Record models,
which are Ruby classes defined in a Rails app. For GitLab, the model classes
can be found at `/opt/gitlab/embedded/service/gitlab-rails/app/models`.
@@ -144,7 +144,7 @@ NoMethodError (undefined method `username' for #<ActiveRecord::Relation [#<User
Did you mean? by_username
```
-We need to retrieve the single object from the collection by using the `.first`
+Let's retrieve the single object from the collection by using the `.first`
method to get the first item in the collection:
```ruby
@@ -164,7 +164,7 @@ Record, please see the [Active Record Query Interface documentation](https://gui
### Modifying Active Record objects
In the previous section, we learned about retrieving database records using
-Active Record. Now, we'll learn how to write changes to the database.
+Active Record. Now, let's learn how to write changes to the database.
First, let's retrieve the `root` user:
@@ -195,7 +195,7 @@ a background job to deliver an email notification. This is an example of an
-- code which is designated to run in response to events in the Active Record
object life cycle. This is also why using the Rails console is preferred when
direct changes to data are necessary, as changes made via direct database queries
-will not trigger these callbacks.
+do not trigger these callbacks.
It's also possible to update attributes in a single line:
@@ -265,8 +265,8 @@ user.save!(validate: false)
This is not recommended, as validations are usually put in place to ensure the
integrity and consistency of user-provided data.
-A validation error will prevent the entire object from being saved to
-the database. We'll see a little of this in the next section. If you're getting
+A validation error prevents the entire object from being saved to
+the database. You can see a little of this in the section below. If you're getting
a mysterious red banner in the GitLab UI when submitting a form, this can often
be the fastest way to get to the root of the problem.
@@ -336,7 +336,7 @@ user.activate
user.state
```
-Earlier, we mentioned that a validation error will prevent the entire object
+Earlier, we mentioned that a validation error prevents the entire object
from being saved to the database. Let's see how this can have unexpected
interactions:
@@ -455,7 +455,7 @@ Ci::Build.find(66124)
```
The pipeline and job ID numbers increment globally across your GitLab
-instance, so there's no need to use an internal ID attribute to look them up,
+instance, so there's no requirement to use an internal ID attribute to look them up,
unlike with issues or merge requests.
**Get the current application settings object:**
diff --git a/doc/administration/troubleshooting/sidekiq.md b/doc/administration/troubleshooting/sidekiq.md
index 7a8ac8c3dbe..a606a3712ba 100644
--- a/doc/administration/troubleshooting/sidekiq.md
+++ b/doc/administration/troubleshooting/sidekiq.md
@@ -85,6 +85,27 @@ several `WARN` level messages. Here's an example of a single thread's backtrace:
In some cases Sidekiq may be hung and unable to respond to the `TTIN` signal.
Move on to other troubleshooting methods if this happens.
+## Ruby profiling with `rbspy`
+
+[rbspy](https://rbspy.github.io) is an easy-to-use, low-overhead Ruby profiler that can be used to create
+flamegraph-style diagrams of CPU usage by Ruby processes.
+
+No changes to GitLab are required to use it and it has no dependencies. To install it:
+
+1. Download the binary from the [`rbspy` releases page](https://github.com/rbspy/rbspy/releases).
+1. Make the binary executable.
+
+To profile a Sidekiq worker for one minute, run:
+
+```shell
+sudo ./rbspy record --pid <sidekiq_pid> --duration 60 --file /tmp/sidekiq_profile.svg
+```
+
+![Example rbspy flamegraph](img/sidekiq_flamegraph.png)
+
+In this example of a flamegraph generated by `rbspy`, almost all of the Sidekiq process's time is spent in `rev_parse`, a native C
+function in Rugged. In the stack, we can see `rev_parse` is being called by the `ExpirePipelineCacheWorker`.
+
## Process profiling with `perf`
Linux has a process profiling tool called `perf` that is helpful when a certain
@@ -358,3 +379,17 @@ has number of drawbacks, as mentioned in [Why Ruby's Timeout is dangerous (and T
> - in any of your code, regardless of whether it could have possibly raised an exception before
>
> Nobody writes code to defend against an exception being raised on literally any line. That's not even possible. So Thread.raise is basically like a sneak attack on your code that could result in almost anything. It would probably be okay if it were pure-functional code that did not modify any state. But this is Ruby, so that's unlikely :)
+
+## Disable Rugged
+
+Calls into Rugged, Ruby bindings for `libgit2`, [lock the Sidekiq process's GVL](https://silverhammermba.github.io/emberb/c/#c-in-ruby-threads),
+blocking all jobs on that worker from proceeding. If Rugged calls performed by Sidekiq are slow, this can cause significant delays in
+background task processing.
+
+By default, Rugged is used when Git repository data is stored on local storage or on an NFS mount.
+[Using Rugged is recommended when using NFS](../nfs.md#improving-nfs-performance-with-gitlab), but if
+you are using local storage, disabling Rugged can improve Sidekiq performance:
+
+```shell
+sudo gitlab-rake gitlab:features:disable_rugged
+```
diff --git a/doc/administration/troubleshooting/tracing_correlation_id.md b/doc/administration/troubleshooting/tracing_correlation_id.md
index 3bafbed4b3f..3a0c6a30cde 100644
--- a/doc/administration/troubleshooting/tracing_correlation_id.md
+++ b/doc/administration/troubleshooting/tracing_correlation_id.md
@@ -2,13 +2,12 @@
stage: Enablement
group: Distribution
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
-type: reference
---
# Finding relevant log entries with a correlation ID **(FREE SELF)**
-In GitLab 11.6 and later, a unique request tracking ID, known as the "correlation ID" has been
-logged by the GitLab instance for most requests. Each individual request to GitLab gets
+GitLab instances log a unique request tracking ID (known as the
+"correlation ID") for most requests. Each individual request to GitLab gets
its own correlation ID, which then gets logged in each GitLab component's logs for that
request. This makes it easier to trace behavior in a
distributed system. Without this ID it can be difficult or
@@ -147,7 +146,7 @@ First, enable the **Developer Tools** panel. See [Getting the correlation ID in
After developer tools have been enabled, obtain a session cookie as follows:
1. Visit <https://gitlab.com> while logged in.
-1. (Optional) Select **Fetch/XHR** request filter in the **Developer Tools** panel. This step is described for Google Chrome developer tools and is not strictly necessary, it just makes it easier to find the correct request.
+1. Optional. Select the **Fetch/XHR** request filter in the **Developer Tools** panel. This step is described for Google Chrome developer tools and is not strictly necessary; it just makes it easier to find the correct request.
1. Select the `results?request_id=<some-request-id>` request on the left-hand side.
1. The session cookie is displayed under the `Request Headers` section of the `Headers` panel. Right-click on the cookie value and select `Copy value`.
diff --git a/doc/administration/uploads.md b/doc/administration/uploads.md
index 15ef024647c..55c3f85bfb9 100644
--- a/doc/administration/uploads.md
+++ b/doc/administration/uploads.md
@@ -51,12 +51,6 @@ _The uploads are stored by default in
## Using object storage **(FREE SELF)**
-> **Notes:**
->
-> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/3867) in GitLab 10.5.
-> - [Moved](https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/17358) from GitLab Premium to GitLab Free in 10.7.
-> - Since version 11.1, we support direct_upload to S3.
-
If you don't want to use the local disk where GitLab is installed to store the
uploads, you can use an object storage provider like AWS S3 instead.
This configuration relies on valid AWS credentials to be configured already.
@@ -112,24 +106,7 @@ _The uploads are stored by default in
```
1. Save the file and [reconfigure GitLab](restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
-1. Migrate any existing local uploads to the object storage using [`gitlab:uploads:migrate` Rake task](raketasks/uploads/migrate.md).
-1. Optional: Verify all files migrated properly.
- From [PostgreSQL console](https://docs.gitlab.com/omnibus/settings/database.html#connecting-to-the-bundled-postgresql-database)
- (`sudo gitlab-psql -d gitlabhq_production`) verify `objectstg` below (where `store=2`) has count of all artifacts:
-
- ```shell
- gitlabhq_production=# SELECT count(*) AS total, sum(case when store = '1' then 1 else 0 end) AS filesystem, sum(case when store = '2' then 1 else 0 end) AS objectstg FROM uploads;
-
- total | filesystem | objectstg
- ------+------------+-----------
- 2409 | 0 | 2409
- ```
-
- Verify no files on disk in `artifacts` folder:
-
- ```shell
- sudo find /var/opt/gitlab/gitlab-rails/uploads -type f | grep -v tmp | wc -l
- ```
+1. Migrate any existing local uploads to the object storage using [`gitlab:uploads:migrate:all` Rake task](raketasks/uploads/migrate.md).
**In installations from source:**
@@ -153,22 +130,6 @@ _The uploads are stored by default in
1. Save the file and [restart GitLab](restart_gitlab.md#installations-from-source) for the changes to take effect.
1. Migrate any existing local uploads to the object storage using [`gitlab:uploads:migrate:all` Rake task](raketasks/uploads/migrate.md).
-1. Optional: Verify all files migrated properly.
- From PostgreSQL console (`sudo -u git -H psql -d gitlabhq_production`) verify `objectstg` below (where `file_store=2`) has count of all artifacts:
-
- ```shell
- gitlabhq_production=# SELECT count(*) AS total, sum(case when store = '1' then 1 else 0 end) AS filesystem, sum(case when store = '2' then 1 else 0 end) AS objectstg FROM uploads;
-
- total | filesystem | objectstg
- ------+------------+-----------
- 2409 | 0 | 2409
- ```
-
- Verify no files on disk in `artifacts` folder:
-
- ```shell
- sudo find /var/opt/gitlab/gitlab-rails/uploads -type f | grep -v tmp | wc -l
- ```
#### OpenStack example
@@ -195,23 +156,6 @@ _The uploads are stored by default in
1. Save the file and [reconfigure GitLab](restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
1. Migrate any existing local uploads to the object storage using [`gitlab:uploads:migrate:all` Rake task](raketasks/uploads/migrate.md).
-1. Optional: Verify all files migrated properly.
- From [PostgreSQL console](https://docs.gitlab.com/omnibus/settings/database.html#connecting-to-the-bundled-postgresql-database)
- (`sudo gitlab-psql -d gitlabhq_production`) verify `objectstg` below (where `store=2`) has count of all artifacts:
-
- ```shell
- gitlabhq_production=# SELECT count(*) AS total, sum(case when store = '1' then 1 else 0 end) AS filesystem, sum(case when store = '2' then 1 else 0 end) AS objectstg FROM uploads;
-
- total | filesystem | objectstg
- ------+------------+-----------
- 2409 | 0 | 2409
- ```
-
- Verify no files on disk in `artifacts` folder:
-
- ```shell
- sudo find /var/opt/gitlab/gitlab-rails/uploads -type f | grep -v tmp | wc -l
- ```
---
@@ -243,19 +187,3 @@ _The uploads are stored by default in
1. Save the file and [reconfigure GitLab](restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
1. Migrate any existing local uploads to the object storage using [`gitlab:uploads:migrate:all` Rake task](raketasks/uploads/migrate.md).
-1. Optional: Verify all files migrated properly.
- From PostgreSQL console (`sudo -u git -H psql -d gitlabhq_production`) verify `objectstg` below (where `file_store=2`) has count of all artifacts:
-
- ```shell
- gitlabhq_production=# SELECT count(*) AS total, sum(case when store = '1' then 1 else 0 end) AS filesystem, sum(case when store = '2' then 1 else 0 end) AS objectstg FROM uploads;
-
- total | filesystem | objectstg
- ------+------------+-----------
- 2409 | 0 | 2409
- ```
-
- Verify no files on disk in `artifacts` folder:
-
- ```shell
- sudo find /var/opt/gitlab/gitlab-rails/uploads -type f | grep -v tmp | wc -l
- ```