Diffstat (limited to 'doc/administration/geo')

 doc/administration/geo/disaster_recovery/planned_failover.md | 6
 doc/administration/geo/glossary.md | 27
 doc/administration/geo/index.md | 3
 doc/administration/geo/replication/configuration.md | 4
 doc/administration/geo/replication/datatypes.md | 12
 doc/administration/geo/replication/location_aware_git_url.md | 2
 doc/administration/geo/replication/multiple_servers.md | 6
 doc/administration/geo/replication/single_sign_on.md | 4
 doc/administration/geo/replication/troubleshooting.md | 71
 doc/administration/geo/replication/upgrading_the_geo_sites.md | 2
 doc/administration/geo/secondary_proxy/index.md | 4
 doc/administration/geo/setup/database.md | 9
 doc/administration/geo/setup/external_database.md | 2
 doc/administration/geo/setup/index.md | 4

 14 files changed, 103 insertions(+), 53 deletions(-)
diff --git a/doc/administration/geo/disaster_recovery/planned_failover.md b/doc/administration/geo/disaster_recovery/planned_failover.md
index 13e0938fa59..6ac67c3d21e 100644
--- a/doc/administration/geo/disaster_recovery/planned_failover.md
+++ b/doc/administration/geo/disaster_recovery/planned_failover.md
@@ -56,12 +56,12 @@ site you are about to failover to:
rsync --archive --perms --delete root@<geo-primary>:/var/opt/gitlab/gitlab-rails/shared/registry/. /var/opt/gitlab/gitlab-rails/shared/registry
```
-Alternatively, you can [back up](../../../raketasks/backup_restore.md#back-up-gitlab)
+Alternatively, you can [back up](../../../administration/backup_restore/index.md#back-up-gitlab)
the container registry on the primary site and restore it onto the secondary
site:
1. On your primary site, back up only the registry and
- [exclude specific directories from the backup](../../../raketasks/backup_gitlab.md#excluding-specific-directories-from-the-backup):
+ [exclude specific directories from the backup](../../../administration/backup_restore/backup_gitlab.md#excluding-specific-directories-from-the-backup):
```shell
# Create a backup in the /var/opt/gitlab/backups folder
@@ -71,7 +71,7 @@ site:
1. Copy the backup tarball generated from your primary site to the `/var/opt/gitlab/backups` folder
on your secondary site.
-1. On your secondary site, restore the registry following the [Restore GitLab](../../../raketasks/backup_restore.md#restore-gitlab)
+1. On your secondary site, restore the registry following the [Restore GitLab](../../../administration/backup_restore/index.md#restore-gitlab)
documentation.
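For reference, a minimal sketch of this backup-and-restore flow on a Linux package installation. The `SKIP` component list and the `<timestamp>` placeholder are assumptions for illustration only; follow the linked backup documentation for the exact components to skip in your version:

```shell
# On the primary site: back up only the registry by skipping the other components
# (the exact SKIP component names are an assumption; check the backup documentation).
sudo gitlab-backup create SKIP=db,uploads,builds,artifacts,lfs,terraform_state,pages,repositories,packages

# Copy the resulting tarball to the secondary site (placeholder host and timestamp).
scp /var/opt/gitlab/backups/<timestamp>_gitlab_backup.tar root@<geo-secondary>:/var/opt/gitlab/backups/

# On the secondary site: restore from that tarball.
sudo gitlab-backup restore BACKUP=<timestamp>
```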
## Preflight checks
diff --git a/doc/administration/geo/glossary.md b/doc/administration/geo/glossary.md
index 9abd7ea9347..2e9a637eb5c 100644
--- a/doc/administration/geo/glossary.md
+++ b/doc/administration/geo/glossary.md
@@ -19,19 +19,20 @@ these definitions yet.
We provide example diagrams and statements to demonstrate correct usage of terms.
-| Term | Definition | Scope | Discouraged synonyms |
-|---------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|-------------------------------------------------|
-| Node | An individual server that runs GitLab either with a specific role or as a whole (for example a Rails application node). In a cloud context this can be a specific machine type. | GitLab | instance, server |
-| Site | One or a collection of nodes running a single GitLab application. A site can be single-node or multi-node. | GitLab | deployment, installation instance |
-| Single-node site | A specific configuration of GitLab that uses exactly one node. | GitLab | single-server, single-instance
-| Multi-node site | A specific configuration of GitLab that uses more than one node. | GitLab | multi-server, multi-instance, high availability |
-| Primary site | A GitLab site whose data is being replicated by at least one secondary site. There can only be a single primary site. | Geo-specific | Geo deployment, Primary node |
-| Secondary site | A GitLab site that is configured to replicate the data of a primary site. There can be one or more secondary sites. | Geo-specific | Geo deployment, Secondary node |
-| Geo deployment | A collection of two or more GitLab sites with exactly one primary site being replicated by one or more secondary sites. | Geo-specific | |
-| Reference architecture | A [specified configuration of GitLab for a number of users](../reference_architectures/index.md), possibly including multiple nodes and multiple sites. | GitLab | |
-| Promoting | Changing the role of a site from secondary to primary. | Geo-specific | |
-| Demoting | Changing the role of a site from primary to secondary. | Geo-specific | |
-| Failover | The entire process that shifts users from a primary Site to a secondary site. This includes promoting a secondary, but contains other parts as well. For example, scheduling maintenance. | Geo-specific | |
+| Term | Definition | Scope | Discouraged synonyms |
+|------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|-------------------------------------------------|
+| Node | An individual server that runs GitLab either with a specific role or as a whole (for example a Rails application node). In a cloud context this can be a specific machine type. | GitLab | instance, server |
+| Site | One or a collection of nodes running a single GitLab application. A site can be single-node or multi-node. | GitLab | deployment, installation instance |
+| Single-node site | A specific configuration of GitLab that uses exactly one node. | GitLab | single-server, single-instance |
+| Multi-node site | A specific configuration of GitLab that uses more than one node. | GitLab | multi-server, multi-instance, high availability |
+| Primary site | A GitLab site whose data is being replicated by at least one secondary site. There can only be a single primary site. | Geo-specific | Geo deployment, Primary node |
+| Secondary site | A GitLab site that is configured to replicate the data of a primary site. There can be one or more secondary sites. | Geo-specific | Geo deployment, Secondary node |
+| Geo deployment | A collection of two or more GitLab sites with exactly one primary site being replicated by one or more secondary sites. | Geo-specific | |
+| Reference architecture | A [specified configuration of GitLab for a number of users](../reference_architectures/index.md), possibly including multiple nodes and multiple sites. | GitLab | |
+| Promoting | Changing the role of a site from secondary to primary. | Geo-specific | |
+| Demoting | Changing the role of a site from primary to secondary. | Geo-specific | |
+| Failover               | The entire process that shifts users from a primary site to a secondary site. This includes promoting a secondary, but also covers other steps, such as scheduling maintenance. | Geo-specific | |
+| Replication | Also called "synchronization". The uni-directional process that updates a resource on a secondary site to match the resource on the primary site. | Geo-specific | |
## Examples
diff --git a/doc/administration/geo/index.md b/doc/administration/geo/index.md
index ae2cc262160..0ab24cc4fb8 100644
--- a/doc/administration/geo/index.md
+++ b/doc/administration/geo/index.md
@@ -203,6 +203,7 @@ This list of limitations only reflects the latest version of GitLab. If you are
- [Disaster recovery](disaster_recovery/index.md) for multi-secondary sites causes downtime due to the complete re-synchronization and re-configuration of all non-promoted secondaries.
- For Git over SSH, to make the project clone URL display correctly regardless of which site you are browsing, secondary sites must use the same port as the primary. [GitLab issue #339262](https://gitlab.com/gitlab-org/gitlab/-/issues/339262) proposes to remove this limitation.
- Git push over SSH against a secondary site does not work for pushes over 1.86 GB. [GitLab issue #413109](https://gitlab.com/gitlab-org/gitlab/-/issues/413109) tracks this bug.
+- Backups [cannot be run on secondaries](replication/troubleshooting.md#message-error-canceling-statement-due-to-conflict-with-recovery).
### Limitations on replication/verification
@@ -217,7 +218,7 @@ If you try to view replication data on the primary site, you receive a warning t
The only way to view projects replication data for a particular secondary site is to visit that secondary site directly. For example, `https://<IP of your secondary site>/admin/geo/replication/projects`.
An [epic exists](https://gitlab.com/groups/gitlab-org/-/epics/4623) to fix this limitation.
-Keep in mind that mentioned URLs don't work when [Admin Mode](../../user/admin_area/settings/sign_in_restrictions.md#admin-mode) is enabled.
+Keep in mind that the URLs mentioned above don't work when [Admin Mode](../settings/sign_in_restrictions.md#admin-mode) is enabled.
When using Unified URL, visiting the secondary site directly means you must route your requests to the secondary site. Exactly how this might be done depends on your networking configuration. If using DNS to route requests to the appropriate site, then you can, for example, edit your local machine's `/etc/hosts` file to route your requests to the desired secondary site. If the Geo sites are all behind a load balancer, then depending on the load balancer, you might be able to configure all requests from your IP to go to a particular secondary site.
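As a hedged illustration of the `/etc/hosts` approach described above (the IP address and hostname are placeholders, not values from this documentation):

```shell
# Route requests for the unified URL from this machine to a specific secondary
# site by pinning the hostname to that site's IP address (placeholders shown).
echo "203.0.113.10  gitlab.example.com" | sudo tee -a /etc/hosts

# Remove the entry again when you have finished inspecting the secondary site.
```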
diff --git a/doc/administration/geo/replication/configuration.md b/doc/administration/geo/replication/configuration.md
index 18d0440965e..a5d85187812 100644
--- a/doc/administration/geo/replication/configuration.md
+++ b/doc/administration/geo/replication/configuration.md
@@ -190,7 +190,7 @@ keys must be manually replicated to the **secondary** site.
```ruby
##
## The unique identifier for the Geo site. See
- ## https://docs.gitlab.com/ee/user/admin_area/geo_nodes.html#common-settings
+ ## https://docs.gitlab.com/ee/administration/geo_nodes.html#common-settings
##
gitlab_rails['geo_node_name'] = '<site_name_here>'
```
@@ -332,7 +332,7 @@ the **primary** site. After you sign in:
1. Verify that it's correctly identified as a **secondary** Geo site, and that
Geo is enabled.
-The initial replication may take some time. The status of the site or the ‘backfill’ may still in progress. You
+The initial replication may take some time. The status of the site or the 'backfill' may still be in progress. You
can monitor the synchronization process on each Geo site from the **primary**
site's **Geo Sites** dashboard in your browser.
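As a complementary check from the command line, a sketch using the `geo:status` Rake task, typically run on a secondary node:

```shell
# Show replication and verification progress for this Geo site.
sudo gitlab-rake geo:status
```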
diff --git a/doc/administration/geo/replication/datatypes.md b/doc/administration/geo/replication/datatypes.md
index f038bfd705c..b25700ccd29 100644
--- a/doc/administration/geo/replication/datatypes.md
+++ b/doc/administration/geo/replication/datatypes.md
@@ -80,10 +80,8 @@ on a machine:
- With multiple disks mounted as a single mount-point (like with a RAID array).
- Using LVM.
-GitLab does not require a special file system and can work with:
-
-- NFS.
-- A mounted Storage Appliance (there may be performance limitations when using a remote file system).
+GitLab does not require a special file system and can work with a mounted Storage Appliance. However, there can be
+performance limitations and consistency issues when using a remote file system.
Geo triggers garbage collection in Gitaly to [deduplicate forked repositories](../../../development/git_object_deduplication.md#git-object-deduplication-and-gitlab-geo) on Geo secondary sites.
@@ -111,7 +109,7 @@ GitLab stores files and blobs such as Issue attachments or LFS objects into eith
- The file system in a specific location.
- An [Object Storage](../../object_storage.md) solution. Object Storage solutions can be:
- - Cloud based like Amazon S3 Google Cloud Storage.
+ - Cloud based like Amazon S3 and Google Cloud Storage.
- Hosted by you (like MinIO).
- A Storage Appliance that exposes an Object Storage-compatible API.
@@ -192,7 +190,7 @@ successfully, you must replicate their data using some other means.
|:--------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------|:---------------------------------------------------------------------------|:--------------------------------------------------------------------|:----------------------------------------------------------------|:------|
|[Application data in PostgreSQL](../../postgresql/index.md) | **Yes** (10.2) | **Yes** (10.2) | N/A | N/A | |
|[Project repository](../../../user/project/repository/index.md) | **Yes** (10.2) | **Yes** (10.7) | N/A | N/A | |
-|[Project wiki repository](../../../user/project/wiki/index.md) | **Yes** (10.2)<sup>2</sup> | **Yes** (10.7)<sup>2</sup> | N/A | N/A | Migrated to [self-service framework](../../../development/geo/framework.md) in 15.11. See GitLab issue [#367925](https://gitlab.com/gitlab-org/gitlab/-/issues/367925) for more details.<br /><br />Behind feature flag geo_project_wiki_repository_replication, enabled by default in (15.11). |
+|[Project wiki repository](../../../user/project/wiki/index.md) | **Yes** (10.2)<sup>2</sup> | **Yes** (10.7)<sup>2</sup> | N/A | N/A | Migrated to [self-service framework](../../../development/geo/framework.md) in 15.11. See GitLab issue [#367925](https://gitlab.com/gitlab-org/gitlab/-/issues/367925) for more details.<br /><br />Behind feature flag `geo_project_wiki_repository_replication`, enabled by default in (15.11). |
|[Group wiki repository](../../../user/project/wiki/group.md) | [**Yes** (13.10)](https://gitlab.com/gitlab-org/gitlab/-/issues/208147) | No | N/A | N/A | Behind feature flag `geo_group_wiki_repository_replication`, enabled by default. |
|[Uploads](../../uploads.md) | **Yes** (10.2) | **Yes** (14.6) | [**Yes** (15.1)](https://gitlab.com/groups/gitlab-org/-/epics/5551) | [No](object_storage.md#verification-of-files-in-object-storage) | Replication is behind the feature flag `geo_upload_replication`, enabled by default. Verification was behind the feature flag `geo_upload_verification`, removed in 14.8. |
|[LFS objects](../../lfs/index.md) | **Yes** (10.2) | **Yes** (14.6) | [**Yes** (15.1)](https://gitlab.com/groups/gitlab-org/-/epics/5551) | [No](object_storage.md#verification-of-files-in-object-storage) | GitLab versions 11.11.x and 12.0.x are affected by [a bug that prevents any new LFS objects from replicating](https://gitlab.com/gitlab-org/gitlab/-/issues/32696).<br /><br />Replication is behind the feature flag `geo_lfs_object_replication`, enabled by default. Verification was behind the feature flag `geo_lfs_object_verification`, removed in 14.7. |
@@ -203,7 +201,7 @@ successfully, you must replicate their data using some other means.
|[CI Secure Files](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/ci/secure_file.rb) | [**Yes** (15.3)](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/91430) | [**Yes** (15.3)](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/91430) | [**Yes** (15.3)](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/91430) | [No](object_storage.md#verification-of-files-in-object-storage) | Verification is behind the feature flag `geo_ci_secure_file_replication`, enabled by default in 15.3. |
|[Container Registry](../../packages/container_registry.md) | **Yes** (12.3)<sup>1</sup> | **Yes** (15.10) | **Yes** (12.3)<sup>1</sup> | **Yes** (15.10) | See [instructions](container_registry.md) to set up the Container Registry replication. |
|[Terraform Module Registry](../../../user/packages/terraform_module_registry/index.md) | **Yes** (14.0) | **Yes** (14.0) | [**Yes** (15.1)](https://gitlab.com/groups/gitlab-org/-/epics/5551) | [No](object_storage.md#verification-of-files-in-object-storage) | Behind feature flag `geo_package_file_replication`, enabled by default. |
-|[Project designs repository](../../../user/project/issues/design_management.md) | **Yes** (12.7) | **Yes** (16.1) | N/A | N/A | Designs also require replication of LFS objects and Uploads. Replication is behind the feature flag geo_design_management_repository_replication, enabled by default.|
+|[Project designs repository](../../../user/project/issues/design_management.md) | **Yes** (12.7) | **Yes** (16.1) | N/A | N/A | Designs also require replication of LFS objects and Uploads. Replication is behind the feature flag `geo_design_management_repository_replication`, enabled by default.|
|[Package Registry](../../../user/packages/package_registry/index.md) | **Yes** (13.2) | **Yes** (13.10) | [**Yes** (15.1)](https://gitlab.com/groups/gitlab-org/-/epics/5551) | [No](object_storage.md#verification-of-files-in-object-storage) | Behind feature flag `geo_package_file_replication`, enabled by default. |
|[Versioned Terraform State](../../terraform_state.md) | **Yes** (13.5) | **Yes** (13.12) | [**Yes** (15.1)](https://gitlab.com/groups/gitlab-org/-/epics/5551) | [No](object_storage.md#verification-of-files-in-object-storage) | Replication is behind the feature flag `geo_terraform_state_version_replication`, enabled by default. Verification was behind the feature flag `geo_terraform_state_version_verification`, which was removed in 14.0. |
|[External merge request diffs](../../merge_request_diffs.md) | **Yes** (13.5) | **Yes** (14.6) | [**Yes** (15.1)](https://gitlab.com/groups/gitlab-org/-/epics/5551) | [No](object_storage.md#verification-of-files-in-object-storage) | Replication is behind the feature flag `geo_merge_request_diff_replication`, enabled by default. Verification was behind the feature flag `geo_merge_request_diff_verification`, removed in 14.7.|
diff --git a/doc/administration/geo/replication/location_aware_git_url.md b/doc/administration/geo/replication/location_aware_git_url.md
index 4a3f9c86041..a3abc945288 100644
--- a/doc/administration/geo/replication/location_aware_git_url.md
+++ b/doc/administration/geo/replication/location_aware_git_url.md
@@ -107,7 +107,7 @@ You can customize the:
- SSH remote URL to use the location-aware `git.example.com`. To do so, change the SSH remote URL
host by setting `gitlab_rails['gitlab_ssh_host']` in `gitlab.rb` of web nodes.
- HTTP remote URL as shown in
- [Custom Git clone URL for HTTP(S)](../../../user/admin_area/settings/visibility_and_access_controls.md#customize-git-clone-url-for-https).
+ [Custom Git clone URL for HTTP(S)](../../settings/visibility_and_access_controls.md#customize-git-clone-url-for-https).
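A minimal sketch of the SSH remote URL customization mentioned above, assuming a Linux package installation (the hostname is a placeholder):

```shell
# In /etc/gitlab/gitlab.rb on each web node, set the location-aware SSH host:
#   gitlab_rails['gitlab_ssh_host'] = 'git.example.com'
# Then apply the change:
sudo gitlab-ctl reconfigure
```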
## Example Git request handling behavior
diff --git a/doc/administration/geo/replication/multiple_servers.md b/doc/administration/geo/replication/multiple_servers.md
index 4e597a21922..29edac1be83 100644
--- a/doc/administration/geo/replication/multiple_servers.md
+++ b/doc/administration/geo/replication/multiple_servers.md
@@ -67,7 +67,7 @@ The following steps enable a GitLab site to serve as the Geo **primary** site.
```ruby
##
## The unique identifier for the Geo site. See
- ## https://docs.gitlab.com/ee/user/admin_area/geo_nodes.html#common-settings
+ ## https://docs.gitlab.com/ee/administration/geo_nodes.html#common-settings
##
gitlab_rails['geo_node_name'] = '<site_name_here>'
@@ -217,7 +217,7 @@ then make the following modifications:
##
## The unique identifier for the Geo site. See
- ## https://docs.gitlab.com/ee/user/admin_area/geo_nodes.html#common-settings
+ ## https://docs.gitlab.com/ee/administration/geo_nodes.html#common-settings
##
gitlab_rails['geo_node_name'] = '<site_name_here>'
@@ -318,7 +318,7 @@ application nodes above, with some changes to run only the `sidekiq` service:
##
## The unique identifier for the Geo site. See
- ## https://docs.gitlab.com/ee/user/admin_area/geo_nodes.html#common-settings
+ ## https://docs.gitlab.com/ee/administration/geo_nodes.html#common-settings
##
gitlab_rails['geo_node_name'] = '<site_name_here>'
diff --git a/doc/administration/geo/replication/single_sign_on.md b/doc/administration/geo/replication/single_sign_on.md
index 55e77d5657c..15e24cdcefb 100644
--- a/doc/administration/geo/replication/single_sign_on.md
+++ b/doc/administration/geo/replication/single_sign_on.md
@@ -31,6 +31,10 @@ If you have configured SAML on the primary site correctly, then it should work o
### SAML with separate URL with proxying enabled
+NOTE:
+When proxying is enabled, SAML can only be used to sign in to the secondary site if your SAML Identity Provider (IdP) allows an
+application to have multiple callback URLs configured. Check with your IdP's support team to confirm whether this is the case.
+
If a secondary site uses a different `external_url` to the primary site, then configure your SAML Identity Provider (IdP) to allow the secondary site's SAML callback URL. For example, to configure Okta:
1. [Sign in to Okta](https://www.okta.com/login/).
diff --git a/doc/administration/geo/replication/troubleshooting.md b/doc/administration/geo/replication/troubleshooting.md
index 4047167e4af..c63480db389 100644
--- a/doc/administration/geo/replication/troubleshooting.md
+++ b/doc/administration/geo/replication/troubleshooting.md
@@ -51,7 +51,7 @@ Geo::MetricsUpdateWorker.new.perform
If it raises an error, then the error is probably also preventing the jobs from completing. If it takes longer than 10 minutes, then there may be a performance issue, and the UI may always show "Unhealthy" even if the status eventually does get updated.
-If it successfully updates the status, then something may be wrong with Sidekiq. Is it running? Do the logs show errors? This job is supposed to be enqueued every minute. It takes an exclusive lease in Redis to ensure that only one of these jobs can run at a time. The primary site updates its status directly in the PostgreSQL database. Secondary sites send an HTTP Post request to the primary site with their status data.
+If it successfully updates the status, then something may be wrong with Sidekiq. Is it running? Do the logs show errors? This job is supposed to be enqueued every minute and might not run if a [job deduplication idempotency](../../sidekiq/sidekiq_troubleshooting.md#clearing-a-sidekiq-job-deduplication-idempotency-key) key was not cleared properly. It takes an exclusive lease in Redis to ensure that only one of these jobs can run at a time. The primary site updates its status directly in the PostgreSQL database. Secondary sites send an HTTP POST request to the primary site with their status data.
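To answer those questions ("Is it running? Do the logs show errors?"), a hedged sketch using standard Linux package commands:

```shell
# Check whether the Sidekiq service is up on this node.
sudo gitlab-ctl status sidekiq

# Watch the Sidekiq logs for errors while the status job should be running.
sudo gitlab-ctl tail sidekiq
```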
A site also shows as "Unhealthy" if certain health checks fail. You can reveal the failure by running the following in the [Rails console](../../operations/rails_console.md) on the affected secondary site:
@@ -240,7 +240,7 @@ This machine's Geo node name matches a database record ... no
```
For more information about recommended site names in the description of the Name field, see
-[Geo Admin Area Common Settings](../../../user/admin_area/geo_sites.md#common-settings).
+[Geo Admin Area Common Settings](../../../administration/geo_sites.md#common-settings).
### Reverify all uploads (or any SSF data type which is verified)
@@ -622,7 +622,7 @@ This happens on wrongly-formatted addresses in `postgresql['md5_auth_cidr_addres
```
To fix this, update the IP addresses in `/etc/gitlab/gitlab.rb` under `postgresql['md5_auth_cidr_addresses']`
-to respect the CIDR format (that is, `1.2.3.4/32`).
+to respect the CIDR format (for example, `10.0.0.1/32`).
### Message: `LOG: invalid IP mask "md5": Name or service not known`
@@ -634,7 +634,7 @@ This happens when you have added IP addresses without a subnet mask in `postgres
```
To fix this, add the subnet mask in `/etc/gitlab/gitlab.rb` under `postgresql['md5_auth_cidr_addresses']`
-to respect the CIDR format (that is, `1.2.3.4/32`).
+to respect the CIDR format (for example, `10.0.0.1/32`).
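A hedged sketch of the fix, assuming a Linux package installation (the addresses are placeholders):

```shell
# In /etc/gitlab/gitlab.rb, make sure every entry uses CIDR notation, for example:
#   postgresql['md5_auth_cidr_addresses'] = ['10.0.0.1/32', '10.0.0.2/32']
# Then apply the change:
sudo gitlab-ctl reconfigure
```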
### Message: `Found data in the gitlabhq_production database!` when running `gitlab-ctl replicate-geo-database`
@@ -1295,7 +1295,7 @@ When [Geo proxying for secondary sites](../secondary_proxy/index.md) is enabled,
Check the NGINX logs for errors similar to this example:
```plaintext
-2022/01/26 00:02:13 [error] 26641#0: *829148 upstream sent too big header while reading response header from upstream, client: 1.2.3.4, server: geo.staging.gitlab.com, request: "POST /users/sign_in HTTP/2.0", upstream: "http://unix:/var/opt/gitlab/gitlab-workhorse/sockets/socket:/users/sign_in", host: "geo.staging.gitlab.com", referrer: "https://geo.staging.gitlab.com/users/sign_in"
+2022/01/26 00:02:13 [error] 26641#0: *829148 upstream sent too big header while reading response header from upstream, client: 10.0.2.2, server: geo.staging.gitlab.com, request: "POST /users/sign_in HTTP/2.0", upstream: "http://unix:/var/opt/gitlab/gitlab-workhorse/sockets/socket:/users/sign_in", host: "geo.staging.gitlab.com", referrer: "https://geo.staging.gitlab.com/users/sign_in"
```
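To inspect these NGINX logs on a Linux package installation, one option (a sketch; log locations can vary by deployment):

```shell
# Tail the GitLab NGINX logs while reproducing the failing sign-in.
sudo gitlab-ctl tail nginx
```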
To resolve this issue:
@@ -1345,15 +1345,8 @@ To fix this issue, set the primary site's internal URL to a URL that is:
- Unique to the primary site.
- Accessible from all secondary sites.
-1. Enter the [Rails console](../../operations/rails_console.md) on the primary site.
-
-1. Run the following, replacing `https://unique.url.for.primary.site` with your specific internal URL.
- For example, depending on your network configuration, you could use an IP address, like
- `http://1.2.3.4`.
-
- ```ruby
- GeoNode.where(primary: true).first.update!(internal_url: "https://unique.url.for.primary.site")
- ```
+1. Visit the primary site.
+1. [Set up the internal URLs](../../../administration/geo_sites.md#set-up-the-internal-urls).
### Secondary site returns `Received HTTP code 403 from proxy after CONNECT`
@@ -1404,6 +1397,27 @@ In this case, make sure to update the changed URL on all your sites:
1. On the left sidebar, select **Geo > Sites**.
1. Change the URL and save the change.
+### Message: `ERROR: canceling statement due to conflict with recovery` during backup
+
+Running a backup on a Geo **secondary** [is not supported](https://gitlab.com/gitlab-org/gitlab/-/issues/211668).
+
+When running a backup on a **secondary**, you might encounter the following error message:
+
+```plaintext
+Dumping PostgreSQL database gitlabhq_production ...
+pg_dump: error: Dumping the contents of table "notes" failed: PQgetResult() failed.
+pg_dump: error: Error message from server: ERROR: canceling statement due to conflict with recovery
+DETAIL: User query might have needed to see row versions that must be removed.
+pg_dump: error: The command was: COPY public.notes (id, note, [...], last_edited_at) TO stdout;
+```
+
+To prevent a database backup from being made automatically during GitLab upgrades on your Geo **secondaries**,
+create the following empty file:
+
+```shell
+sudo touch /etc/gitlab/skip-auto-backup
+```
+
## Fixing non-PostgreSQL replication failures
If you notice replication failures in `Admin > Geo > Sites` or the [Sync status Rake task](#sync-status-rake-task), you can try to resolve the failures with the following general steps:
@@ -1665,7 +1679,7 @@ Repository check failures on a Geo secondary site do not necessarily imply a rep
1. Find affected repositories as mentioned below, as well as their [logged errors](../../repository_checks.md#what-to-do-if-a-check-failed).
1. Try to diagnose specific `git fsck` errors. The range of possible errors is wide, try putting them into search engines.
-1. Test normal functions of the affected repositories. Pull from the secondary, view the files.
+1. Test typical functions of the affected repositories. Pull from the secondary, view the files.
1. Check if the primary site's copy of the repository has an identical `git fsck` error. If you are planning a failover, then consider prioritizing that the secondary site has the same information that the primary site has. Ensure you have a backup of the primary, and follow [planned failover guidelines](../disaster_recovery/planned_failover.md).
1. Push to the primary and check if the change gets replicated to the secondary site.
1. If replication is not automatically working, try to manually sync the repository.
@@ -1806,3 +1820,30 @@ If the output differs on some hosts, PostgreSQL replication does not work proper
A full index rebuild is required if the on-disk data is transferred 'at rest' to an operating system with an incompatible locale, or through replication.
This check is also required when using a mixture of GitLab deployments. The locale might be different between a Linux package install, a GitLab Docker container, a Helm chart deployment, or external database services.
+
+## Investigate causes of database replication lag
+
+If the output of `sudo gitlab-rake geo:status` shows that `Database replication lag` remains significantly high over time, you can check the primary database node to determine where in the replication process the lag accumulates.
+The relevant values are known as `write_lag`, `flush_lag`, and `replay_lag`. For more information, see
+[the official PostgreSQL documentation](https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-REPLICATION-VIEW).
+
+Run the following command against the primary Geo node's database to view these values:
+
+```shell
+gitlab-psql -xc 'SELECT write_lag,flush_lag,replay_lag FROM pg_stat_replication;'
+
+-[ RECORD 1 ]---------------
+write_lag | 00:00:00.072392
+flush_lag | 00:00:00.108168
+replay_lag | 00:00:00.108283
+```
+
+If one or more of these values is significantly high, this could indicate a problem and should be investigated further. When determining the cause, consider that:
+
+- `write_lag` indicates the time elapsed between WAL bytes being sent by the primary and being received by the secondary, but not yet flushed or applied.
+- A high `write_lag` value may indicate degraded network performance or insufficient network speed between the primary and secondary nodes.
+- A high `flush_lag` value may indicate degraded or sub-optimal disk I/O performance with the secondary node's storage device.
+- A high `replay_lag` value may indicate long-running transactions in PostgreSQL, or the saturation of a needed resource like the CPU.
+- The difference in time between `write_lag` and `flush_lag` indicates that WAL bytes have been sent to the underlying storage system, but it has not reported that they were flushed.
+  This data is most likely not yet fully written to persistent storage, and is probably held in some kind of volatile write cache.
+- The difference between `flush_lag` and `replay_lag` indicates WAL bytes that have been persisted to storage, but have not yet been replayed by the database system.
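To see whether these values trend upward over time rather than just spiking, a sketch that samples them periodically on the primary Geo node (the interval is arbitrary):

```shell
# Re-run the lag query every 10 seconds to watch the trend over time.
watch -n 10 "gitlab-psql -xc 'SELECT write_lag, flush_lag, replay_lag FROM pg_stat_replication;'"
```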
diff --git a/doc/administration/geo/replication/upgrading_the_geo_sites.md b/doc/administration/geo/replication/upgrading_the_geo_sites.md
index 644232a2c9e..ce0ad736071 100644
--- a/doc/administration/geo/replication/upgrading_the_geo_sites.md
+++ b/doc/administration/geo/replication/upgrading_the_geo_sites.md
@@ -33,7 +33,7 @@ and all **secondary** sites:
1. SSH into each node of the **primary** site.
1. [Upgrade GitLab on the **primary** site](../../../update/package/index.md#upgrade-using-the-official-repositories).
1. Perform testing on the **primary** site, particularly if you paused replication in step 1 to protect DR. [There are some suggestions for post-upgrade testing](../../../update/plan_your_upgrade.md#pre-upgrade-and-post-upgrade-checks) in the upgrade documentation.
-1. Ensure that the secrets in the `/etc/gitlab/gitlab-secrets.json` file of both the primary site and the secondary site are the same. The file must be the same on all of a site’s nodes.
-1. Ensure that the secrets in the `/etc/gitlab/gitlab-secrets.json` file of both the primary site and the secondary site are the same. The file must be the same on all of a site’s nodes.
+1. Ensure that the secrets in the `/etc/gitlab/gitlab-secrets.json` file of both the primary site and the secondary site are the same. The file must be the same on all of a site's nodes. One hedged way to verify this is shown in the sketch after this list.
1. SSH into each node of **secondary** sites.
1. [Upgrade GitLab on each **secondary** site](../../../update/package/index.md#upgrade-using-the-official-repositories).
1. If you paused replication in step 1, [resume replication on each **secondary**](../index.md#pausing-and-resuming-replication).
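A hedged way to verify the secrets step above across nodes (hostnames are placeholders; any consistent checksum tool works):

```shell
# Compare the secrets file checksum on each node of each site; all values must match.
for host in primary-node-1 secondary-node-1; do
  ssh "$host" "sudo md5sum /etc/gitlab/gitlab-secrets.json"
done
```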
diff --git a/doc/administration/geo/secondary_proxy/index.md b/doc/administration/geo/secondary_proxy/index.md
index 3081d1c2485..2ab96a3d33d 100644
--- a/doc/administration/geo/secondary_proxy/index.md
+++ b/doc/administration/geo/secondary_proxy/index.md
@@ -131,6 +131,10 @@ and cannot be configured per Geo site. Therefore, all runners clone from the pri
which Geo site they register on. For information about GitLab CI using a specific Geo secondary to clone from, see issue
[3294](https://gitlab.com/gitlab-org/gitlab/-/issues/3294#note_1009488466).
+- When secondary proxying is used together with separate URLs,
+  [signing in to the secondary site using SAML](../replication/single_sign_on.md#saml-with-separate-url-with-proxying-enabled)
+ is only supported if the SAML Identity Provider (IdP) allows an application to be configured with multiple callback URLs.
+
## Behavior of secondary sites when the primary Geo site is down
Considering that web traffic is proxied to the primary, the behavior of the secondary sites differs when the primary
diff --git a/doc/administration/geo/setup/database.md b/doc/administration/geo/setup/database.md
index 31d0fbdffe0..be6e327732d 100644
--- a/doc/administration/geo/setup/database.md
+++ b/doc/administration/geo/setup/database.md
@@ -75,7 +75,7 @@ There is an [issue where support is being discussed](https://gitlab.com/gitlab-o
```ruby
##
## The unique identifier for the Geo site. See
- ## https://docs.gitlab.com/ee/user/admin_area/geo_nodes.html#common-settings
+ ## https://docs.gitlab.com/ee/administration/geo_nodes.html#common-settings
##
gitlab_rails['geo_node_name'] = '<site_name_here>'
```
@@ -193,8 +193,8 @@ There is an [issue where support is being discussed](https://gitlab.com/gitlab-o
| `postgresql['md5_auth_cidr_addresses']` | **Primary** and **Secondary** sites' public or VPC private addresses. |
If you are using Google Cloud Platform, SoftLayer, or any other vendor that
- provides a virtual private cloud (VPC), you can use the **primary** and **secondary** sites'
- private addresses (corresponds to "internal address" for Google Cloud Platform) for
+ provides a virtual private cloud (VPC), we recommend using the **primary**
+ and **secondary** sites' "private" or "internal" addresses for
`postgresql['md5_auth_cidr_addresses']` and `postgresql['listen_address']`.
The `listen_address` option opens PostgreSQL up to network connections with the interface
@@ -468,7 +468,8 @@ data before running `pg_basebackup`.
sudo -i
```
-1. Choose a database-friendly name to use for your **secondary** site to
+1. Choose a [database-friendly name](https://www.postgresql.org/docs/13/warm-standby.html#STREAMING-REPLICATION-SLOTS-MANIPULATION)
+   for your **secondary** site to
use as the replication slot name. For example, if your domain is
`secondary.geo.example.com`, use `secondary_example` as the slot
name as shown in the commands below.
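After replication is set up, a hedged way to confirm that the slot you chose exists and is active on the primary's database (`secondary_example` is the example name from the text above):

```shell
# List replication slots on the primary; expect your chosen slot, for example
# `secondary_example`, to show active = t once the secondary is streaming.
gitlab-psql -c 'SELECT slot_name, active FROM pg_replication_slots;'
```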
diff --git a/doc/administration/geo/setup/external_database.md b/doc/administration/geo/setup/external_database.md
index 81541d1cb9c..50383546da3 100644
--- a/doc/administration/geo/setup/external_database.md
+++ b/doc/administration/geo/setup/external_database.md
@@ -39,7 +39,7 @@ developed and tested. We aim to be compatible with most external
##
## The unique identifier for the Geo site. See
- ## https://docs.gitlab.com/ee/user/admin_area/geo_nodes.html#common-settings
+ ## https://docs.gitlab.com/ee/administration/geo_nodes.html#common-settings
##
gitlab_rails['geo_node_name'] = '<site_name_here>'
```
diff --git a/doc/administration/geo/setup/index.md b/doc/administration/geo/setup/index.md
index 359f706a8aa..8ac64a963bb 100644
--- a/doc/administration/geo/setup/index.md
+++ b/doc/administration/geo/setup/index.md
@@ -34,8 +34,8 @@ If both Geo sites are based on the [1K reference architecture](../../reference_a
- [Using Linux package PostgreSQL instances](database.md).
- [Using external PostgreSQL instances](external_database.md)
1. [Configure GitLab](../replication/configuration.md) to set the **primary** and **secondary** sites.
-1. Recommended: [Configure unified URLs](../secondary_proxy/index.md) to use a single, unified URL for all Geo sites.
-1. Optional: [Configure Object storage replication](../../object_storage.md)
+1. Recommended: [Configure unified URLs](../secondary_proxy/index.md#set-up-a-unified-url-for-geo-sites) to use a single, unified URL for all Geo sites.
+1. Optional: [Configure Object storage replication](../replication/object_storage.md).
1. Optional: [Configure a secondary LDAP server](../../auth/ldap/index.md) for the **secondary** sites. See [notes on LDAP](../index.md#ldap).
1. Optional: [Configure Container Registry for the secondary site](../replication/container_registry.md).
1. Follow the [Using a Geo Site](../replication/usage.md) guide.