author | GitLab Bot <gitlab-bot@gitlab.com> | 2022-05-19 10:33:21 +0300 |
---|---|---|
committer | GitLab Bot <gitlab-bot@gitlab.com> | 2022-05-19 10:33:21 +0300 |
commit | 36a59d088eca61b834191dacea009677a96c052f (patch) | |
tree | e4f33972dab5d8ef79e3944a9f403035fceea43f /doc/administration/geo/replication | |
parent | a1761f15ec2cae7c7f7bbda39a75494add0dfd6f (diff) |
Add latest changes from gitlab-org/gitlab@15-0-stable-ee (tag: v15.0.0-rc42)
Diffstat (limited to 'doc/administration/geo/replication')
4 files changed, 136 insertions, 18 deletions
diff --git a/doc/administration/geo/replication/configuration.md b/doc/administration/geo/replication/configuration.md
index 16b4848e6d3..adf89c24a4e 100644
--- a/doc/administration/geo/replication/configuration.md
+++ b/doc/administration/geo/replication/configuration.md
@@ -129,7 +129,7 @@ keys must be manually replicated to the **secondary** site.

    ```shell
    chown root:root /etc/ssh/ssh_host_*_key*
-   chmod 0600 /etc/ssh/ssh_host_*_key*
+   chmod 0600 /etc/ssh/ssh_host_*_key
    ```

 1. To verify key fingerprint matches, execute the following command on both primary and secondary nodes on each site:
@@ -241,15 +241,60 @@ that the **secondary** site can act on those notifications immediately.
 Be sure the _secondary_ site is running and accessible. You can sign in to the _secondary_ site with the same credentials as were used with the _primary_ site.

-### Step 4. (Optional) Configuring the **secondary** site to trust the **primary** site
+### Step 4. (Optional) Using custom certificates

-You can safely skip this step if your **primary** site uses a CA-issued HTTPS certificate.
+You can safely skip this step if:

-If your **primary** site is using a self-signed certificate for *HTTPS* support, you
-need to add that certificate to the **secondary** site's trust store. Retrieve the
-certificate from the **primary** site and follow
-[these instructions](https://docs.gitlab.com/omnibus/settings/ssl.html#install-custom-public-certificates)
-on the **secondary** site.
+- Your **primary** site uses a public CA-issued HTTPS certificate.
+- Your **primary** site only connects to external services with CA-issued (not self-signed) HTTPS certificates.
+
+#### Custom or self-signed certificate for inbound connections
+
+If your GitLab Geo **primary** site uses a custom or [self-signed certificate to secure inbound HTTPS connections](https://docs.gitlab.com/omnibus/settings/ssl.html#install-custom-public-certificates), this certificate can either be single-domain certificate or multi-domain.
+
+Install the correct certificate based on your certificate type:
+
+- **Multi-domain certificate** that includes both primary and secondary site domains: Install the certificate at `/etc/gitlab/ssl` on all **Rails, Sidekiq, and Gitaly** nodes in the **secondary** site.
+- **Single-domain certificate** where the certificates are specific to each Geo site domain: Generate a valid certificate for your **secondary** site's domain and install it at `/etc/gitlab/ssl` per [these instructions](https://docs.gitlab.com/omnibus/settings/ssl.html#install-custom-public-certificates) on all **Rails, Sidekiq, and Gitaly** nodes in the **secondary** site.
+
+#### Connecting to external services that use customer certificates
+
+A copy of the self-signed certificate for the external service needs to be added to the trust store on all the **primary** site's nodes that require access to the service.
+
+For the **secondary** site to be able to access the same external services, these certificates *must* be added to the **secondary** site's trust store.
+
+If your **primary** site is using a [custom or self-signed certificate for inbound HTTPS connections](#custom-or-self-signed-certificate-for-inbound-connections), the **primary** site's certificate needs to be added to the **secondary** site's trust store:
+
+1. SSH into each **Rails, Sidekiq, and Gitaly node on your secondary** site and login as root:
+
+   ```shell
+   sudo -i
+   ```
+
+1. Copy the trusted certs from the **primary** site:
+
+   If you can access one of the nodes on your **primary** site serving SSH traffic using the root user:
+
+   ```shell
+   scp root@<primary_site_node_fqdn>:/etc/gitlab/trusted-certs/* /etc/gitlab/trusted-certs
+   ```
+
+   If you only have access through a user with sudo privileges:
+
+   ```shell
+   # Run this from the node on your primary site:
+   sudo tar --transform 's/.*\///g' -zcvf ~/geo-trusted-certs.tar.gz /etc/gitlab/trusted-certs/*
+
+   # Run this on each node on your secondary site:
+   scp <user_with_sudo>@<primary_site_node_fqdn>:geo-trusted-certs.tar.gz .
+   tar zxvf ~/geo-trusted-certs.tar.gz -C /etc/gitlab/trusted-certs
+   ```
+
+1. Reconfigure each updated **Rails, Sidekiq, and Gitaly node in your secondary** site:
+
+   ```shell
+   sudo gitlab-ctl reconfigure
+   ```

 ### Step 5. Enable Git access over HTTP/HTTPS

diff --git a/doc/administration/geo/replication/multiple_servers.md b/doc/administration/geo/replication/multiple_servers.md
index 87b1aa7fc44..7b800817461 100644
--- a/doc/administration/geo/replication/multiple_servers.md
+++ b/doc/administration/geo/replication/multiple_servers.md
@@ -119,7 +119,7 @@ NOTE:
 [NFS](../../nfs.md) can be used in place of Gitaly but is not recommended.

-### Step 2: Configure Postgres streaming replication
+### Step 2: Configure PostgreSQL streaming replication

 Follow the [Geo database replication instructions](../setup/database.md).

@@ -261,7 +261,7 @@ nodes connect to the databases.

 NOTE:
 Make sure that current node's IP is listed in `postgresql['md5_auth_cidr_addresses']` setting of the read-replica database to
-allow Rails on this node to connect to Postgres.
+allow Rails on this node to connect to PostgreSQL.

 After making these changes [Reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure) so the changes take effect.

diff --git a/doc/administration/geo/replication/troubleshooting.md b/doc/administration/geo/replication/troubleshooting.md
index 871d6041066..5a29c5a3c54 100644
--- a/doc/administration/geo/replication/troubleshooting.md
+++ b/doc/administration/geo/replication/troubleshooting.md
@@ -419,6 +419,21 @@ sudo gitlab-ctl reconfigure

 To help us resolve this problem, consider commenting on [the issue](https://gitlab.com/gitlab-org/gitlab/-/issues/4489).

+### Message: `FATAL: could not connect to the primary server: server certificate for "PostgreSQL" does not match host name`
+
+This happens because the PostgreSQL certificate that the Omnibus GitLab package automatically creates contains
+the Common Name `PostgreSQL`, but the replication is connecting to a different host and GitLab attempts to use
+the `verify-full` SSL mode by default.
+
+In order to fix this, you can either:
+
+- Use the `--sslmode=verify-ca` argument with the `replicate-geo-database` command.
+- For an already replicated database, change `sslmode=verify-full` to `sslmode=verify-ca`
+  in `/var/opt/gitlab/postgresql/data/gitlab-geo.conf` and run `gitlab-ctl restart postgresql`.
+- [Configure SSL for PostgreSQL](https://docs.gitlab.com/omnibus/settings/database.html#configuring-ssl)
+  with a custom certificate (including the host name that's used to connect to the database in the CN or SAN)
+  instead of using the automatically generated certificate.
+
 ### Message: `LOG: invalid CIDR mask in address`

 This happens on wrongly-formatted addresses in `postgresql['md5_auth_cidr_addresses']`.
@@ -637,9 +652,9 @@ to start again from scratch, there are a few steps that can help you:
 1. Reset the Tracking Database.

    ```shell
-   gitlab-rake geo:db:drop  # on a secondary app node
-   gitlab-ctl reconfigure   # on the tracking database node
-   gitlab-rake geo:db:setup # on a secondary app node
+   gitlab-rake db:drop:geo    # on a secondary app node
+   gitlab-ctl reconfigure     # on the tracking database node
+   gitlab-rake db:migrate:geo # on a secondary app node
    ```

 1. Restart previously stopped services.
@@ -977,7 +992,7 @@ On the **primary** node:
 1. On the left sidebar, select **Geo > Nodes**.
 1. Find the affected **secondary** site and select **Edit**.
 1. Ensure the **URL** field matches the value found in `/etc/gitlab/gitlab.rb`
-   in `external_url "https://gitlab.example.com"` on the frontend server(s) of
+   in `external_url "https://gitlab.example.com"` on the frontend servers of
    the **secondary** node.

 ## Fixing common errors
@@ -1042,7 +1057,7 @@ Make sure you follow the [Geo database replication](../setup/database.md) instru
 If you are using Omnibus GitLab installation, something might have failed during upgrade. You can:

 - Run `sudo gitlab-ctl reconfigure`.
-- Manually trigger the database migration by running: `sudo gitlab-rake geo:db:migrate` as root on the **secondary** node.
+- Manually trigger the database migration by running: `sudo gitlab-rake db:migrate:geo` as root on the **secondary** node.

 ### GitLab indicates that more than 100% of repositories were synced

@@ -1101,12 +1116,70 @@ This is due to [Pages data not being managed by Geo](datatypes.md#limitations-on
 Find advice to resolve those error messages in the [Pages administration documentation](../../../administration/pages/index.md#404-error-after-promoting-a-geo-secondary-to-a-primary-node).

+### Primary site returns 500 error when accessing `/admin/geo/replication/projects`
+
+Navigating to **Admin > Geo > Replication** (or `/admin/geo/replication/projects`) on a primary Geo site, shows a 500 error, while that same link on the secondary works fine. The primary's `production.log` has a similar entry to the following:
+
+```plaintext
+Geo::TrackingBase::SecondaryNotConfigured: Geo secondary database is not configured
+        from ee/app/models/geo/tracking_base.rb:26:in `connection'
+        [..]
+        from ee/app/views/admin/geo/projects/_all.html.haml:1
+```
+
+On a Geo primary site this error can be ignored.
+
+This happens because GitLab is attempting to display registries from the [Geo tracking database](../../../administration/geo/#geo-tracking-database) which doesn't exist on the primary site (only the original projects exist on the primary; no replicated projects are present, therefore no tracking database exists).
+
 ## Fixing client errors

-### Authorization errors from LFS HTTP(s) client requests
+### Authorization errors from LFS HTTP(S) client requests

 You may have problems if you're running a version of [Git LFS](https://git-lfs.github.com/) before 2.4.2. As noted in [this authentication issue](https://github.com/git-lfs/git-lfs/issues/3025), requests redirected from the secondary to the primary node do not properly send the Authorization header. This may result in either an infinite `Authorization <-> Redirect` loop, or Authorization error messages.
+
+## Recovering from a partial failover
+
+The partial failover to a secondary Geo *site* may be the result of a temporary/transient issue. Therefore, first attempt to run the promote command again.
+
+1. SSH into every Sidekiq, PostgresSQL, Gitaly, and Rails node in the **secondary** site and run one of the following commands:
+
+   - To promote the secondary node to primary:
+
+     ```shell
+     sudo gitlab-ctl geo promote
+     ```
+
+   - To promote the secondary node to primary **without any further confirmation**:
+
+     ```shell
+     sudo gitlab-ctl geo promote --force
+     ```
+
+1. Verify you can connect to the newly-promoted **primary** site using the URL used previously for the **secondary** site.
+1. If **successful**, the **secondary** site is now promoted to the **primary** site.
+
+If the above steps are **not successful**, proceed through the next steps:
+
+1. SSH to every Sidekiq, PostgresSQL, Gitaly and Rails node in the **secondary** site and perform the following operations:
+
+   - Create a `/etc/gitlab/gitlab-cluster.json` file with the following content:
+
+     ```shell
+     {
+       "primary": true,
+       "secondary": false
+     }
+     ```
+
+   - Reconfigure GitLab for the changes to take effect:
+
+     ```shell
+     sudo gitlab-ctl reconfigure
+     ```
+
+1. Verify you can connect to the newly-promoted **primary** site using the URL used previously for the **secondary** site.
+1. If successful, the **secondary** site is now promoted to the **primary** site.
diff --git a/doc/administration/geo/replication/version_specific_updates.md b/doc/administration/geo/replication/version_specific_updates.md
index b0797445890..6b617a21be8 100644
--- a/doc/administration/geo/replication/version_specific_updates.md
+++ b/doc/administration/geo/replication/version_specific_updates.md
@@ -12,9 +12,9 @@ for updating Geo sites.

 ## Updating to 14.9

-**DO NOT** update to GitLab 14.9.0.
+**DO NOT** update to GitLab 14.9.0. Instead, use 14.9.1 or later.

-We've discovered an issue with Geo's CI verification feature that may [cause job traces to be lost](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/6664). This issue will be fixed in the next patch release.
+We've discovered an issue with Geo's CI verification feature that may [cause job traces to be lost](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/6664). This issue was fixed in [the GitLab 14.9.1 patch release](https://about.gitlab.com/releases/2022/03/23/gitlab-14-9-1-released/).

 If you have already updated to GitLab 14.9.0, you can disable the feature causing the issue by [disabling the `geo_job_artifact_replication` feature flag](../../feature_flags.md#how-to-enable-and-disable-features-behind-flags).
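
Reviewer's note: the new troubleshooting entry for the `server certificate for "PostgreSQL" does not match host name` error tells the reader to change `sslmode=verify-full` to `sslmode=verify-ca` in `gitlab-geo.conf` by hand. A minimal sketch of that edit follows, assuming a POSIX shell. The function name `relax_sslmode` and the `.bak` backup suffix are hypothetical and not part of GitLab or of this commit. The file path and the restart command in the comments are the ones named in the added documentation.

```shell
#!/bin/sh
# Hypothetical helper (not part of GitLab): relax the replication SSL mode
# in the configuration file given as the first argument.
relax_sslmode() {
  conf_file="$1"

  # Keep a copy of the original so the change can be reverted.
  cp -p "$conf_file" "$conf_file.bak" || return 1

  # Rewrite from the backup into the existing file, which keeps the
  # original file's owner and permissions.
  sed 's/sslmode=verify-full/sslmode=verify-ca/g' "$conf_file.bak" > "$conf_file"
}

# Example invocation, using the path named in the troubleshooting text
# (run as root on the secondary site's database node):
#   relax_sslmode /var/opt/gitlab/postgresql/data/gitlab-geo.conf
#   gitlab-ctl restart postgresql
```

Writing back into the existing file, rather than moving a temporary file over it, is a deliberate choice in this sketch so that PostgreSQL can still read the file after the edit.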