author | GitLab Bot <gitlab-bot@gitlab.com> | 2023-08-18 13:50:51 +0300 |
---|---|---|
committer | GitLab Bot <gitlab-bot@gitlab.com> | 2023-08-18 13:50:51 +0300 |
commit | db384e6b19af03b4c3c82a5760d83a3fd79f7982 (patch) | |
tree | 34beaef37df5f47ccbcf5729d7583aae093cffa0 /doc/administration/geo/setup/two_single_node_sites.md | |
parent | 54fd7b1bad233e3944434da91d257fa7f63c3996 (diff) |
Add latest changes from gitlab-org/gitlab@16-3-stable-ee (v16.3.0-rc42)
Diffstat (limited to 'doc/administration/geo/setup/two_single_node_sites.md')
-rw-r--r-- | doc/administration/geo/setup/two_single_node_sites.md | 638 |
1 files changed, 638 insertions, 0 deletions
diff --git a/doc/administration/geo/setup/two_single_node_sites.md b/doc/administration/geo/setup/two_single_node_sites.md
new file mode 100644
index 00000000000..00002d501b2
--- /dev/null
+++ b/doc/administration/geo/setup/two_single_node_sites.md
@@ -0,0 +1,638 @@
+---
+stage: Systems
+group: Geo
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
+---
+
+# Set up Geo for two single-node sites **(PREMIUM SELF)**
+
+The following guide provides concise instructions on how to deploy GitLab Geo for a two single-node site installation using two Linux package instances.
+
+Prerequisites:
+
+- You have at least two independently working GitLab sites.
+  To create the sites, see the [GitLab reference architectures documentation](../../reference_architectures/index.md).
+  - One GitLab site serves as the **Geo primary site**. You can use different reference architecture sizes for each Geo site. If you already have a working GitLab instance, you can use it as the primary site.
+  - The second GitLab site serves as the **Geo secondary site**. Geo supports multiple secondary sites.
+- The Geo primary site has at least a [GitLab Premium](https://about.gitlab.com/pricing/) license.
+  You need only one license for all sites.
+- Confirm all sites meet the [requirements for running Geo](../index.md#requirements-for-running-geo).
+
+## Set up Geo for Linux package (Omnibus)
+
+Prerequisites:
+
+- You use PostgreSQL 12 or later,
+  which includes the [`pg_basebackup` tool](https://www.postgresql.org/docs/12/app-pgbasebackup.html).
+
+### Configure the primary site
+
+1. SSH into your GitLab primary site and sign in as root:
+
+   ```shell
+   sudo -i
+   ```
+
+1. Add a unique Geo site name to `/etc/gitlab/gitlab.rb`:
+
+   ```ruby
+   ##
+   ## The unique identifier for the Geo site. See
+   ## https://docs.gitlab.com/ee/user/admin_area/geo_nodes.html#common-settings
+   ##
+   gitlab_rails['geo_node_name'] = '<site_name_here>'
+   ```
+
+1. To apply the change, reconfigure the primary site:
+
+   ```shell
+   gitlab-ctl reconfigure
+   ```
+
+1. Define the site as your primary Geo site:
+
+   ```shell
+   gitlab-ctl set-geo-primary-node
+   ```
+
+   This command uses the `external_url` defined in `/etc/gitlab/gitlab.rb`.
+
+1. Create a password for the `gitlab` database user.
+
+   1. Generate an MD5 hash of the desired password:
+
+      ```shell
+      gitlab-ctl pg-password-md5 gitlab
+      # Enter password: <your_password_here>
+      # Confirm password: <your_password_here>
+      # fca0b89a972d69f00eb3ec98a5838484
+      ```
+
+   1. Edit `/etc/gitlab/gitlab.rb`:
+
+      ```ruby
+      # Fill with the hash generated by `gitlab-ctl pg-password-md5 gitlab`
+      postgresql['sql_user_password'] = '<md5_hash_of_your_password>'
+
+      # Every node that runs Puma or Sidekiq needs to have the database
+      # password specified as below. If you have a high-availability setup, this
+      # must be present in all application nodes.
+      gitlab_rails['db_password'] = '<your_password_here>'
+      ```
+
+1. Define a password for the database [replication user](https://wiki.postgresql.org/wiki/Streaming_Replication).
+   Use the username defined in `/etc/gitlab/gitlab.rb` under the `postgresql['sql_replication_user']`
+   setting. The default value is `gitlab_replicator`.
+
+   1. Generate an MD5 hash of the desired password:
+
+      ```shell
+      gitlab-ctl pg-password-md5 gitlab_replicator
+
+      # Enter password: <your_password_here>
+      # Confirm password: <your_password_here>
+      # 950233c0dfc2f39c64cf30457c3b7f1e
+      ```
+
+   1. Edit `/etc/gitlab/gitlab.rb`:
+
+      ```ruby
+      # Fill with the hash generated by `gitlab-ctl pg-password-md5 gitlab_replicator`
+      postgresql['sql_replication_password'] = '<md5_hash_of_your_password>'
+      ```
+
+   1. Optional. If you use an external database not managed by the Linux package, you must
+      create the `gitlab_replicator` user and define a password for that user manually:
+
+      ```sql
+      --- Create a new user 'gitlab_replicator'
+      CREATE USER gitlab_replicator;
+
+      --- Set/change a password and grant replication privilege
+      ALTER USER gitlab_replicator WITH REPLICATION ENCRYPTED PASSWORD '<replication_password>';
+      ```
+
+1. In `/etc/gitlab/gitlab.rb`, set the role to [`geo_primary_role`](https://docs.gitlab.com/omnibus/roles/#gitlab-geo-roles):
+
+   ```ruby
+   ## Geo Primary role
+   roles(['geo_primary_role'])
+   ```
+
+1. Configure PostgreSQL to listen on network interfaces:
+
+   1. To look up the address of a Geo site, SSH into the Geo site and execute:
+
+      ```shell
+      ##
+      ## Private address
+      ##
+      ip route get 255.255.255.255 | awk '{print "Private address:", $NF; exit}'
+
+      ##
+      ## Public address
+      ##
+      echo "External address: $(curl --silent "ipinfo.io/ip")"
+      ```
+
+      In most cases, the following addresses are used to configure GitLab
+      Geo:
+
+      | Configuration                           | Address                                                               |
+      |:----------------------------------------|:----------------------------------------------------------------------|
+      | `postgresql['listen_address']`          | Primary site public or VPC private address.                           |
+      | `postgresql['md5_auth_cidr_addresses']` | Primary and secondary site public or VPC private addresses.           |
+
+      If you use Google Cloud Platform, SoftLayer, or any other vendor that
+      provides a virtual private cloud (VPC), you can use the primary and secondary site
+      private addresses (which correspond to "internal address" for Google Cloud Platform) for
+      `postgresql['md5_auth_cidr_addresses']` and `postgresql['listen_address']`.
+
+      NOTE:
+      If you need to use `0.0.0.0` or `*` as the `listen_address`, you also must add
+      `127.0.0.1/32` to the `postgresql['md5_auth_cidr_addresses']` setting, to allow
+      Rails to connect through `127.0.0.1`. For more information, see [issue 5258](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5258).
+
+      Depending on your network configuration, the suggested addresses might
+      be incorrect. If your primary and secondary sites connect over a local
+      area network, or a virtual network connecting availability zones like
+      [Amazon's VPC](https://aws.amazon.com/vpc/) or [Google's VPC](https://cloud.google.com/vpc/),
+      you should use the secondary site private address for `postgresql['md5_auth_cidr_addresses']`.
+
+   1. Add the following lines to `/etc/gitlab/gitlab.rb`. Be sure to replace the IP
+      addresses with addresses appropriate to your network configuration:
+
+      ```ruby
+      ##
+      ## Primary address
+      ## - replace '<primary_site_ip>' with the public or VPC address of your Geo primary node
+      ##
+      postgresql['listen_address'] = '<primary_site_ip>'
+
+      ##
+      # Allow PostgreSQL client authentication from the primary and secondary IPs. These IPs may be
+      # public or VPC addresses in CIDR format, for example ['198.51.100.1/32', '198.51.100.2/32']
+      ##
+      postgresql['md5_auth_cidr_addresses'] = ['<primary_site_ip>/32', '<secondary_site_ip>/32']
+      ```
+
+1. Disable automatic database migrations temporarily until PostgreSQL is restarted and listening on the private address.
+   In `/etc/gitlab/gitlab.rb`, set `gitlab_rails['auto_migrate']` to false:
+
+   ```ruby
+   ## Disable automatic database migrations
+   gitlab_rails['auto_migrate'] = false
+   ```
+
+1. To apply these changes, reconfigure GitLab and restart PostgreSQL:
+
+   ```shell
+   gitlab-ctl reconfigure
+   gitlab-ctl restart postgresql
+   ```
+
+1. To re-enable migrations, edit `/etc/gitlab/gitlab.rb` and change `gitlab_rails['auto_migrate']` to `true`:
+
+   ```ruby
+   gitlab_rails['auto_migrate'] = true
+   ```
+
+   Save the file and reconfigure GitLab:
+
+   ```shell
+   gitlab-ctl reconfigure
+   ```
+
+   The PostgreSQL server is set up to accept remote connections.
+
+1. Run `netstat -plnt | grep 5432` to ensure that PostgreSQL is listening on port
+   `5432` on the primary site private address.
+
+1. A certificate was automatically generated when GitLab was reconfigured. The certificate
+   is used automatically to protect your PostgreSQL traffic from
+   eavesdroppers. To protect against active ("man-in-the-middle") attackers,
+   copy the certificate to the secondary site:
+
+   1. Make a copy of `server.crt` on the primary site:
+
+      ```shell
+      cat ~gitlab-psql/data/server.crt
+      ```
+
+   1. Save the output for when you configure the secondary site.
+      The certificate is not sensitive data.
+
+   The certificate is created with a generic `PostgreSQL` common name.
+   To prevent hostname mismatch errors, you must use the `verify-ca`
+   mode when replicating the database.
+
+### Configure the secondary server
+
+1. SSH into your GitLab secondary site and sign in as root:
+
+   ```shell
+   sudo -i
+   ```
+
+1. To prevent any commands from running before the site is configured, stop the application server and Sidekiq:
+
+   ```shell
+   gitlab-ctl stop puma
+   gitlab-ctl stop sidekiq
+   ```
+
+1. [Check TCP connectivity](../../raketasks/maintenance.md) to the primary site PostgreSQL server:
+
+   ```shell
+   gitlab-rake gitlab:tcp_check[<primary_site_ip>,5432]
+   ```
+
+   If this step fails, you might be using the wrong IP address, or a firewall might
+   be preventing access to the site. Check the IP address, paying close
+   attention to the difference between public and private addresses.
+   If a firewall is present, ensure the secondary site is allowed to connect to the
+   primary site on port 5432.
+
+1. In the secondary site, create a file called `server.crt` and add the copy of the certificate you made when you configured the primary site.
+
+   ```shell
+   editor server.crt
+   ```
+
+1. To set up PostgreSQL TLS verification on the secondary site, install `server.crt`:
+
+   ```shell
+   install \
+      -D \
+      -o gitlab-psql \
+      -g gitlab-psql \
+      -m 0400 \
+      -T server.crt ~gitlab-psql/.postgresql/root.crt
+   ```
+
+   PostgreSQL now recognizes only this exact certificate when verifying TLS
+   connections. The certificate can only be replicated by someone with access
+   to the private key, which is present only on the primary site.
+
+1. Test that the `gitlab-psql` user can connect to the primary site database.
+   The default Linux package database name is `gitlabhq_production`:
+
+   ```shell
+   sudo \
+      -u gitlab-psql /opt/gitlab/embedded/bin/psql \
+      --list \
+      -U gitlab_replicator \
+      -d "dbname=gitlabhq_production sslmode=verify-ca" \
+      -W \
+      -h <primary_site_ip>
+   ```
+
+   When prompted, enter the plaintext password you set for the `gitlab_replicator` user.
+   If all worked correctly, you should see the list of the primary site databases.
+
+1. Edit `/etc/gitlab/gitlab.rb` and set the role to `geo_secondary_role`:
+
+   ```ruby
+   ##
+   ## Geo Secondary role
+   ## - configure dependent flags automatically to enable Geo
+   ##
+   roles(['geo_secondary_role'])
+   ```
+
+   For more information, see [Geo roles](https://docs.gitlab.com/omnibus/roles/#gitlab-geo-roles).
+
+1. To configure PostgreSQL, edit `/etc/gitlab/gitlab.rb` and add the following:
+
+   ```ruby
+   ##
+   ## Secondary address
+   ## - replace '<secondary_site_ip>' with the public or VPC address of your Geo secondary site
+   ##
+   postgresql['listen_address'] = '<secondary_site_ip>'
+   postgresql['md5_auth_cidr_addresses'] = ['<secondary_site_ip>/32']
+
+   ##
+   ## Database credentials password (defined previously in primary site)
+   ## - replicate same values here as defined in primary site
+   ##
+   postgresql['sql_replication_password'] = '<md5_hash_of_your_password>'
+   postgresql['sql_user_password'] = '<md5_hash_of_your_password>'
+   gitlab_rails['db_password'] = '<your_password_here>'
+   ```
+
+   Be sure to replace the IP addresses with addresses appropriate to your network configuration.
+
+1. To apply the changes, reconfigure GitLab:
+
+   ```shell
+   gitlab-ctl reconfigure
+   ```
+
+1. To apply the IP address change, restart PostgreSQL:
+
+   ```shell
+   gitlab-ctl restart postgresql
+   ```
+
+### Replicate the database
+
+Connect the database on the secondary site to
+the database on the primary site.
+You can use the script below to replicate the
+database and create the needed files for streaming replication.
+
+The script uses the default Linux package directories.
+If you changed the defaults, replace the directory and path
+names in the script below with your own names.
+
+WARNING:
+Run the replication script on only the secondary site.
+The script removes all PostgreSQL data before it runs `pg_basebackup`,
+which can lead to data loss.
+
+To replicate the database:
+
+1. SSH into your GitLab secondary site and sign in as root:
+
+   ```shell
+   sudo -i
+   ```
+
+1. Choose a database-friendly name for your secondary site to
+   use as the replication slot name. For example, if your domain is
+   `secondary.geo.example.com`, use `secondary_example` as the slot
+   name. Replication slot names must only contain lowercase letters,
+   numbers, and the underscore character.
+
+1. Execute the following command to back up and restore the database, and begin the replication.
+
+   WARNING:
+   Each Geo secondary site must have its own unique replication slot name.
+   Using the same slot name between two secondaries breaks PostgreSQL replication.
+
+   ```shell
+   gitlab-ctl replicate-geo-database \
+      --slot-name=<secondary_site_name> \
+      --host=<primary_site_ip> \
+      --sslmode=verify-ca
+   ```
+
+   When prompted, enter the plaintext password you set up for the `gitlab_replicator` user.
+
+The replication process is complete.
+
+## Configure a new secondary site
+
+After the replication process is complete, you need to [configure fast lookup of authorized SSH keys](../../operations/fast_ssh_key_lookup.md).
+
+NOTE:
+Authentication is handled by the primary site. Don't set up custom authentication for the secondary site.
+Any change that requires access to the Admin Area should be made in the primary site, because the
+secondary site is a read-only copy.
+
+### Manually replicate secret GitLab values
+
+GitLab stores a number of secret values in `/etc/gitlab/gitlab-secrets.json`.
+This JSON file must be the same across each of the site nodes.
+You must manually replicate the secret file across all of your secondary sites, although
+[issue 3789](https://gitlab.com/gitlab-org/gitlab/-/issues/3789) proposes to change this behavior.
+
+1. SSH into a Rails node on your primary site, and execute the command below:
+
+   ```shell
+   sudo cat /etc/gitlab/gitlab-secrets.json
+   ```
+
+   This displays the secrets you must replicate, in JSON format.
+
+1. SSH into each node on your secondary Geo site and sign in as root:
+
+   ```shell
+   sudo -i
+   ```
+
+1. Make a backup of any existing secrets:
+
+   ```shell
+   mv /etc/gitlab/gitlab-secrets.json /etc/gitlab/gitlab-secrets.json.`date +%F`
+   ```
+
+1. Copy `/etc/gitlab/gitlab-secrets.json` from the primary site Rails node to each secondary site node.
+   You can also copy-and-paste the file contents between nodes:
+
+   ```shell
+   sudo editor /etc/gitlab/gitlab-secrets.json
+
+   # paste the output of the `cat` command you ran on the primary
+   # save and exit
+   ```
+
+1. Ensure the file permissions are correct:
+
+   ```shell
+   chown root:root /etc/gitlab/gitlab-secrets.json
+   chmod 0600 /etc/gitlab/gitlab-secrets.json
+   ```
+
+1. To apply the changes, reconfigure every Rails, Sidekiq and Gitaly secondary site node:
+
+   ```shell
+   gitlab-ctl reconfigure
+   gitlab-ctl restart
+   ```
+
+### Manually replicate the primary site SSH host keys
+
+1. SSH into each node on your secondary site and sign in as root:
+
+   ```shell
+   sudo -i
+   ```
+
+1. Back up any existing SSH host keys:
+
+   ```shell
+   find /etc/ssh -iname 'ssh_host_*' -exec cp {} {}.backup.`date +%F` \;
+   ```
+
+1. Copy OpenSSH host keys from the primary site.
+
+   - If you can access as root one of the primary site nodes serving SSH traffic (usually, the main GitLab Rails application nodes):
+
+     ```shell
+     # Run this from the secondary site, replace `<primary_site_fqdn>` with the IP or FQDN of the server
+     scp root@<primary_site_fqdn>:/etc/ssh/ssh_host_*_key* /etc/ssh
+     ```
+
+   - If you only have access through a user with `sudo` privileges:
+
+     ```shell
+     # Run this from the node on your primary site:
+     sudo tar --transform 's/.*\///g' -zcvf ~/geo-host-key.tar.gz /etc/ssh/ssh_host_*_key*
+
+     # Run this on each node on your secondary site:
+     scp <user_with_sudo>@<primary_site_fqdn>:geo-host-key.tar.gz .
+     tar zxvf ~/geo-host-key.tar.gz -C /etc/ssh
+     ```
+
+1. For each secondary site node, ensure the file permissions are correct:
+
+   ```shell
+   chown root:root /etc/ssh/ssh_host_*_key*
+   chmod 0600 /etc/ssh/ssh_host_*_key
+   ```
+
+1. To verify that the key fingerprints match, run the following command on the nodes of both the primary and secondary sites:
+
+   ```shell
+   for file in /etc/ssh/ssh_host_*_key; do ssh-keygen -lf $file; done
+   ```
+
+   You should get an output similar to the following:
+
+   ```shell
+   1024 SHA256:FEZX2jQa2bcsd/fn/uxBzxhKdx4Imc4raXrHwsbtP0M root@serverhostname (DSA)
+   256 SHA256:uw98R35Uf+fYEQ/UnJD9Br4NXUFPv7JAUln5uHlgSeY root@serverhostname (ECDSA)
+   256 SHA256:sqOUWcraZQKd89y/QQv/iynPTOGQxcOTIXU/LsoPmnM root@serverhostname (ED25519)
+   2048 SHA256:qwa+rgir2Oy86QI+PZi/QVR+MSmrdrpsuH7YyKknC+s root@serverhostname (RSA)
+   ```
+
+   The output should be identical on both nodes.
+
+1. Verify you have the correct public keys for the existing private keys:
+
+   ```shell
+   # This will print the fingerprint for private keys:
+   for file in /etc/ssh/ssh_host_*_key; do ssh-keygen -lf $file; done
+
+   # This will print the fingerprint for public keys:
+   for file in /etc/ssh/ssh_host_*_key.pub; do ssh-keygen -lf $file; done
+   ```
+
+   The output for the public and private key commands should generate the same fingerprint.
+
+1. For each secondary site node, restart `sshd`:
+
+   ```shell
+   # Debian or Ubuntu installations
+   sudo service ssh reload
+
+   # CentOS installations
+   sudo service sshd reload
+   ```
+
+1. To verify SSH is still functional, from a new terminal, SSH into your GitLab secondary server.
+   If you can't connect, make sure you have the correct permissions.
+
+### Add the secondary site
+
+1. SSH into each Rails and Sidekiq node on your secondary site and sign in as root:
+
+   ```shell
+   sudo -i
+   ```
+
+1. Edit `/etc/gitlab/gitlab.rb` and add a unique name for your site.
+
+   ```ruby
+   ##
+   ## The unique identifier for the Geo site. See
+   ## https://docs.gitlab.com/ee/user/admin_area/geo_nodes.html#common-settings
+   ##
+   gitlab_rails['geo_node_name'] = '<site_name_here>'
+   ```
+
+   Save the unique name for the next steps.
+
+1. To apply the changes, reconfigure each Rails and Sidekiq node on your secondary site.
+
+   ```shell
+   gitlab-ctl reconfigure
+   ```
+
+1. Navigate to the primary node GitLab instance:
+   1. On the top bar, select **Main menu > Admin**.
+   1. On the left sidebar, select **Geo > Sites**.
+   1. Select **Add site**.
+
+      ![Add secondary site](../replication/img/adding_a_secondary_v15_8.png)
+
+   1. In **Name**, enter the value for `gitlab_rails['geo_node_name']` in
+      `/etc/gitlab/gitlab.rb`. The values must match exactly.
+   1. In **External URL**, enter the value for `external_url` in `/etc/gitlab/gitlab.rb`.
+      It's okay if one value ends in `/` and the other doesn't. Otherwise, the values must
+      match exactly.
+   1. Optional. In **Internal URL (optional)**, enter an internal URL for the primary site.
+   1. Optional. Select which groups or storage shards should be replicated by the
+      secondary site. To replicate all, leave the field blank. See [selective synchronization](../replication/configuration.md#selective-synchronization).
+   1. Select **Save changes**.
+1. SSH into each Rails and Sidekiq node on your secondary site and restart the services:
+
+   ```shell
+   gitlab-ctl restart
+   ```
+
+1. Check if there are any common issues with your Geo setup by running:
+
+   ```shell
+   gitlab-rake gitlab:geo:check
+   ```
+
+   If any of the checks fail, see the [troubleshooting documentation](../replication/troubleshooting.md).
+
+1. To verify that the secondary site is reachable, SSH into a Rails or Sidekiq server on your primary site, sign in as root, and run:
+
+   ```shell
+   gitlab-rake gitlab:geo:check
+   ```
+
+   If any of the checks fail, check the [troubleshooting documentation](../replication/troubleshooting.md).
+
+After the secondary site is added to the Geo administration page and restarted,
+the site automatically starts to replicate missing data from the primary site
+in a process known as backfill.
+
+Meanwhile, the primary site starts to notify each secondary site of any changes, so
+that the secondary site can act on the notifications immediately.
+
+Be sure the secondary site is running and accessible. You can sign in to the
+secondary site with the same credentials as were used with the primary site.
+
+### Enable Git access over HTTP/HTTPS and SSH
+
+Geo synchronizes repositories over HTTP/HTTPS, and therefore requires this clone
+method to be enabled. This is enabled by default.
+If you convert an existing site to Geo, you should check that the clone method is enabled.
+
+On the primary site:
+
+1. On the top bar, select **Main menu > Admin**.
+1. On the left sidebar, select **Settings > General**.
+1. Expand **Visibility and access controls**.
+1. If you use Git over SSH:
+   1. Ensure **Enabled Git access protocols** is set to **Both SSH and HTTP(S)**.
+   1. Follow [Fast lookup of authorized SSH keys in the database](../../operations/fast_ssh_key_lookup.md) on both the primary and secondary sites.
+1. If you don't use Git over SSH, set **Enabled Git access protocols** to **Only HTTP(S)**.
+
+### Verify proper functioning of the secondary site
+
+You can sign in to the secondary site with the same credentials you used with
+the primary site.
+
+After you sign in:
+
+1. On the top bar, select **Main menu > Admin**.
+1. On the left sidebar, select **Geo > Sites**.
+1. Verify that the site is correctly identified as a secondary Geo site, and that
+   Geo is enabled.
+
+The initial replication might take some time.
+You can monitor the synchronization process on each Geo site from the primary
+site **Geo Sites** dashboard in your browser.
+
+![Geo dashboard](../replication/img/geo_dashboard_v14_0.png)
+
+## Related topics
+
+- [Troubleshooting Geo](../replication/troubleshooting.md)
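
The following sketch is not part of the commit above. It illustrates the replication slot naming rule from "Replicate the database" (lowercase letters, numbers, and the underscore character only), which can be checked before running `gitlab-ctl replicate-geo-database`. The function names `is_valid_slot_name` and `suggest_slot_name` are hypothetical helpers, not GitLab commands, and the suggested name is only one possible candidate (the guide's own example, `secondary_example`, is hand-picked rather than derived).

```shell
#!/usr/bin/env bash

# Hypothetical helper (assumption, not shipped with GitLab):
# succeeds only if the name is non-empty and contains nothing but
# lowercase letters, digits, and underscores.
is_valid_slot_name() {
  case "$1" in
    '' | *[!abcdefghijklmnopqrstuvwxyz0123456789_]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical helper (assumption): lowercases a host name and replaces
# every character outside the allowed set with an underscore.
suggest_slot_name() {
  printf '%s' "$1" \
    | LC_ALL=C tr '[:upper:]' '[:lower:]' \
    | LC_ALL=C tr -c 'a-z0-9_' '_'
}
```

A candidate such as `secondary_example` passes the check, while `Secondary-Example` does not. The validated value would then be passed to `--slot-name` in the `gitlab-ctl replicate-geo-database` command shown in the guide.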