gitlab.com/gitlab-org/gitlab-foss.git
Diffstat (limited to 'doc/administration/geo')
-rw-r--r--  doc/administration/geo/disaster_recovery/runbooks/planned_failover_multi_node.md | 22
-rw-r--r--  doc/administration/geo/disaster_recovery/runbooks/planned_failover_single_node.md | 22
-rw-r--r--  doc/administration/geo/glossary.md | 4
-rw-r--r--  doc/administration/geo/replication/configuration.md | 4
-rw-r--r--  doc/administration/geo/replication/datatypes.md | 2
-rw-r--r--  doc/administration/geo/replication/docker_registry.md | 12
-rw-r--r--  doc/administration/geo/replication/faq.md | 8
-rw-r--r--  doc/administration/geo/replication/location_aware_git_url.md | 2
-rw-r--r--  doc/administration/geo/replication/remove_geo_node.md | 9
-rw-r--r--  doc/administration/geo/replication/security_review.md | 6
-rw-r--r--  doc/administration/geo/replication/troubleshooting.md | 78
-rw-r--r--  doc/administration/geo/replication/updating_the_geo_nodes.md | 4
-rw-r--r--  doc/administration/geo/replication/usage.md | 2
-rw-r--r--  doc/administration/geo/replication/version_specific_updates.md | 40
-rw-r--r--  doc/administration/geo/setup/database.md | 100
-rw-r--r--  doc/administration/geo/setup/external_database.md | 6
-rw-r--r--  doc/administration/geo/setup/index.md | 18
17 files changed, 229 insertions, 110 deletions
diff --git a/doc/administration/geo/disaster_recovery/runbooks/planned_failover_multi_node.md b/doc/administration/geo/disaster_recovery/runbooks/planned_failover_multi_node.md
index 4cfe781c7a4..16ae5bde062 100644
--- a/doc/administration/geo/disaster_recovery/runbooks/planned_failover_multi_node.md
+++ b/doc/administration/geo/disaster_recovery/runbooks/planned_failover_multi_node.md
@@ -19,7 +19,7 @@ This runbook is in **alpha**. For complete, production-ready documentation, see
| Geo site | Multi-node |
| Secondaries | One |
-This runbook will guide you through a planned failover of a multi-node Geo site
+This runbook guides you through a planned failover of a multi-node Geo site
with one secondary. The following [2000 user reference architecture](../../../../administration/reference_architectures/2k_users.md) is assumed:
```mermaid
@@ -46,7 +46,7 @@ graph TD
The load balancer node and optional NFS server are omitted for clarity.
-This guide will result in the following:
+This guide results in the following:
1. An offline primary.
1. A promoted secondary that is now the new primary.
@@ -76,7 +76,7 @@ On the **secondary** node:
If any objects are failing to replicate, this should be investigated before
scheduling the maintenance window. After a planned failover, anything that
-failed to replicate will be **lost**.
+failed to replicate is **lost**.
You can use the
[Geo status API](../../../../api/geo_nodes.md#retrieve-project-sync-or-verification-failures-that-occurred-on-the-current-node)
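For example, assuming an access token with API scope, querying the failures endpoint on the secondary might look like this (the hostname is a placeholder):

```shell
# List project sync/verification failures recorded on the current (secondary) node.
curl --header "PRIVATE-TOKEN: <your_access_token>" \
  "https://secondary.example.com/api/v4/geo_nodes/current/failures"
```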
@@ -117,10 +117,10 @@ follow these steps to avoid unnecessary data loss:
sudo iptables -A INPUT -p tcp --dport 443 -j REJECT
```
- From this point, users will be unable to view their data or make changes on the
- **primary** node. They will also be unable to log in to the **secondary** node.
- However, existing sessions will work for the remainder of the maintenance period, and
- public data will be accessible throughout.
+ From this point, users are unable to view their data or make changes on the
+ **primary** node. They are also unable to log in to the **secondary** node.
+ However, existing sessions continue to work for the remainder of the maintenance period, and
+ public data remains accessible throughout.
1. Verify the **primary** node is blocked to HTTP traffic by visiting it in a browser from
another IP address. The server should refuse the connection.
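A quick way to sketch this check from another machine (`primary.example.com` is a placeholder for your primary's address):

```shell
# Expect "Connection refused" if the iptables rules above are in place.
curl --max-time 10 -I https://primary.example.com
```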
@@ -170,8 +170,8 @@ follow these steps to avoid unnecessary data loss:
1. [Run an integrity check](../../../raketasks/check.md) to verify the integrity
of CI artifacts, LFS objects, and uploads in file storage.
- At this point, your **secondary** node will contain an up-to-date copy of everything the
- **primary** node has, meaning nothing will be lost when you fail over.
+ At this point, your **secondary** node contains an up-to-date copy of everything the
+ **primary** node has, meaning nothing is lost when you fail over.
1. In this final step, you need to permanently disable the **primary** node.
@@ -213,7 +213,7 @@ follow these steps to avoid unnecessary data loss:
- If you do not have SSH access to the **primary** node, take the machine offline and
prevent it from rebooting. Since there are many ways you may prefer to accomplish
- this, we will avoid a single recommendation. You may need to:
+ this, we avoid a single recommendation. You may need to:
- Reconfigure the load balancers.
- Change DNS records (for example, point the **primary** DNS record to the
@@ -248,7 +248,7 @@ issue has been fixed in GitLab 13.4 and later.
WARNING:
If the secondary node [has been paused](../../../geo/index.md#pausing-and-resuming-replication), this performs
a point-in-time recovery to the last known state.
-Data that was created on the primary while the secondary was paused will be lost.
+Data that was created on the primary while the secondary was paused is lost.
1. SSH in to the PostgreSQL node in the **secondary** and promote PostgreSQL separately:
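On an Omnibus-managed PostgreSQL node, the separate promotion is typically done as follows (a sketch; confirm the exact command for your GitLab version):

```shell
sudo gitlab-ctl promote-db
```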
diff --git a/doc/administration/geo/disaster_recovery/runbooks/planned_failover_single_node.md b/doc/administration/geo/disaster_recovery/runbooks/planned_failover_single_node.md
index 6caeddad51a..36c9d46d650 100644
--- a/doc/administration/geo/disaster_recovery/runbooks/planned_failover_single_node.md
+++ b/doc/administration/geo/disaster_recovery/runbooks/planned_failover_single_node.md
@@ -19,7 +19,7 @@ This runbook is in **alpha**. For complete, production-ready documentation, see
| Geo site | Single-node |
| Secondaries | One |
-This runbook will guide you through a planned failover of a single-node Geo site
+This runbook guides you through a planned failover of a single-node Geo site
with one secondary. The following general architecture is assumed:
```mermaid
@@ -34,7 +34,7 @@ graph TD
end
```
-This guide will result in the following:
+This guide results in the following:
1. An offline primary.
1. A promoted secondary that is now the new primary.
@@ -61,7 +61,7 @@ time to complete.
If any objects are failing to replicate, this should be investigated before
scheduling the maintenance window. After a planned failover, anything that
-failed to replicate will be **lost**.
+failed to replicate is **lost**.
You can use the
[Geo status API](../../../../api/geo_nodes.md#retrieve-project-sync-or-verification-failures-that-occurred-on-the-current-node)
@@ -102,10 +102,10 @@ follow these steps to avoid unnecessary data loss:
sudo iptables -A INPUT -p tcp --dport 443 -j REJECT
```
- From this point, users will be unable to view their data or make changes on the
- **primary** node. They will also be unable to log in to the **secondary** node.
- However, existing sessions will work for the remainder of the maintenance period, and
- public data will be accessible throughout.
+ From this point, users are unable to view their data or make changes on the
+ **primary** node. They are also unable to log in to the **secondary** node.
+ However, existing sessions continue to work for the remainder of the maintenance period, and
+ public data remains accessible throughout.
1. Verify the **primary** node is blocked to HTTP traffic by visiting it in a browser from
another IP address. The server should refuse the connection.
@@ -155,8 +155,8 @@ follow these steps to avoid unnecessary data loss:
1. [Run an integrity check](../../../raketasks/check.md) to verify the integrity
of CI artifacts, LFS objects, and uploads in file storage.
- At this point, your **secondary** node will contain an up-to-date copy of everything the
- **primary** node has, meaning nothing will be lost when you fail over.
+ At this point, your **secondary** node contains an up-to-date copy of everything the
+ **primary** node has, meaning nothing is lost when you fail over.
1. In this final step, you need to permanently disable the **primary** node.
@@ -198,7 +198,7 @@ follow these steps to avoid unnecessary data loss:
- If you do not have SSH access to the **primary** node, take the machine offline and
prevent it from rebooting. Since there are many ways you may prefer to accomplish
- this, we will avoid a single recommendation. You may need to:
+ this, we avoid a single recommendation. You may need to:
- Reconfigure the load balancers.
- Change DNS records (for example, point the **primary** DNS record to the
@@ -240,7 +240,7 @@ To promote the secondary node:
1. Run the following command to list out all preflight checks and automatically
check if replication and verification are complete before scheduling a planned
- failover to ensure the process will go smoothly:
+ failover to ensure the process goes smoothly:
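The command referenced here is the preflight check runner, also shown later in this diff:

```shell
sudo gitlab-ctl promotion-preflight-checks
```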
NOTE:
In GitLab 13.7 and earlier, if you have a data type with zero items to sync,
diff --git a/doc/administration/geo/glossary.md b/doc/administration/geo/glossary.md
index 1ec552326aa..f8769d31ec7 100644
--- a/doc/administration/geo/glossary.md
+++ b/doc/administration/geo/glossary.md
@@ -21,7 +21,7 @@ these definitions yet.
| Term | Definition | Scope | Discouraged synonyms |
|---------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|-------------------------------------------------|
-| Node | An individual server that runs GitLab either with a specific role or as a whole (e.g. a Rails application node). In a cloud context this can be a specific machine type. | GitLab | instance, server |
+| Node | An individual server that runs GitLab either with a specific role or as a whole (for example a Rails application node). In a cloud context this can be a specific machine type. | GitLab | instance, server |
| Site | One or a collection of nodes running a single GitLab application. A site can be single-node or multi-node. | GitLab | deployment, installation instance |
| Single-node site | A specific configuration of GitLab that uses exactly one node. | GitLab | single-server, single-instance |
| Multi-node site | A specific configuration of GitLab that uses more than one node. | GitLab | multi-server, multi-instance, high availability |
@@ -31,7 +31,7 @@ these definitions yet.
| Reference architecture(s) | A [specified configuration of GitLab for a number of users](../reference_architectures/index.md), possibly including multiple nodes and multiple sites. | GitLab | |
| Promoting | Changing the role of a site from secondary to primary. | Geo-specific | |
| Demoting | Changing the role of a site from primary to secondary. | Geo-specific | |
-| Failover | The entire process that shifts users from a primary Site to a secondary site. This includes promoting a secondary, but contains other parts as well e.g. scheduling maintenance. | Geo-specific | |
+| Failover | The entire process that shifts users from a primary site to a secondary site. This includes promoting a secondary, but covers other parts as well, for example, scheduling maintenance. | Geo-specific | |
## Examples
diff --git a/doc/administration/geo/replication/configuration.md b/doc/administration/geo/replication/configuration.md
index 926c4c565aa..e8ffa1ae91a 100644
--- a/doc/administration/geo/replication/configuration.md
+++ b/doc/administration/geo/replication/configuration.md
@@ -196,9 +196,9 @@ keys must be manually replicated to the **secondary** node.
gitlab-ctl reconfigure
```
-1. On the top bar, select **Menu >** **{admin}** **Admin**.
+1. On the top bar of the primary node, select **Menu >** **{admin}** **Admin**.
1. On the left sidebar, select **Geo > Nodes**.
-1. Select **New node**.
+1. Select **Add site**.
![Add secondary node](img/adding_a_secondary_node_v13_3.png)
1. Fill in **Name** with the `gitlab_rails['geo_node_name']` in
`/etc/gitlab/gitlab.rb`. These values must always match *exactly*, character
diff --git a/doc/administration/geo/replication/datatypes.md b/doc/administration/geo/replication/datatypes.md
index 6989765dbad..a56d9dc813c 100644
--- a/doc/administration/geo/replication/datatypes.md
+++ b/doc/administration/geo/replication/datatypes.md
@@ -209,6 +209,6 @@ successfully, you must replicate their data using some other means.
#### Limitation of verification for files in Object Storage
-GitLab managed Object Storage replication support [is in beta](object_storage.md#enabling-gitlab-managed-object-storage-replication).
+GitLab managed Object Storage replication support [is in beta](object_storage.md#enabling-gitlab-managed-object-storage-replication).
Locally stored files are verified but remote stored files are not.
diff --git a/doc/administration/geo/replication/docker_registry.md b/doc/administration/geo/replication/docker_registry.md
index cc0719442a1..5cc4f66017b 100644
--- a/doc/administration/geo/replication/docker_registry.md
+++ b/doc/administration/geo/replication/docker_registry.md
@@ -53,7 +53,7 @@ We need to make Docker Registry send notification events to the
registry['notifications'] = [
{
'name' => 'geo_event',
- 'url' => 'https://example.com/api/v4/container_registry_event/events',
+ 'url' => 'https://<example.com>/api/v4/container_registry_event/events',
'timeout' => '500ms',
'threshold' => 5,
'backoff' => '1s',
@@ -65,7 +65,8 @@ We need to make Docker Registry send notification events to the
```
NOTE:
- Replace `<replace_with_a_secret_token>` with a case sensitive alphanumeric string
+ Replace `<example.com>` with the `external_url` defined in your primary site's `/etc/gitlab/gitlab.rb` file, and
+ replace `<replace_with_a_secret_token>` with a case sensitive alphanumeric string
that starts with a letter. You can generate one with `< /dev/urandom tr -dc _A-Z-a-z-0-9 | head -c 32 | sed "s/^[0-9]*//"; echo`
NOTE:
@@ -109,11 +110,14 @@ For each application and Sidekiq node on the **secondary** site:
1. Copy `/var/opt/gitlab/gitlab-rails/etc/gitlab-registry.key` from the **primary** to the node.
-1. Edit `/etc/gitlab/gitlab.rb`:
+1. Edit `/etc/gitlab/gitlab.rb` and add:
```ruby
gitlab_rails['geo_registry_replication_enabled'] = true
- gitlab_rails['geo_registry_replication_primary_api_url'] = 'https://primary.example.com:5050/' # Primary registry address, it will be used by the secondary node to directly communicate to primary registry
+
+ # Primary registry's hostname and port. It is used by
+ # the secondary node to communicate directly with the primary registry.
+ gitlab_rails['geo_registry_replication_primary_api_url'] = 'https://primary.example.com:5050/'
```
1. Reconfigure the node for the change to take effect:
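As with the other Omnibus changes in this guide, that means:

```shell
sudo gitlab-ctl reconfigure
```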
diff --git a/doc/administration/geo/replication/faq.md b/doc/administration/geo/replication/faq.md
index ef41b2ff172..28030dccb3b 100644
--- a/doc/administration/geo/replication/faq.md
+++ b/doc/administration/geo/replication/faq.md
@@ -23,7 +23,7 @@ For each project to sync:
1. Geo issues a `git fetch geo --mirror` to get the latest information from the **primary** site.
If there are no changes, the sync is fast. Otherwise, it has to pull the latest commits.
-1. The **secondary** site updates the tracking database to store the fact that it has synced projects A, B, C, etc.
+1. The **secondary** site updates the tracking database to store the fact that it has synced projects A, B, C, and so on.
1. Repeat until all projects are synced.
When someone pushes a commit to the **primary** site, it generates an event in the GitLab database that the repository has changed.
@@ -46,8 +46,8 @@ Read the documentation for [Disaster Recovery](../disaster_recovery/index.md).
## What data is replicated to a **secondary** site?
We currently replicate project repositories, LFS objects, generated
-attachments / avatars and the whole database. This means user accounts,
-issues, merge requests, groups, project data, etc., will be available for
+attachments and avatars, and the whole database. This means user accounts,
+issues, merge requests, groups, project data, and so on, are available for
query.
## Can I `git push` to a **secondary** site?
@@ -58,7 +58,7 @@ Yes! Pushing directly to a **secondary** site (for both HTTP and SSH, including
All replication operations are asynchronous and are queued to be dispatched. Therefore, it depends on a lot of
factors including the amount of traffic, how big your commit is, the
-connectivity between your sites, your hardware, etc.
+connectivity between your sites, your hardware, and so on.
## What if the SSH server runs at a different port?
diff --git a/doc/administration/geo/replication/location_aware_git_url.md b/doc/administration/geo/replication/location_aware_git_url.md
index 014ca59e571..a80c293149e 100644
--- a/doc/administration/geo/replication/location_aware_git_url.md
+++ b/doc/administration/geo/replication/location_aware_git_url.md
@@ -88,7 +88,7 @@ routing configurations.
![Created policy record](img/single_git_created_policy_record.png)
-You have successfully set up a single host, e.g. `git.example.com` which
+You have successfully set up a single host, for example, `git.example.com` which
distributes traffic to your Geo sites by geolocation!
## Configure Git clone URLs to use the special Git URL
diff --git a/doc/administration/geo/replication/remove_geo_node.md b/doc/administration/geo/replication/remove_geo_node.md
deleted file mode 100644
index b72cd3cbb95..00000000000
--- a/doc/administration/geo/replication/remove_geo_node.md
+++ /dev/null
@@ -1,9 +0,0 @@
----
-redirect_to: '../../geo/replication/remove_geo_site.md'
-remove_date: '2021-06-01'
----
-
-This document was moved to [another location](../../geo/replication/remove_geo_site.md).
-
-<!-- This redirect file can be deleted after 2021-06-01 -->
-<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
diff --git a/doc/administration/geo/replication/security_review.md b/doc/administration/geo/replication/security_review.md
index ae41599311b..966902a3d74 100644
--- a/doc/administration/geo/replication/security_review.md
+++ b/doc/administration/geo/replication/security_review.md
@@ -60,7 +60,7 @@ from [owasp.org](https://owasp.org/).
access), but is constrained to read-only activities. The principal use case is
envisioned to be cloning Git repositories from the **secondary** site in favor of the
**primary** site, but end-users may use the GitLab web interface to view projects,
- issues, merge requests, snippets, etc.
+ issues, merge requests, snippets, and so on.
### What security expectations do the end‐users have?
@@ -203,7 +203,7 @@ from [owasp.org](https://owasp.org/).
### What data entry paths does the application support?
- Data is entered via the web application exposed by GitLab itself. Some data is
- also entered using system administration commands on the GitLab servers (e.g.,
+ also entered using system administration commands on the GitLab servers (for example,
`gitlab-ctl set-primary-node`).
- **Secondary** sites also receive inputs via PostgreSQL streaming replication from the **primary** site.
@@ -247,7 +247,7 @@ from [owasp.org](https://owasp.org/).
### What encryption requirements have been defined for data in transit - including transmission over WAN, LAN, SecureFTP, or publicly accessible protocols such as http: and https:?
- Data must have the option to be encrypted in transit, and be secure against
- both passive and active attack (e.g., MITM attacks should not be possible).
+ both passive and active attack (for example, MITM attacks should not be possible).
## Access
diff --git a/doc/administration/geo/replication/troubleshooting.md b/doc/administration/geo/replication/troubleshooting.md
index c00f523957c..d63e927627a 100644
--- a/doc/administration/geo/replication/troubleshooting.md
+++ b/doc/administration/geo/replication/troubleshooting.md
@@ -327,7 +327,7 @@ Slots where `active` is `f` are not active.
- When this slot should be active, because you have a **secondary** node configured using that slot,
log in to that **secondary** node and check the PostgreSQL logs to see why the replication is not running.
-- If you are no longer using the slot (e.g. you no longer have Geo enabled), you can remove it with in the
+- If you are no longer using the slot (for example, you no longer have Geo enabled), you can remove it in the
PostgreSQL console session:
```sql
SELECT pg_drop_replication_slot('<name_of_extra_slot>');
```
@@ -378,7 +378,7 @@ This happens on wrongly-formatted addresses in `postgresql['md5_auth_cidr_addres
```
To fix this, update the IP addresses in `/etc/gitlab/gitlab.rb` under `postgresql['md5_auth_cidr_addresses']`
-to respect the CIDR format (i.e. `1.2.3.4/32`).
+to respect the CIDR format (that is, `1.2.3.4/32`).
### Message: `LOG: invalid IP mask "md5": Name or service not known`
@@ -390,7 +390,7 @@ This happens when you have added IP addresses without a subnet mask in `postgres
```
To fix this, add the subnet mask in `/etc/gitlab/gitlab.rb` under `postgresql['md5_auth_cidr_addresses']`
-to respect the CIDR format (i.e. `1.2.3.4/32`).
+to respect the CIDR format (that is, `1.2.3.4/32`).
### Message: `Found data in the gitlabhq_production database!` when running `gitlab-ctl replicate-geo-database`
@@ -588,6 +588,75 @@ to start again from scratch, there are a few steps that can help you:
gitlab-ctl start
```
+### Design repository failures on mirrored projects and project imports
+
+On the top bar, under **Menu >** **{admin}** **Admin > Geo > Nodes**,
+if the Design repositories progress bar shows
+`Synced` and `Failed` greater than 100%, and negative `Queued`, then the instance
+is likely affected by
+[a bug in GitLab 13.2 and 13.3](https://gitlab.com/gitlab-org/gitlab/-/issues/241668).
+It was [fixed in 13.4+](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/40643).
+
+To determine the actual replication status of design repositories in
+a [Rails console](../../operations/rails_console.md):
+
+```ruby
+secondary = Gitlab::Geo.current_node
+counts = {}
+secondary.designs.select("projects.id").find_each do |p|
+ registry = Geo::DesignRegistry.find_by(project_id: p.id)
+ state = registry ? "#{registry.state}" : "registry does not exist yet"
+ # puts "Design ID##{p.id}: #{state}" # uncomment this for granular information
+ counts[state] ||= 0
+ counts[state] += 1
+end
+puts "\nCounts:", counts
+```
+
+Example output:
+
+```plaintext
+Design ID#5: started
+Design ID#6: synced
+Design ID#7: failed
+Design ID#8: pending
+Design ID#9: synced
+
+Counts:
+{"started"=>1, "synced"=>2, "failed"=>1, "pending"=>1}
+```
+
+Example output if there are actually zero design repository replication failures:
+
+```plaintext
+Design ID#5: synced
+Design ID#6: synced
+Design ID#7: synced
+
+Counts:
+{"synced"=>3}
+```
+
+#### If you are promoting a Geo secondary site running on a single server
+
+`gitlab-ctl promotion-preflight-checks` fails due to the existence of
+`failed` rows in the `geo_design_registry` table. Use the
+[previous snippet](#design-repository-failures-on-mirrored-projects-and-project-imports) to
+determine the actual replication status of Design repositories.
+
+`gitlab-ctl promote-to-primary-node` fails since it runs preflight checks.
+If the [previous snippet](#design-repository-failures-on-mirrored-projects-and-project-imports)
+shows that all designs are synced, then you can use the
+`--skip-preflight-checks` option or the `--force` option to move forward with
+promotion.
+
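Assuming the snippet above reports every design as `synced`, the promotion sketch is:

```shell
# Skip the failing preflight checks only after confirming designs are synced.
sudo gitlab-ctl promote-to-primary-node --skip-preflight-checks
```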
+#### If you are promoting a Geo secondary site running on multiple servers
+
+`gitlab-ctl promotion-preflight-checks` fails due to the existence of
+`failed` rows in the `geo_design_registry` table. Use the
+[previous snippet](#design-repository-failures-on-mirrored-projects-and-project-imports) to
+determine the actual replication status of Design repositories.
+
## Fixing errors during a failover or when promoting a secondary to a primary node
The following are possible errors that might be encountered during failover or
@@ -726,6 +795,7 @@ sudo gitlab-ctl promotion-preflight-checks
sudo /opt/gitlab/embedded/bin/gitlab-pg-ctl promote
sudo gitlab-ctl reconfigure
sudo gitlab-rake geo:set_secondary_as_primary
+```
## Expired artifacts
@@ -794,7 +864,7 @@ PostgreSQL instances:
The most common problems that prevent the database from replicating correctly are:
-- **Secondary** nodes cannot reach the **primary** node. Check credentials, firewall rules, etc.
+- **Secondary** nodes cannot reach the **primary** node. Check credentials, firewall rules, and so on.
- SSL certificate problems. Make sure you copied `/etc/gitlab/gitlab-secrets.json` from the **primary** node.
- Database storage disk is full.
- Database replication slot is misconfigured.
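For the last item, one way to inspect replication slots on the primary's database is the Omnibus `gitlab-psql` wrapper (a sketch):

```shell
# Slots with active = f indicate a secondary that is not streaming.
sudo gitlab-psql -c 'SELECT slot_name, active FROM pg_replication_slots;'
```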
diff --git a/doc/administration/geo/replication/updating_the_geo_nodes.md b/doc/administration/geo/replication/updating_the_geo_nodes.md
index 0c68adf162d..03570048071 100644
--- a/doc/administration/geo/replication/updating_the_geo_nodes.md
+++ b/doc/administration/geo/replication/updating_the_geo_nodes.md
@@ -28,9 +28,9 @@ and all **secondary** nodes:
1. **Optional:** [Pause replication on each **secondary** node.](../index.md#pausing-and-resuming-replication)
1. Log into the **primary** node.
-1. [Update GitLab on the **primary** node using Omnibus's Geo-specific steps](https://docs.gitlab.com/omnibus/update/README.html#geo-deployment).
+1. [Update GitLab on the **primary** node using Omnibus](https://docs.gitlab.com/omnibus/update/#update-using-the-official-repositories).
1. Log into each **secondary** node.
-1. [Update GitLab on each **secondary** node using Omnibus's Geo-specific steps](https://docs.gitlab.com/omnibus/update/README.html#geo-deployment).
+1. [Update GitLab on each **secondary** node using Omnibus](https://docs.gitlab.com/omnibus/update/#update-using-the-official-repositories).
1. If you paused replication in step 1, [resume replication on each **secondary**](../index.md#pausing-and-resuming-replication)
1. [Test](#check-status-after-updating) **primary** and **secondary** nodes, and check version in each.
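For the final check, the usual Geo health tasks can be run on each node (`geo:status` is only meaningful on a secondary):

```shell
# On each node: verify Geo configuration and dependencies.
sudo gitlab-rake gitlab:geo:check

# On the secondary: review replication and verification status.
sudo gitlab-rake geo:status
```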
diff --git a/doc/administration/geo/replication/usage.md b/doc/administration/geo/replication/usage.md
index 1491aa3427e..7fe8eec467e 100644
--- a/doc/administration/geo/replication/usage.md
+++ b/doc/administration/geo/replication/usage.md
@@ -27,7 +27,7 @@ Everything up-to-date
```
NOTE:
-If you're using HTTPS instead of [SSH](../../../ssh/README.md) to push to the secondary,
+If you're using HTTPS instead of [SSH](../../../ssh/index.md) to push to the secondary,
you can't store credentials in the URL like `user:password@URL`. Instead, you can use a
[`.netrc` file](https://www.gnu.org/software/inetutils/manual/html_node/The-_002enetrc-file.html)
for Unix-like operating systems or `_netrc` for Windows. In that case, the credentials
diff --git a/doc/administration/geo/replication/version_specific_updates.md b/doc/administration/geo/replication/version_specific_updates.md
index 301be931b29..e193fc630b9 100644
--- a/doc/administration/geo/replication/version_specific_updates.md
+++ b/doc/administration/geo/replication/version_specific_updates.md
@@ -11,16 +11,35 @@ Review this page for update instructions for your version. These steps
accompany the [general steps](updating_the_geo_nodes.md#general-update-steps)
for updating Geo nodes.
+## Updating to GitLab 13.12
+
+We found an issue where [secondary nodes re-download all LFS files](https://gitlab.com/gitlab-org/gitlab/-/issues/334550) upon update. This bug:
+
+- Only applies to Geo secondary sites that have replicated LFS objects.
+- Is _not_ a data loss risk.
+- Causes churn and wasted bandwidth re-downloading all LFS objects.
+- May impact performance for GitLab installations with a large number of LFS files.
+
+If you don't have many LFS objects or can stand a bit of churn, then it is safe to let the secondary sites re-download LFS objects.
+If you do have many LFS objects, or many Geo secondary sites, or limited bandwidth, or a combination of them all, then we recommend you skip GitLab 13.12.0 through 13.12.6 and update to GitLab 13.12.7 or newer.
+
+### If you have already updated to an affected version, and the re-sync is ongoing
+
+You can manually migrate the legacy sync state to the new state column by running the following command in a [Rails console](../../operations/rails_console.md). It should take under a minute:
+
+```ruby
+Geo::LfsObjectRegistry.where(state: 0, success: true).update_all(state: 2)
+```
+
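To estimate how many registry rows the command above would touch before running it, a read-only count along the same lines (a sketch using the model named above) is:

```shell
sudo gitlab-rails runner 'puts Geo::LfsObjectRegistry.where(state: 0, success: true).count'
```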
## Updating to GitLab 13.11
-We found an [issue with Git clone/pull through HTTP(s)](https://gitlab.com/gitlab-org/gitlab/-/issues/330787) on Geo secondaries and on any GitLab instance if maintenance mode is enabled. This was caused by a regression in GitLab Workhorse. This is fixed in the [GitLab 13.11.4 patch release](https://about.gitlab.com/releases/2021/05/14/gitlab-13-11-4-released/). To avoid this issue, upgrade to GitLab 13.11.4 or later.
+We found an [issue with Git clone/pull through HTTP(s)](https://gitlab.com/gitlab-org/gitlab/-/issues/330787) on Geo secondaries and on any GitLab instance if maintenance mode is enabled. This was caused by a regression in GitLab Workhorse. This is fixed in the [GitLab 13.11.4 patch release](https://about.gitlab.com/releases/2021/05/14/gitlab-13-11-4-released/). To avoid this issue, upgrade to GitLab 13.11.4 or later.
## Updating to GitLab 13.9
We've detected an issue [with a column rename](https://gitlab.com/gitlab-org/gitlab/-/issues/324160)
-that may prevent upgrades to GitLab 13.9.0, 13.9.1, 13.9.2 and 13.9.3.
-We are working on a patch, but until a fixed version is released, you can manually complete
-the zero-downtime upgrade:
+that prevents upgrades to GitLab 13.9.0, 13.9.1, 13.9.2, and 13.9.3 when following the zero-downtime steps. It is necessary
+to perform the following additional steps for the zero-downtime upgrade:
1. Before running the final `sudo gitlab-rake db:migrate` command on the deploy node,
execute the following queries using the PostgreSQL console (or `sudo gitlab-psql`)
@@ -40,9 +59,18 @@ the zero-downtime upgrade:
```
If you have already run the final `sudo gitlab-rake db:migrate` command on the deploy node and have
-encountered the [column rename issue](https://gitlab.com/gitlab-org/gitlab/-/issues/324160), you can still
-follow the previous steps to complete the update.
+encountered the [column rename issue](https://gitlab.com/gitlab-org/gitlab/-/issues/324160), you will
+see the following error:
+
+```shell
+-- remove_column(:application_settings, :asset_proxy_whitelist)
+rake aborted!
+StandardError: An error has occurred, all later migrations canceled:
+PG::DependentObjectsStillExist: ERROR: cannot drop column asset_proxy_whitelist of table application_settings because other objects depend on it
+DETAIL: trigger trigger_0d588df444c8 on table application_settings depends on column asset_proxy_whitelist of table application_settings
+```
+To work around this bug, follow the previous steps to complete the update.
More details are available [in this issue](https://gitlab.com/gitlab-org/gitlab/-/issues/324160).
## Updating to GitLab 13.7
diff --git a/doc/administration/geo/setup/database.md b/doc/administration/geo/setup/database.md
index f6e72092a5f..bc4128deb4a 100644
--- a/doc/administration/geo/setup/database.md
+++ b/doc/administration/geo/setup/database.md
@@ -31,17 +31,17 @@ A single instance database replication is easier to set up and still provides th
as a clusterized alternative. It's useful for setups running on a single machine
or trying to evaluate Geo for a future clusterized installation.
-A single instance can be expanded to a clusterized version using Patroni, which is recommended for a
+A single instance can be expanded to a clusterized version using Patroni, which is recommended for a
highly available architecture.
-Follow below the instructions on how to set up PostgreSQL replication as a single instance database.
-Alternatively, you can look at the [Multi-node database replication](#multi-node-database-replication)
+Follow the instructions below on how to set up PostgreSQL replication as a single instance database.
+Alternatively, you can look at the [Multi-node database replication](#multi-node-database-replication)
instructions on setting up replication with a Patroni cluster.
### PostgreSQL replication
The GitLab **primary** node where the write operations happen connects to
-the **primary** database server, and **secondary** nodes
+the **primary** database server, and **secondary** nodes
connect to their own database servers (which are also read-only).
We recommend using [PostgreSQL replication slots](https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75)
@@ -112,13 +112,13 @@ There is an [issue where support is being discussed](https://gitlab.com/gitlab-o
# must be present in all application nodes.
gitlab_rails['db_password'] = '<your_password_here>'
```
-
+
1. Define a password for the database [replication user](https://wiki.postgresql.org/wiki/Streaming_Replication).
We will use the username defined in `/etc/gitlab/gitlab.rb` under the `postgresql['sql_replication_user']`
- setting. The default value is `gitlab_replicator`, but if you changed it to something else, adapt
+ setting. The default value is `gitlab_replicator`, but if you changed it to something else, adapt
the instructions below.
-
+
Generate an MD5 hash of the desired password:
```shell
gitlab-ctl pg-password-md5 gitlab_replicator
```
@@ -462,10 +462,10 @@ data before running `pg_basebackup`.
- If PostgreSQL is listening on a non-standard port, add `--port=` as well.
- If your database is too large to be transferred in 30 minutes, you need
- to increase the timeout, e.g., `--backup-timeout=3600` if you expect the
+ to increase the timeout, for example, `--backup-timeout=3600` if you expect the
initial replication to take under an hour.
- Pass `--sslmode=disable` to skip PostgreSQL TLS authentication altogether
- (e.g., you know the network path is secure, or you are using a site-to-site
+ (for example, you know the network path is secure, or you are using a site-to-site
VPN). This is **not** safe over the public Internet!
- You can read more details about each `sslmode` in the
[PostgreSQL documentation](https://www.postgresql.org/docs/12/libpq-ssl.html#LIBPQ-SSL-PROTECTION);
@@ -484,12 +484,12 @@ The replication process is now complete.
### PgBouncer support (optional)
[PgBouncer](https://www.pgbouncer.org/) may be used with GitLab Geo to pool
-PostgreSQL connections, which can improve performance even when using in a
-single instance installation.
+PostgreSQL connections, which can improve performance even in a
+single instance installation.
-We recommend using PgBouncer if you use GitLab in a highly available
-configuration with a cluster of nodes supporting a Geo **primary** site and
-two other clusters of nodes supporting a Geo **secondary** site. One for the
+We recommend using PgBouncer if you use GitLab in a highly available
+configuration with a cluster of nodes supporting a Geo **primary** site and
+two other clusters of nodes supporting a Geo **secondary** site: one for the
main database and the other for the tracking database. For more information,
see [High Availability with Omnibus GitLab](../../postgresql/replication_and_failover.md).
@@ -505,7 +505,7 @@ If you still haven't [migrated from repmgr to Patroni](#migrating-from-repmgr-to
Patroni is the official replication management solution for Geo. It
can be used to build a highly available cluster on the **primary** and a **secondary** Geo site.
-Using Patroni on a **secondary** site is optional and you don't have to use the same amount of
+Using Patroni on a **secondary** site is optional and you don't have to use the same amount of
nodes on each Geo site.
For instructions about how to set up Patroni on the primary site, see the
@@ -515,13 +515,21 @@ For instructions about how to set up Patroni on the primary site, see the
In a Geo secondary site, the main PostgreSQL database is a read-only replica of the primary site’s PostgreSQL database.
-If you are currently using `repmgr` on your Geo primary site, see [these instructions](#migrating-from-repmgr-to-patroni) for migrating from `repmgr` to Patroni.
+If you are currently using `repmgr` on your Geo primary site, see [these instructions](#migrating-from-repmgr-to-patroni)
+for migrating from `repmgr` to Patroni.
+
+A production-ready and secure setup requires at least:
+
+- 3 Consul nodes _(primary and secondary sites)_
+- 2 Patroni nodes _(primary and secondary sites)_
+- 1 PgBouncer node _(primary and secondary sites)_
+- 1 internal load-balancer _(primary site only)_
+
+The internal load balancer provides a single endpoint for connecting to the Patroni cluster's leader whenever a new leader is
+elected, and it is required for enabling cascading replication from the secondary sites.
-A production-ready and secure setup requires at least three Consul nodes, three
-Patroni nodes, one internal load-balancing node on the primary site, and a similar
-configuration for the secondary site. The internal load balancer provides a single
-endpoint for connecting to the Patroni cluster's leader whenever a new leader is
-elected. Be sure to use [password credentials](../../postgresql/replication_and_failover.md#database-authorization-for-patroni) and other database best practices.
+Be sure to use [password credentials](../../postgresql/replication_and_failover.md#database-authorization-for-patroni)
+and other database best practices.
##### Step 1. Configure Patroni permanent replication slot on the primary site
@@ -542,12 +550,12 @@ Leader instance**:
```ruby
roles(['patroni_role'])
-
+
consul['services'] = %w(postgresql)
consul['configuration'] = {
retry_join: %w[CONSUL_PRIMARY1_IP CONSUL_PRIMARY2_IP CONSUL_PRIMARY3_IP]
}
-
+
# You need one entry for each secondary, with a unique name following PostgreSQL slot_name constraints:
#
# Configuration syntax is: 'unique_slotname' => { 'type' => 'physical' },
@@ -559,6 +567,8 @@ Leader instance**:
patroni['use_pg_rewind'] = true
patroni['postgresql']['max_wal_senders'] = 8 # Use double of the amount of patroni/reserved slots (3 patronis + 1 reserved slot for a Geo secondary).
patroni['postgresql']['max_replication_slots'] = 8 # Use double of the amount of patroni/reserved slots (3 patronis + 1 reserved slot for a Geo secondary).
+ patroni['username'] = 'PATRONI_API_USERNAME'
+ patroni['password'] = 'PATRONI_API_PASSWORD'
patroni['replication_password'] = 'PLAIN_TEXT_POSTGRESQL_REPLICATION_PASSWORD'
# We list all secondary instances as they can all become a Standby Leader
@@ -719,27 +729,41 @@ For each Patroni instance on the secondary site:
patroni['standby_cluster']['host'] = 'INTERNAL_LOAD_BALANCER_PRIMARY_IP'
patroni['standby_cluster']['port'] = INTERNAL_LOAD_BALANCER_PRIMARY_PORT
patroni['standby_cluster']['primary_slot_name'] = 'geo_secondary' # Or the unique replication slot name you set up before
+ patroni['username'] = 'PATRONI_API_USERNAME'
+ patroni['password'] = 'PATRONI_API_PASSWORD'
patroni['replication_password'] = 'PLAIN_TEXT_POSTGRESQL_REPLICATION_PASSWORD'
patroni['use_pg_rewind'] = true
patroni['postgresql']['max_wal_senders'] = 5 # A minimum of three for one replica, plus two for each additional replica
patroni['postgresql']['max_replication_slots'] = 5 # A minimum of three for one replica, plus two for each additional replica
-
+
postgresql['pgbouncer_user_password'] = 'PGBOUNCER_PASSWORD_HASH'
postgresql['sql_replication_password'] = 'POSTGRESQL_REPLICATION_PASSWORD_HASH'
postgresql['sql_user_password'] = 'POSTGRESQL_PASSWORD_HASH'
postgresql['listen_address'] = '0.0.0.0' # You can use a public or VPC address here instead
-
+
gitlab_rails['dbpassword'] = 'POSTGRESQL_PASSWORD'
gitlab_rails['enable'] = true
gitlab_rails['auto_migrate'] = false
```
1. Reconfigure GitLab for the changes to take effect.
- This is required to bootstrap PostgreSQL users and settings:
+ This is required to bootstrap PostgreSQL users and settings.
- ```shell
- gitlab-ctl reconfigure
- ```
+ - If this is a fresh installation of Patroni:
+
+ ```shell
+ gitlab-ctl reconfigure
+ ```
+
+ - If you are configuring a Patroni standby cluster on a site that previously had a working Patroni cluster:
+
+ ```shell
+ gitlab-ctl stop patroni
+ rm -rf /var/opt/gitlab/postgresql/data
+ /opt/gitlab/embedded/bin/patronictl -c /var/opt/gitlab/patroni/patroni.yaml remove postgresql-ha
+ gitlab-ctl reconfigure
+ gitlab-ctl start patroni
+ ```
### Migrating from repmgr to Patroni
@@ -769,17 +793,17 @@ by following the same instructions above.
Secondary sites use a separate PostgreSQL installation as a tracking database to
keep track of replication status and automatically recover from potential replication issues.
Omnibus automatically configures a tracking database when `roles(['geo_secondary_role'])` is set.
-If you want to run this database in a highly available configuration, follow the instructions below.
-A production-ready and secure setup requires at least three Consul nodes, three
-Patroni nodes on the secondary site secondary site. Be sure to use [password credentials](../../postgresql/replication_and_failover.md#database-authorization-for-patroni) and other database best practices.
+If you want to run this database in a highly available configuration, don't use the `geo_secondary_role` above.
+Instead, follow the instructions below.
-#### Step 1. Configure a PgBouncer node on the secondary site
+A production-ready and secure setup requires at least three Consul nodes, two
+Patroni nodes, and one PgBouncer node on the secondary site.
-A production-ready and highly available configuration requires at least
-three Consul nodes, three PgBouncer nodes, and one internal load-balancing node.
-The internal load balancer provides a single endpoint for connecting to the
-PgBouncer cluster. For more information, see [High Availability with Omnibus GitLab](../../postgresql/replication_and_failover.md).
+Be sure to use [password credentials](../../postgresql/replication_and_failover.md#database-authorization-for-patroni)
+and other database best practices.
+
+#### Step 1. Configure a PgBouncer node on the secondary site
Follow the minimal configuration for the PgBouncer node for the tracking database:
@@ -880,6 +904,8 @@ For each Patroni instance on the secondary site for the tracking database:
]
# Patroni configuration
+ patroni['username'] = 'PATRONI_API_USERNAME'
+ patroni['password'] = 'PATRONI_API_PASSWORD'
patroni['replication_password'] = 'PLAIN_TEXT_POSTGRESQL_REPLICATION_PASSWORD'
patroni['postgresql']['max_wal_senders'] = 5 # A minimum of three for one replica, plus two for each additional replica
diff --git a/doc/administration/geo/setup/external_database.md b/doc/administration/geo/setup/external_database.md
index 9e187424afa..3ec84f1268b 100644
--- a/doc/administration/geo/setup/external_database.md
+++ b/doc/administration/geo/setup/external_database.md
@@ -57,7 +57,7 @@ developed and tested. We aim to be compatible with most external
To set up an external database, you can either:
-- Set up [streaming replication](https://www.postgresql.org/docs/12/warm-standby.html#STREAMING-REPLICATION-SLOTS) yourself (for example AWS RDS, bare metal not managed by Omnibus, etc.).
+- Set up [streaming replication](https://www.postgresql.org/docs/12/warm-standby.html#STREAMING-REPLICATION-SLOTS) yourself (for example, AWS RDS, bare metal not managed by Omnibus, and so on).
- Perform the Omnibus configuration manually as follows.
#### Leverage your cloud provider's tools to replicate the primary database
@@ -208,8 +208,8 @@ the tracking database on port 5432.
1. Set up PostgreSQL according to the
[database requirements document](../../../install/requirements.md#database).
1. Set up a `gitlab_geo` user with a password of your choice, create the `gitlabhq_geo_production` database, and make the user an owner of the database. You can see an example of this setup in the [installation from source documentation](../../../install/installation.md#6-database).
-1. If you are **not** using a cloud-managed PostgreSQL database, ensure that your secondary
- node can communicate with your tracking database by manually changing the
+1. If you are **not** using a cloud-managed PostgreSQL database, ensure that your secondary
+ node can communicate with your tracking database by manually changing the
`pg_hba.conf` that is associated with your tracking database.
Remember to restart PostgreSQL afterwards for the changes to take effect:
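For example, a minimal `pg_hba.conf` entry for this step might look like the following (a sketch; the CIDR range is hypothetical, and the restart command depends on how your PostgreSQL service is managed):

```plaintext
host    gitlabhq_geo_production    gitlab_geo    10.1.2.0/24    md5
```

```shell
sudo systemctl restart postgresql
```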
diff --git a/doc/administration/geo/setup/index.md b/doc/administration/geo/setup/index.md
index 1afa4360cbc..84dff69ebe7 100644
--- a/doc/administration/geo/setup/index.md
+++ b/doc/administration/geo/setup/index.md
@@ -9,24 +9,24 @@ type: howto
These instructions assume you have a working instance of GitLab. They guide you through:
-1. Making your existing instance the **primary** node.
-1. Adding **secondary** nodes.
+1. Making your existing instance the **primary** site.
+1. Adding **secondary** site(s).
WARNING:
-The steps below should be followed in the order they appear. **Make sure the GitLab version is the same on all nodes.**
+The steps below should be followed in the order they appear. **Make sure the GitLab version is the same on all sites.**
## Using Omnibus GitLab
If you installed GitLab using the Omnibus packages (highly recommended):
-1. [Install GitLab Enterprise Edition](https://about.gitlab.com/install/) on the server that will serve as the **secondary** node. Do not create an account or log in to the new **secondary** node.
-1. [Upload the GitLab License](../../../user/admin_area/license.md) on the **primary** node to unlock Geo. The license must be for [GitLab Premium](https://about.gitlab.com/pricing/) or higher.
+1. [Install GitLab Enterprise Edition](https://about.gitlab.com/install/) on the node(s) that will serve as the **secondary** site. Do not create an account or log in to the new **secondary** site.
+1. [Upload the GitLab License](../../../user/admin_area/license.md) on the **primary** site to unlock Geo. The license must be for [GitLab Premium](https://about.gitlab.com/pricing/) or higher.
1. [Set up the database replication](database.md) (`primary (read-write) <-> secondary (read-only)` topology).
-1. [Configure fast lookup of authorized SSH keys in the database](../../operations/fast_ssh_key_lookup.md). This step is required and needs to be done on **both** the **primary** and **secondary** nodes.
-1. [Configure GitLab](../replication/configuration.md) to set the **primary** and **secondary** nodes.
-1. Optional: [Configure a secondary LDAP server](../../auth/ldap/index.md) for the **secondary** node. See [notes on LDAP](../index.md#ldap).
+1. [Configure fast lookup of authorized SSH keys in the database](../../operations/fast_ssh_key_lookup.md). This step is required and needs to be done on **both** the **primary** and **secondary** site(s).
+1. [Configure GitLab](../replication/configuration.md) to set the **primary** and **secondary** site(s).
+1. Optional: [Configure a secondary LDAP server](../../auth/ldap/index.md) for the **secondary** site(s). See [notes on LDAP](../index.md#ldap).
1. Follow the [Using a Geo Site](../replication/usage.md) guide.
## Post-installation documentation
-After installing GitLab on the **secondary** nodes and performing the initial configuration, see the [following documentation for post-installation information](../index.md#post-installation-documentation).
+After installing GitLab on the **secondary** site(s) and performing the initial configuration, see the [following documentation for post-installation information](../index.md#post-installation-documentation).