Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitlab-foss.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
Diffstat (limited to 'doc/administration/geo/disaster_recovery/background_verification.md')
-rw-r--r--doc/administration/geo/disaster_recovery/background_verification.md172
1 files changed, 172 insertions, 0 deletions
diff --git a/doc/administration/geo/disaster_recovery/background_verification.md b/doc/administration/geo/disaster_recovery/background_verification.md
new file mode 100644
index 00000000000..7d2fd51f834
--- /dev/null
+++ b/doc/administration/geo/disaster_recovery/background_verification.md
@@ -0,0 +1,172 @@
+# Automatic background verification **[PREMIUM ONLY]**
+
+NOTE: **Note:**
+Automatic background verification of repositories and wikis was added in
+GitLab EE 10.6 but is enabled by default only on GitLab EE 11.1. You can
+disable or enable this feature manually by following
+[these instructions](#disabling-or-enabling-the-automatic-background-verification).
+
+Automatic background verification ensures that the transferred data matches a
+calculated checksum. If the checksum of the data on the **primary** node matches checksum of the
+data on the **secondary** node, the data transferred successfully. Following a planned failover,
+any corrupted data may be **lost**, depending on the extent of the corruption.
+
+If verification fails on the **primary** node, this indicates that Geo is
+successfully replicating a corrupted object; restore it from backup or remove it
+it from the **primary** node to resolve the issue.
+
+If verification succeeds on the **primary** node but fails on the **secondary** node,
+this indicates that the object was corrupted during the replication process.
+Geo actively try to correct verification failures marking the repository to
+be resynced with a backoff period. If you want to reset the verification for
+these failures, so you should follow [these instructions][reset-verification].
+
+If verification is lagging significantly behind replication, consider giving
+the node more time before scheduling a planned failover.
+
+## Disabling or enabling the automatic background verification
+
+Run the following commands in a Rails console on the **primary** node:
+
+```sh
+# Omnibus GitLab
+gitlab-rails console
+
+# Installation from source
+cd /home/git/gitlab
+sudo -u git -H bin/rails console RAILS_ENV=production
+```
+
+To check if automatic background verification is enabled:
+
+```ruby
+Gitlab::Geo.repository_verification_enabled?
+```
+
+To disable automatic background verification:
+
+```ruby
+Feature.disable('geo_repository_verification')
+```
+
+To enable automatic background verification:
+
+```ruby
+Feature.enable('geo_repository_verification')
+```
+
+## Repository verification
+
+Navigate to the **Admin Area > Geo** dashboard on the **primary** node and expand
+the **Verification information** tab for that node to view automatic checksumming
+status for repositories and wikis. Successes are shown in green, pending work
+in grey, and failures in red.
+
+![Verification status](img/verification-status-primary.png)
+
+Navigate to the **Admin Area > Geo** dashboard on the **secondary** node and expand
+the **Verification information** tab for that node to view automatic verification
+status for repositories and wikis. As with checksumming, successes are shown in
+green, pending work in grey, and failures in red.
+
+![Verification status](img/verification-status-secondary.png)
+
+## Using checksums to compare Geo nodes
+
+To check the health of Geo **secondary** nodes, we use a checksum over the list of
+Git references and their values. The checksum includes `HEAD`, `heads`, `tags`,
+`notes`, and GitLab-specific references to ensure true consistency. If two nodes
+have the same checksum, then they definitely hold the same references. We compute
+the checksum for every node after every update to make sure that they are all
+in sync.
+
+## Repository re-verification
+
+> [Introduced](https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/8550) in GitLab Enterprise Edition 11.6. Available in [GitLab Premium](https://about.gitlab.com/pricing/).
+
+Due to bugs or transient infrastructure failures, it is possible for Git
+repositories to change unexpectedly without being marked for verification.
+Geo constantly reverifies the repositories to ensure the integrity of the
+data. The default and recommended re-verification interval is 7 days, though
+an interval as short as 1 day can be set. Shorter intervals reduce risk but
+increase load and vice versa.
+
+Navigate to the **Admin Area > Geo** dashboard on the **primary** node, and
+click the **Edit** button for the **primary** node to customize the minimum
+re-verification interval:
+
+![Re-verification interval](img/reverification-interval.png)
+
+The automatic background re-verification is enabled by default, but you can
+disable if you need. Run the following commands in a Rails console on the
+**primary** node:
+
+```sh
+# Omnibus GitLab
+gitlab-rails console
+
+# Installation from source
+cd /home/git/gitlab
+sudo -u git -H bin/rails console RAILS_ENV=production
+```
+
+To disable automatic background re-verification:
+
+```ruby
+Feature.disable('geo_repository_reverification')
+```
+
+To enable automatic background re-verification:
+
+```ruby
+Feature.enable('geo_repository_reverification')
+```
+
+## Reset verification for projects where verification has failed
+
+Geo actively try to correct verification failures marking the repository to
+be resynced with a backoff period. If you want to reset them manually, this
+rake task marks projects where verification has failed or the checksum mismatch
+to be resynced without the backoff period:
+
+For repositories:
+
+- Omnibus Installation
+
+ ```sh
+ sudo gitlab-rake geo:verification:repository:reset
+ ```
+
+- Source Installation
+
+ ```sh
+ sudo -u git -H bundle exec rake geo:verification:repository:reset RAILS_ENV=production
+ ```
+
+For wikis:
+
+- Omnibus Installation
+
+ ```sh
+ sudo gitlab-rake geo:verification:wiki:reset
+ ```
+
+- Source Installation
+
+ ```sh
+ sudo -u git -H bundle exec rake geo:verification:wiki:reset RAILS_ENV=production
+ ```
+
+## Current limitations
+
+Until [issue #5064][ee-5064] is completed, background verification doesn't cover
+CI job artifacts and traces, LFS objects, or user uploads in file storage.
+Verify their integrity manually by following [these instructions][foreground-verification]
+on both nodes, and comparing the output between them.
+
+Data in object storage is **not verified**, as the object store is responsible
+for ensuring the integrity of the data.
+
+[reset-verification]: background_verification.md#reset-verification-for-projects-where-verification-has-failed
+[foreground-verification]: ../../raketasks/check.md
+[ee-5064]: https://gitlab.com/gitlab-org/gitlab-ee/issues/5064