Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitlab-foss.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
Diffstat (limited to 'doc/administration/gitaly')
-rw-r--r--doc/administration/gitaly/index.md15
-rw-r--r--doc/administration/gitaly/praefect.md5
-rw-r--r--doc/administration/gitaly/recovery.md4
-rw-r--r--doc/administration/gitaly/troubleshooting.md44
4 files changed, 56 insertions, 12 deletions
diff --git a/doc/administration/gitaly/index.md b/doc/administration/gitaly/index.md
index ea24e901943..f88a1aa2b4d 100644
--- a/doc/administration/gitaly/index.md
+++ b/doc/administration/gitaly/index.md
@@ -13,7 +13,7 @@ It is used by GitLab to read and write Git data.
Gitaly is present in every GitLab installation and coordinates Git repository
storage and retrieval. Gitaly can be:
-- A simple background service operating on a single instance Omnibus GitLab (all of
+- A background service operating on a single instance Omnibus GitLab (all of
GitLab on one machine).
- Separated onto its own instance and configured in a full cluster configuration,
depending on scaling and availability requirements.
@@ -71,7 +71,7 @@ the current status of these issues, please refer to the referenced issues and ep
| Gitaly Cluster + Geo - Issues retrying failed syncs | If Gitaly Cluster is used on a Geo secondary site, repositories that have failed to sync could continue to fail when Geo tries to resync them. Recovering from this state requires assistance from support to run manual steps. Work is in-progress to update Gitaly Cluster to [identify repositories with a unique and persistent identifier](https://gitlab.com/gitlab-org/gitaly/-/issues/3485), which is expected to resolve the issue. | No known solution at this time. |
| Database inconsistencies due to repository access outside of Gitaly Cluster's control | Operations that write to the repository storage that do not go through normal Gitaly Cluster methods can cause database inconsistencies. These can include (but are not limited to) snapshot restoration for cluster node disks, node upgrades which modify files under Git control, or any other disk operation that may touch repository storage external to GitLab. The Gitaly team is actively working to provide manual commands to [reconcile the Praefect database with the repository storage](https://gitlab.com/groups/gitlab-org/-/epics/6723). | Don't directly change repositories on any Gitaly Cluster node at this time. |
| Praefect unable to insert data into the database due to migrations not being applied after an upgrade | If the database is not kept up to date with completed migrations, then the Praefect node is unable to perform normal operation. | Make sure the Praefect database is up and running with all migrations completed (For example: `/opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml sql-migrate-status` should show a list of all applied migrations). Consider [requesting live upgrade assistance](https://about.gitlab.com/support/scheduling-live-upgrade-assistance.html) so your upgrade plan can be reviewed by support. |
-| Restoring a Gitaly Cluster node from a snapshot in a running cluster | Because the Gitaly Cluster runs with consistent state, introducing a single node that is behind will result in the cluster not being able to reconcile the nodes data and other nodes data | Don't restore a single Gitaly Cluster node from a backup snapshot. If you need to restore from backup, it's best to snapshot all Gitaly Cluster nodes at the same time and take a database dump of the Praefect database. |
+| Restoring a Gitaly Cluster node from a snapshot in a running cluster | Because the Gitaly Cluster runs with consistent state, introducing a single node that is behind will result in the cluster not being able to reconcile the nodes data and other nodes data | Don't restore a single Gitaly Cluster node from a backup snapshot. If you must restore from backup, it's best to snapshot all Gitaly Cluster nodes at the same time and take a database dump of the Praefect database. |
### Snapshot backup and recovery limitations
@@ -179,7 +179,7 @@ In this example:
- Repositories are stored on a virtual storage called `storage-1`.
- Three Gitaly nodes provide `storage-1` access: `gitaly-1`, `gitaly-2`, and `gitaly-3`.
- The three Gitaly nodes share data in three separate hashed storage locations.
-- The [replication factor](#replication-factor) is `3`. There are three copies maintained
+- The [replication factor](#replication-factor) is `3`. Three copies are maintained
of each repository.
The availability objectives for Gitaly clusters assuming a single node failure are:
@@ -237,7 +237,7 @@ Engineering support for NFS for Git repositories is deprecated. Technical suppor
is not well suited to Git workloads which are CPU and IOPS sensitive.
Specifically:
-- Git is sensitive to file system latency. Even simple operations require many
+- Git is sensitive to file system latency. Some operations require many
read operations. Operations that are fast on block storage can become an order of
magnitude slower. This significantly impacts GitLab application performance.
- NFS performance optimizations that prevent the performance gap between
@@ -386,8 +386,7 @@ them back to direct Gitaly storage:
1. Create and configure a new
[Gitaly server](configure_gitaly.md#run-gitaly-on-its-own-server).
1. [Move the repositories](../operations/moving_repositories.md#move-repositories)
- to the newly created storage. There are different possibilities to move them
- by shard or by group, this gives you the opportunity to spread them over
+ to the newly created storage. You can move them by shard or by group, which gives you the opportunity to spread them over
multiple Gitaly servers.
## Monitor Gitaly and Gitaly Cluster
@@ -509,7 +508,7 @@ The following metrics are available from the `/metrics` endpoint:
They reflect configuration defined for this instance of Praefect.
- `gitaly_praefect_replication_latency_bucket`, a histogram measuring the amount of time it takes
- for replication to complete once the replication job starts. Available in GitLab 12.10 and later.
+ for replication to complete after the replication job starts. Available in GitLab 12.10 and later.
- `gitaly_praefect_replication_delay_bucket`, a histogram measuring how much time passes between
when the replication job is created and when it starts. Available in GitLab 12.10 and later.
- `gitaly_praefect_node_latency_bucket`, a histogram measuring the latency in Gitaly returning
@@ -542,7 +541,7 @@ You can also monitor the [Praefect logs](../logs.md#praefect-logs).
The following metrics are available from the `/db_metrics` endpoint:
- `gitaly_praefect_unavailable_repositories`, the number of repositories that have no healthy, up to date replicas.
-- `gitaly_praefect_read_only_repositories`, the number of repositories in read-only mode within a virtual storage.
+- `gitaly_praefect_read_only_repositories`, the number of repositories in read-only mode in a virtual storage.
This metric is available for backwards compatibility reasons. `gitaly_praefect_unavailable_repositories` is more
accurate.
- `gitaly_praefect_replication_queue_depth`, the number of jobs in the replication queue.
diff --git a/doc/administration/gitaly/praefect.md b/doc/administration/gitaly/praefect.md
index 8f17835b8a3..d7a06994b6c 100644
--- a/doc/administration/gitaly/praefect.md
+++ b/doc/administration/gitaly/praefect.md
@@ -616,7 +616,7 @@ Note the following:
- IP address, you must add it as a Subject Alternative Name to the certificate.
- When running Praefect sub-commands such as `dial-nodes` and `list-untracked-repositories` from the command line with
[Gitaly TLS enabled](configure_gitaly.md#enable-tls-support), you must set the `SSL_CERT_DIR` or `SSL_CERT_FILE`
- environment variable so that the Gitaly certificate is trusted. For example:
+ environment variable so that the Gitaly certificate is trusted. For example:
```shell
sudo SSL_CERT_DIR=/etc/gitlab/trusted_certs /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dial-nodes
@@ -1196,6 +1196,9 @@ You can configure:
current assignments: gitaly-1, gitaly-2
```
+If `default_replication_factor` is unset, the repositories are always replicated on every node defined in `virtual_storages`. If a new
+node is introduced to the virtual storage, both new and existing repositories are replicated to the node automatically.
+
## Automatic failover and primary election strategies
Praefect regularly checks the health of each Gitaly node. This is used to automatically fail over
diff --git a/doc/administration/gitaly/recovery.md b/doc/administration/gitaly/recovery.md
index 9210c8f08d3..a7166f7e62e 100644
--- a/doc/administration/gitaly/recovery.md
+++ b/doc/administration/gitaly/recovery.md
@@ -76,7 +76,7 @@ The following parameters are available:
some assigned copies that are not available.
NOTE:
-`dataloss` is still in beta and the output format is subject to change.
+`dataloss` is still in [Beta](../../policy/alpha-beta-support.md#beta-features) and the output format is subject to change.
To check for repositories with outdated primaries or for unavailable repositories, run:
@@ -377,7 +377,7 @@ sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.t
The `track-repository` Praefect sub-command adds repositories on disk to the Praefect database to be tracked.
```shell
-sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml track-repository -virtual-storage <virtual-storage> -repository <repository> -replicate-immediately
+sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml track-repository -virtual-storage <virtual-storage> -authoritative-storage <storage-name> -repository <repository> -replicate-immediately
```
- `-virtual-storage` is the virtual storage the repository is located in. Virtual storages are configured in `/etc/gitlab/gitlab.rb` under `praefect['virtual_storages]` and looks like the following:
diff --git a/doc/administration/gitaly/troubleshooting.md b/doc/administration/gitaly/troubleshooting.md
index ff136b44340..516af4ca469 100644
--- a/doc/administration/gitaly/troubleshooting.md
+++ b/doc/administration/gitaly/troubleshooting.md
@@ -554,7 +554,7 @@ Is [some cases](index.md#known-issues) the Praefect database can get out of sync
a given repository is fully synced on all nodes, run the [`gitlab:praefect:replicas` Rake task](../raketasks/praefect.md#replica-checksums)
that checksums the repository on all Gitaly nodes.
-The [Praefect dataloss](recovery.md#check-for-data-loss) command only checks the state of the repo in the Praefect database, and cannot
+The [Praefect dataloss](recovery.md#check-for-data-loss) command only checks the state of the repository in the Praefect database, and cannot
be relied to detect sync problems in this scenario.
### Relation does not exist errors
@@ -610,3 +610,45 @@ Possible solutions:
- Provision larger VMs to gain access to larger network traffic allowances.
- Use your cloud service's monitoring and logging to check that the Praefect nodes are not exhausting their traffic allowances.
+
+## Profiling Gitaly
+
+Gitaly exposes several of Golang's built-in performance profiling tools on the Prometheus listen port. For example, if Prometheus is listening
+on port `9236` of the GitLab server:
+
+- Get a list of running `goroutines` and their backtraces:
+
+ ```shell
+ curl --output goroutines.txt "http://<gitaly_server>:9236/debug/pprof/goroutine?debug=2"
+ ```
+
+- Run a CPU profile for 30 seconds:
+
+ ```shell
+ curl --output cpu.bin "http://<gitaly_server>:9236/debug/pprof/profile"
+ ```
+
+- Profile heap memory usage:
+
+ ```shell
+ curl --output heap.bin "http://<gitaly_server>:9236/debug/pprof/heap"
+ ```
+
+- Record a 5 second execution trace. This will impact Gitaly's performance while running:
+
+ ```shell
+ curl --output trace.bin "http://<gitaly_server>:9236/debug/pprof/trace?seconds=5"
+ ```
+
+On a host with `go` installed, the CPU profile and heap profile can be viewed in a browser:
+
+```shell
+go tool pprof -http=:8001 cpu.bin
+go tool pprof -http=:8001 heap.bin
+```
+
+Execution traces can be viewed by running:
+
+```shell
+go tool trace heap.bin
+```