Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitlab-foss.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
Diffstat (limited to 'doc/administration/gitaly/configure_gitaly.md')
-rw-r--r--doc/administration/gitaly/configure_gitaly.md132
1 files changed, 132 insertions, 0 deletions
diff --git a/doc/administration/gitaly/configure_gitaly.md b/doc/administration/gitaly/configure_gitaly.md
index e128682af75..8c784232371 100644
--- a/doc/administration/gitaly/configure_gitaly.md
+++ b/doc/administration/gitaly/configure_gitaly.md
@@ -817,6 +817,138 @@ repository. In the example above:
You can observe the behavior of this queue using the Gitaly logs and Prometheus. For more
information, see the [relevant documentation](monitoring.md#monitor-gitaly-concurrency-limiting).
+## Control groups
+
+> - Introduced in GitLab 13.10.
+> - New version of the configuration was introduced in GitLab 15.0.
+
+Gitaly shells out to Git for many of its operations. Git can consume a lot of resources for certain operations,
+especially for large repositories.
+
+Control groups (cgroups) in Linux allow limits to be imposed on how much memory and CPU can be consumed.
+See the [`cgroups` Linux man page](https://man7.org/linux/man-pages/man7/cgroups.7.html) for more information.
+cgroups can be useful for protecting the system against resource exhaustion because of overcomsumption of memory and CPU.
+
+Gitaly has built-in cgroups control. When configured, Gitaly assigns Git
+processes to a cgroup based on the repository the Git command is operating in.
+Each cgroup has a memory and CPU limit. When a cgroup reaches its:
+
+- Memory limit, the kernel looks through the processes for a candidate to kill.
+- CPU limit, processes are not killed, but the processes are prevented from consuming more CPU than allowed.
+
+The main reason to configure cgroups for your GitLab installation is that it
+protects against system resource starvation due to a few large repositories or
+bad actors.
+
+Some Git operations are expensive by nature. `git clone`, for instance,
+spawns a `git-upload-pack` process on the server that can consume a lot of memory
+for large repositories. For example, a client that keeps on cloning a
+large repository over and over again. This situation could potentially use up all of the
+memory on a server, causing other operations to fail for other users.
+
+There are many ways someone can create a repository that can consume large amounts of memory when cloned or downloaded.
+Using cgroups allows the kernel to kill these operations before they hog up all system resources.
+
+### Configure cgroups in Gitaly
+
+How you configure cgroups in Gitaly depends on what version of GitLab you use.
+
+#### GitLab 13.10 to GitLab 14.10
+
+To configure cgroups in Gitaly for GitLab versions 13.10 to 14.10, add `gitaly['cgroups']` to `/etc/gitlab/gitlab.rb`. For
+example:
+
+```ruby
+# in /etc/gitlab/gitlab.rb
+gitaly['cgroups_count'] = 1000
+gitaly['cgroups_mountpoint'] = "/sys/fs/cgroup"
+gitaly['cgroups_hierarchy_root'] = "gitaly"
+gitaly['cgroups_memory_limit'] = 32212254720
+gitaly['cgroups_memory_enabled'] = true
+gitaly['cgroups_cpu_shares'] = 1024
+gitaly['cgroups_cpu_enabled'] = true
+
+```
+
+- `cgroups_count` is the number of cgroups created. Each time a new
+ command is spawned, Gitaly assigns it to one of these cgroups based
+ on the command line arguments of the command. A circular hashing algorithm assigns
+ commands to these cgroups.
+- `cgroups_mountpoint` is where the parent cgroup directory is mounted. Defaults to `/sys/fs/cgroup`.
+- `cgroups_hierarchy_root` is the parent cgroup under which Gitaly creates groups, and
+ is expected to be owned by the user and group Gitaly runs as. Omnibus GitLab
+ creates the set of directories `mountpoint/<cpu|memory>/hierarchy_root`
+ when Gitaly starts.
+- `cgroups_memory_enabled` enables or disables the memory limit on cgroups.
+- `cgroups_memory_bytes` is the total memory limit each cgroup imposes on the processes added to it.
+- `cgroups_cpu_enabled` enables or disables the CPU limit on cgroups.
+- `cgroups_cpu_shares` is the CPU limit each cgroup imposes on the processes added to it. The maximum is 1024 shares,
+ which represents 100% of CPU.
+ which represents 100% of CPU.
+
+#### GitLab 15.0 and later
+
+To configure cgroups in Gitaly for GitLab versions 15.0 and later, add `gitaly['cgroups']` to `/etc/gitlab/gitlab.rb`. For
+example:
+
+```ruby
+# in /etc/gitlab/gitlab.rb
+gitaly['cgroups_mountpoint'] = "/sys/fs/cgroup"
+gitaly['cgroups_hierarchy_root'] =>"gitaly"
+gitaly['cgroups_memory_bytes'] = 64424509440, # 60gb
+gitaly['cgroups_cpu_shares'] = 1024
+gitaly['cgroups_repositories_count'] => 1000,
+gitaly['cgroups_repositories_memory_bytes'] => 32212254720 # 20gb
+gitaly['cgroups_repositories_cpu_shares'] => 512
+```
+
+- `cgroups_mountpoint` is where the parent cgroup directory is mounted. Defaults to `/sys/fs/cgroup`.
+- `cgroups_hierarchy_root` is the parent cgroup under which Gitaly creates groups, and
+ is expected to be owned by the user and group Gitaly runs as. Omnibus GitLab
+ creates the set of directories `mountpoint/<cpu|memory>/hierarchy_root`
+ when Gitaly starts.
+- `cgroups_memory_bytes` is the total memory limit that is imposed collectively on all
+ Git processes that Gitaly spawns. 0 implies no limit.
+- `cgroups_cpu_shares` is the cpu limit that is imposed collectively on all Git
+ processes that Gitaly spawns. 0 implies no limit. The maximum is 1024 shares,
+ which represents 100% of CPU.
+- `cgroups_repositories_count` is the number of cgroups in the cgroups pool. Each time a new Git
+ command is spawned, Gitaly assigns it to one of these cgroups based
+ on the repository the command is for. A circular hashing algorithm assigns
+ Git commands to these cgroups, so a Git command for a repository is
+ always assigned to the same cgroup.
+- `cgroups_repositories_memory_bytes` is the total memory limit that is imposed collectively on all
+ Git processes that Gitaly spawns. 0 implies no limit. This value cannot exceed
+ that of the top level `cgroups_memory_bytes`.
+- `cgroups_repositories_cpu_shares` is the CPU limit that is imposed collectively on all Git
+ processes Gitaly spawns. 0 implies no limit. The maximum is 1024 shares,
+ which represents 100% of CPU. This value cannot exceed that of the top
+ level`cgroups_cpu_shares`.
+
+The difference in the cgroups configuration in GitLab 15.0 and later is that we create a pool of cgroups that are isolated
+based on the repository used in the Git command to be placed under one of these cgroups.
+
+### Configuring oversubscription
+
+In the previous example configuration for GitLab 15.0 and later:
+
+- The top level memory limit is capped at 60gb.
+- Each of the 1000 cgroups in the repositories pool is capped at 20gb.
+
+This is called "oversubscription". Each cgroup in the pool has a much larger capacity than 1/1000th
+of the top-level memory limit.
+
+This strategy has two main benefits:
+
+- It gives the host protection from overall memory starvation (OOM), because the top-level
+ cgroup's memory limit can be set to a threshold smaller than the host's
+ capacity. Processes outside of that cgroup are not at risk of OOM.
+- It allows each individual cgroup in the pool to burst up to a generous upper
+ bound (in this example 20 GB) that is smaller than the parent cgroup's limit,
+ but substantially larger than 1/N of the parent's limit. In this example, up
+ to 3 child cgroups can concurrently burst up to their max. In general, all
+ 1000 cgroups would use much less than the 20 GB.
+
## Background Repository Optimization
Empty directories and unneeded configuration settings may accumulate in a repository and