gitlab.com/gitlab-org/gitaly.git
author Steve Azzopardi <sazzopardi@gitlab.com> 2023-02-27 15:38:37 +0300
committer Steve Azzopardi <sazzopardi@gitlab.com> 2023-03-13 11:09:10 +0300
commit 80ce55b0fb2388d11f0016d0b470a3778b00e057
tree a3ae49e768d1e7e20a78dd21e3b6e4964f06cd98 /doc
parent 65769c7a58d3339fe94a809bf6fd34f2f300a700
Cgroups: add cpu_quota_us limit

What
---

- Add a new configuration under `cgroups` called `cpu_quota_us` to configure
  `cfs_quota_us` for the parent cgroup
  https://docs.kernel.org/scheduler/sched-bwc.html?highlight=cfs_quota_us
- Add a new configuration under `cgroups.repositories` called `cpu_quota_us`
  to configure `cfs_quota_us` for the repository cgroup
  https://docs.kernel.org/scheduler/sched-bwc.html?highlight=cfs_quota_us
- Add metrics
  - `gitaly_cgroup_cpu_cfs_periods_total`: Read from `cpu.stat` nr_periods
    https://docs.kernel.org/scheduler/sched-bwc.html#statistics
  - `gitaly_cgroup_cpu_cfs_throttled_periods_total`: Read from `cpu.stat`
    nr_throttled https://docs.kernel.org/scheduler/sched-bwc.html#statistics
  - `gitaly_cgroup_cpu_cfs_throttled_seconds_total`: Read from `cpu.stat`
    throttled_time https://docs.kernel.org/scheduler/sched-bwc.html#statistics
- Add more test coverage when only specific values are set.

Why
---

At the moment we limit memory, and we limit CPU via
[`cpu.shares`](https://kernel.googlesource.com/pub/scm/linux/kernel/git/glommer/memcg/+/cpu_stat/Documentation/cgroups/cpu.txt),
which only throttles a cgroup when there is contention on the CPU. This means
that a single repository can still hog all of the CPU on a Gitaly node. We've
seen a case of this in
https://gitlab.com/gitlab-com/gl-infra/production/-/issues/8318, where a single
repository saturated the CPU and the scheduler couldn't balance the CPU for
other tasks/requests to be scheduled. We hoped CPU shares would be enough, but
we need an upper CPU quota for Gitaly cgroups so that no single repository can
fully saturate the CPU.

There are a few concerns that are addressed:

Concern 1: `cfs_period_us`

`cfs_period_us` is used to calculate the `cfs_quota_us` (what we are setting
now). The default value seems to be
[hardcoded](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/kernel/sched/fair.c?h=v5.15.92#n5492)
in the Linux kernel, but it can be changed, so Gitaly explicitly sets it to
100ms (the default value).

Concern 2: not using `cfs_burst_us`

This could allow for CPU bursts, even when they exceed the `cfs_quota_us`. We
don't set it because it's only available on newer kernel versions (5.15).
Users can avoid throttling by oversubscribing `cfs_quota_us`.

Concern 3: Wasting available resources

When the user sets these values we'll be artificially limiting the CPU that
Gitaly processes consume. This can leave performance on the table when a
repository is using all of its quota and no other process is using the CPU.
This is the only drawback, and one we are willing to accept since it adds more
reliability in the long run. We can reduce the effect of this by
oversubscribing.

Concern 4: Observability

The kernel already exports
[stats](https://docs.kernel.org/scheduler/sched-bwc.html#statistics), which
Gitaly exposes as metrics, as does
[cadvisor](https://github.com/google/cadvisor/blob/master/docs/storage/prometheus.md#prometheus-container-metrics).

Reference: https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/17332

Changelog: added
Signed-off-by: Steve Azzopardi <sazzopardi@gitlab.com>
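
The three metrics named in the commit message are each read from a field in the cgroup `cpu.stat` file. Below is a minimal Go sketch of how those fields could be parsed. It is not Gitaly's actual implementation: the `cpuStat` type and `parseCPUStat` function are hypothetical names, and the sample input is made up for illustration. It assumes the cgroup v1 format, where `throttled_time` is reported in nanoseconds.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
	"time"
)

// cpuStat holds the CFS bandwidth statistics found in a cgroup v1 cpu.stat
// file. The type and field names are illustrative, not taken from Gitaly.
type cpuStat struct {
	Periods          uint64        // nr_periods
	ThrottledPeriods uint64        // nr_throttled
	ThrottledTime    time.Duration // throttled_time (nanoseconds in the file)
}

// parseCPUStat reads "key value" lines and picks out the three fields that
// the new metrics are based on. Unknown keys are ignored.
func parseCPUStat(content string) (cpuStat, error) {
	var stat cpuStat

	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) != 2 {
			continue // skip blank or malformed lines
		}

		value, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			return cpuStat{}, fmt.Errorf("parsing %q: %w", fields[0], err)
		}

		switch fields[0] {
		case "nr_periods":
			stat.Periods = value
		case "nr_throttled":
			stat.ThrottledPeriods = value
		case "throttled_time":
			stat.ThrottledTime = time.Duration(value)
		}
	}

	return stat, scanner.Err()
}

func main() {
	// Made-up sample content in the shape of a cgroup v1 cpu.stat file.
	sample := "nr_periods 1200\nnr_throttled 300\nthrottled_time 4500000000\n"

	stat, err := parseCPUStat(sample)
	if err != nil {
		panic(err)
	}

	fmt.Printf("periods=%d throttled_periods=%d throttled_seconds=%.1f\n",
		stat.Periods, stat.ThrottledPeriods, stat.ThrottledTime.Seconds())
}
```

A high ratio of throttled periods to total periods would suggest that the configured quota is too low for the workload, which is the situation the commit message proposes to ease by oversubscribing.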
Diffstat (limited to 'doc')
-rw-r--r-- doc/cgroups.md 68
1 files changed, 59 insertions, 9 deletions
diff --git a/doc/cgroups.md b/doc/cgroups.md
index a6c265565..0eb012c8a 100644
--- a/doc/cgroups.md
+++ b/doc/cgroups.md
@@ -17,13 +17,16 @@ mountpoint = "/sys/fs/cgroup"
hierarchy_root = "gitaly"
memory_bytes = 64424509440 # 60gb
cpu_shares = 1024
+cpu_quota_us = 400000
```
**mountpoint** is the top level directory where cgroups will be created.
**hierarchy_root** is the parent cgroup under which Gitaly creates cgroups.
**memory_bytes** limits all processes created by Gitaly to a memory limit,
collectively.
-**cpu_shares** limits all processes created by Gitaly to a cpu limit, collectively
+**cpu_shares** limits all processes created by Gitaly when there are multiple
+processes competing for CPU.
+**cpu_quota_us** hard limit for all processes created by Gitaly.
### Repository Groups
@@ -31,16 +34,20 @@ Cgroups that have a repository-level isolation can also be defined:
```toml
[cgroups.repositories]
-count = 10000
+count = 1000
memory_bytes = 12884901888 # 12gb
cpu_shares = 512
+cpu_quota_us = 200000
```
**count** is the number of cgroups to create.
-**memory_bytes** limits [memory](#memory-limits) for processes within one cgroup.
+**memory_bytes** limits [memory](#memory-limits) for processes in one cgroup.
This number cannot exceed the top level memory limit.
-**cpu_shares** limits [cpu](#cpu-limits) for processes within one cgroup. This
-number cannot exceed the top level cpu limit.
+**cpu_shares** limits [CPU shares](#cpu-shares) for processes in one cgroup when
+there are multiple processes competing for CPU. This number cannot exceed the
+top level CPU limit.
+**cpu_quota_us** hard limit [CPU quota](#cpu-quotas) for all processes created by
+Gitaly. This number cannot exceed the top level CPU quota.
These cgroups will be created when Gitaly starts up. A circular hashing algorithm
is used to assign repositories to cgroups. So when we reach the max number of
@@ -68,11 +75,32 @@ operations can easily take 12gb of ram as seen in production systems.
Hence, we also need finer grained controls to allow certain expensive Git
operations to have their own cgroups.
-## CPU Limits
+## CPU Shares
-Each cgroup has cpu limits as defined by a concept called cpu shares. By
-definition, full usage of a machine's CPU is 1024 shares. Anything lower than
-that will be a fraction of the total CPU resources a machine has access to.
+Each cgroup has a configured share of CPU resources. This setting is used when
+there are multiple processes competing for CPU, otherwise the cgroup has access
+to all CPU resources. For more information, see
+[`cpu.shares`](https://www.redhat.com/sysadmin/cgroups-part-two).
+
+All the CPU resources of a machine are equal to 1024 shares. Lower values are
+a fraction of the total CPU resources.
+
+## CPU Quotas
+
+CPU quota sets
+[`cfs_quota_us`](https://docs.kernel.org/scheduler/sched-bwc.html?highlight=cfs_quota_us#management),
+which is a hard limit of how much CPU a process can use.
+
+The quota specifies a maximum amount of CPU time usable per period of wall
+clock time. The quota refills at the start of each period. Gitaly explicitly
+sets the refill period to match the [kernel default](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/kernel/sched/fair.c?h=v5.15.92#n4807)
+of 100 ms (`cfs_period_us=100000`). So setting the quota (`cfs_quota_us`) to
+the same value `100000` (100 ms) is equivalent to budgeting 1 CPU core for the
+cgroup to use.
+
+This means a 4-core machine has a maximum `cfs_quota_us` of `400000`. If we
+want to use 2 of those cores for Gitaly spawned `git` processes, we would set
+the value of `cfs_quota_us` to `200000`.
## Cgroup Hierarchy
@@ -112,24 +140,46 @@ that will be a fraction of the total CPU resources a machine has access to.
| |--cpu.shares
| |--repos-0
| | |--cpu.shares
+| | |--cpu.cfs_period_us
+| | |--cpu.cfs_quota_us
| |--repos-1
| | |--cpu.shares
+| | |--cpu.cfs_period_us
+| | |--cpu.cfs_quota_us
| |--repos-2
| | |--cpu.shares
+| | |--cpu.cfs_period_us
+| | |--cpu.cfs_quota_us
| |--repos-3
| | |--cpu.shares
+| | |--cpu.cfs_period_us
+| | |--cpu.cfs_quota_us
| |--repos-4
| | |--cpu.shares
+| | |--cpu.cfs_period_us
+| | |--cpu.cfs_quota_us
| |--repos-5
| | |--cpu.shares
+| | |--cpu.cfs_period_us
+| | |--cpu.cfs_quota_us
| |--repos-6
| | |--cpu.shares
+| | |--cpu.cfs_period_us
+| | |--cpu.cfs_quota_us
| |--repos-7
| | |--cpu.shares
+| | |--cpu.cfs_period_us
+| | |--cpu.cfs_quota_us
| |--repos-8
| | |--cpu.shares
+| | |--cpu.cfs_period_us
+| | |--cpu.cfs_quota_us
| |--repos-9
| | |--cpu.shares
+| | |--cpu.cfs_period_us
+| | |--cpu.cfs_quota_us
| |--repos-10
| |--cpu.shares
+| |--cpu.cfs_period_us
+| |--cpu.cfs_quota_us
```
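
The quota arithmetic described in the "CPU Quotas" section of the diff (one core per 100 ms of quota at a 100 ms period) can be sketched in Go as follows. This is an illustrative sketch, not code from Gitaly: `quotaForCores` and `validateQuotas` are hypothetical helpers. The period constant mirrors the `cfs_period_us=100000` value that the documentation says Gitaly sets explicitly, and the validation mirrors the documented rule that the repository-level quota cannot exceed the top-level quota.

```go
package main

import (
	"errors"
	"fmt"
)

// cfsPeriodUs is the refill period in microseconds (100 ms), matching the
// value the documentation says Gitaly sets explicitly.
const cfsPeriodUs = 100000

// quotaForCores converts a CPU core budget into a cfs_quota_us value.
// Fractional budgets are allowed, for example 0.5 for half a core.
func quotaForCores(cores float64) int64 {
	return int64(cores * cfsPeriodUs)
}

// validateQuotas checks the documented constraint that the per-repository
// quota cannot exceed the top-level quota.
func validateQuotas(parentQuotaUs, repoQuotaUs int64) error {
	if repoQuotaUs > parentQuotaUs {
		return errors.New("repository cpu_quota_us exceeds top-level cpu_quota_us")
	}
	return nil
}

func main() {
	parent := quotaForCores(4) // all cores of a 4-core machine
	repo := quotaForCores(2)   // 2 cores per repository cgroup

	fmt.Println("cgroups.cpu_quota_us =", parent)
	fmt.Println("cgroups.repositories.cpu_quota_us =", repo)

	if err := validateQuotas(parent, repo); err != nil {
		fmt.Println("invalid configuration:", err)
	}
}
```

With these inputs the sketch reproduces the `400000` and `200000` values used in the TOML examples in the diff.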