author | John Cai <jcai@gitlab.com> | 2022-04-07 22:29:07 +0300 |
---|---|---|
committer | John Cai <jcai@gitlab.com> | 2022-04-14 23:12:33 +0300 |
commit | 020722161c681dc4f5208b3646d413b9b5b2639a (patch) | |
tree | ce8f1334393dfa0301e4690736833750970b4780 | |
parent | 93153d53f1c77a28ef76ae9c5777ed5477835962 (diff) |
docs: Document Gitaly backpressure (jc-docs-backpressure)
There are a couple of knobs we can turn in Gitaly in terms of
backpressure: the concurrency queue and its limits, and rate limiting. This
change documents both.
-rw-r--r-- | README.md | 1 | ||||
-rw-r--r-- | doc/README.md | 1 | ||||
-rw-r--r-- | doc/backpressure.md | 119 |
3 files changed, 121 insertions, 0 deletions
@@ -170,6 +170,7 @@ For more information on how to set it up, see the [LabKit monitoring docs](https
 - [How to configure backpressure in Gitaly](https://youtu.be/wX9CtFdLYxE)
   An overview of the knobs in the Gitaly config to set limits on incoming traffic.
+  There is also [written documentation](doc/backpressure.md).
 - [How Gitaly fits into GitLab (Youtube)](https://www.youtube.com/playlist?list=PL05JrBw4t0KqoFUiX42JG7BAc7pipMBAy) - a series of 1-hour training videos for contributors new to GitLab and Gitaly.
   - [Part 1: the Gitaly client in gitlab-ce, 2019-02-21](https://www.youtube.com/watch?v=j0HNiKCnLTI&list=PL05JrBw4t0KqoFUiX42JG7BAc7pipMBAy)
diff --git a/doc/README.md b/doc/README.md
index 00951449c..e72f5fae3 100644
--- a/doc/README.md
+++ b/doc/README.md
@@ -39,6 +39,7 @@ For configuration please read [praefects configuration documentation](doc/config
 - [Serverside Git Usage](serverside_git_usage.md)
 - [Object Pools](object_pools.md)
 - [Sidechannel protocol](sidechannel.md)
+- [Backpressure](backpressure.md)

 #### RFCs

diff --git a/doc/backpressure.md b/doc/backpressure.md
new file mode 100644
index 000000000..d2b1b7c1c
--- /dev/null
+++ b/doc/backpressure.md
@@ -0,0 +1,119 @@
+# Request Limiting in Gitaly
+
+## The Problem
+
+In the GitLab ecosystem, Gitaly is the service that is at the bottom of the
+stack as far as Git data access goes. This means that when there is a surge of
+requests to retrieve or change a piece of Git data, the I/O happens in Gitaly.
+This can lead to Gitaly being overloaded due to system resource exhaustion,
+since all roads lead to Gitaly.
+
+## The Solution
+
+If there is a surge of traffic beyond what Gitaly can handle, Gitaly should
+be able to push back on the client calling it instead of subserviently agreeing
+to bite off much more than it can chew.
+
+There are several different knobs we can turn in Gitaly that put a limit on
+different kinds of traffic patterns.
+
+### Concurrency Queue
+
+There is a way to limit the number of concurrent RPCs that are in flight per
+Gitaly node/repository/RPC. This is done through the `[[concurrency]]`
+configuration:
+
+```toml
+[[concurrency]]
+rpc = "/gitaly.SmartHTTPService/PostUploadPackWithSidechannel"
+max_per_repo = 1
+```
+
+Let's say that 1 clone request comes in for repo "A", and "A" is a largish
+repository. While this RPC is executing, another request comes in for repo "A".
+Since `max_per_repo` is 1 in this case, the second request will block until the
+first request is finished.
+
+In this way, an in-memory queue of requests that are waiting their turn can
+build up in Gitaly. Since this is a potential vector for a memory leak, there
+are two other values in the `[[concurrency]]` config to prevent an unbounded
+in-memory queue of requests.
+
+```toml
+[[concurrency]]
+rpc = "/gitaly.SmartHTTPService/PostUploadPackWithSidechannel"
+max_per_repo = 1
+max_queue_wait = "1m"
+max_queue_size = 5
+```
+
+`max_queue_wait` is the maximum amount of time a request can wait in the
+concurrency queue. When a request waits longer than this time, it returns
+to the client with an error.
+
+`max_queue_size` is the maximum size the concurrency queue can grow to for a
+given repository/rpc. If a concurrency queue is at its maximum, subsequent
+requests will return with an error.
+
+### Rate Limiting
+
+Another way to allow Gitaly to put backpressure on its clients is through rate
+limiting. Admins can set a rate limit per repository/rpc:
+
+```toml
+[[rate_limiting]]
+rpc = "/gitaly.RepositoryService/RepackFull"
+interval = "1m"
+burst = 1
+```
+
+The rate limiter is implemented using the concept of a `token bucket`. A `token
+bucket` has capacity `burst` and is refilled at an interval of `interval`. Each
+request that comes into Gitaly takes one token from the `token bucket`. When
+the `token bucket` is empty, no more requests are accepted for that
+repository/rpc until the `token bucket` is refilled. There is one `token bucket`
+per repository/rpc.
+
+In the above configuration, the `token bucket` has a capacity of 1 and gets
+refilled every minute. This means that Gitaly will only accept 1 `RepackFull`
+request per repository each minute.
+
+Requests that come in after the `token bucket` is empty and before it is
+replenished are rejected with an error.
+
+## Errors
+
+With concurrency limiting as well as rate limiting, Gitaly will respond with a
+structured gRPC error of the type `gitalypb.LimitError` with a `Message` field
+that describes the error, and a `BackoffDuration` field that provides
+the client with a time when it is safe to retry. If it is 0, the client should
+never retry.
+
+Gitaly clients (gitlab-shell, workhorse, rails) all need to parse this error and
+return sensible error messages to the end producer whether it be something
+trying to clone via http or ssh, the GitLab application, or something calling
+the API.
+
+## Metrics
+
+There are metrics that provide visibility into how these limits are being
+applied.
+
+**gitaly_requests_dropped_total** - Total number of requests dropped by Gitaly
+due to request limiting. **reason** is a label that indicates why a request was
+dropped.
+ - **rate** indicates the request was dropped because the rate exceeded the
+   configured limit.
+ - **max_size** indicates the request was dropped because the concurrency
+   queue's size was at the configured maximum.
+ - **max_time** indicates the request was dropped because the wait time
+   exceeded the configured maximum.
+
+**gitaly_concurrency_limiting_acquiring_seconds** - How long a request has to
+wait due to concurrency limits before being processed.
+
+**gitaly_concurrency_limiting_in_progress** - How many concurrent requests are
+being processed currently.
+
+**gitaly_concurrency_limiting** - How large the concurrency queue is.
+
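The token bucket behaviour described under "Rate Limiting" in `doc/backpressure.md` can be illustrated with a short sketch. This is not Gitaly's implementation (Gitaly is written in Go); it is a minimal Python model under stated assumptions. The names `TokenBucket`, `RateLimiter`, `allow`, and the injectable `now` clock are hypothetical, and the refill policy (one token earned per `interval`, capped at `burst`) is an assumed reading of "refilled at an interval of `interval`".

```python
import time


class TokenBucket:
    """Hypothetical token bucket: holds at most `burst` tokens and earns
    one token per `interval` seconds (assumed refill policy)."""

    def __init__(self, interval, burst, now=time.monotonic):
        self.interval = float(interval)
        self.burst = burst
        self.now = now              # injectable clock, handy for testing
        self.tokens = float(burst)  # the bucket starts full
        self.last = now()

    def allow(self):
        """Take one token if available; False means the request is rejected."""
        current = self.now()
        # Credit tokens for the time elapsed since the last call, capped at burst.
        earned = (current - self.last) / self.interval
        self.tokens = min(self.burst, self.tokens + earned)
        self.last = current
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """Keeps one bucket per (repository, rpc) pair, as the text describes."""

    def __init__(self, interval, burst, now=time.monotonic):
        self.interval = interval
        self.burst = burst
        self.now = now
        self.buckets = {}

    def allow(self, repo, rpc):
        key = (repo, rpc)
        if key not in self.buckets:
            self.buckets[key] = TokenBucket(self.interval, self.burst, self.now)
        return self.buckets[key].allow()
```

With `interval=60` and `burst=1`, mirroring the `RepackFull` example, a second call to `allow` for the same repository within the same minute returns `False`, while a call for a different repository still succeeds because it draws from its own bucket. In Gitaly the rejected request would surface to the client as the limit error described under "Errors"; the sketch only returns a boolean.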