author    John Cai <jcai@gitlab.com>  2022-04-20 06:30:22 +0300
committer John Cai <jcai@gitlab.com>  2022-04-20 06:30:22 +0300
commit    051d510a384ffa12f02a14ff292cdbf3e141505a (patch)
tree      dc60cce072ba466a83970714718592d641cad5e7
parent    5591e2b54cff1fbfa38d19a3747c18fb847f9b4a (diff)
docs: Document Gitaly backpressure
Gitaly has a number of knobs for tuning the backpressure it can impose on the services that call it. This commit documents them.
 README.md           |   1 +
 doc/README.md       |   1 +
 doc/backpressure.md | 100 +++
 3 files changed, 102 insertions(+), 0 deletions(-)
diff --git a/README.md b/README.md
--- a/README.md
+++ b/README.md
@@ -170,6 +170,7 @@ For more information on how to set it up, see the [LabKit monitoring docs](https
 - [How to configure backpressure in Gitaly](https://youtu.be/wX9CtFdLYxE)
   An overview of the knobs in the Gitaly config to set limits on incoming traffic.
+  There is also [written documentation](doc/backpressure.md).
 - [How Gitaly fits into GitLab (Youtube)](https://www.youtube.com/playlist?list=PL05JrBw4t0KqoFUiX42JG7BAc7pipMBAy) - a series of 1-hour training videos for contributors new to GitLab and Gitaly.
 - [Part 1: the Gitaly client in gitlab-ce, 2019-02-21](https://www.youtube.com/watch?v=j0HNiKCnLTI&list=PL05JrBw4t0KqoFUiX42JG7BAc7pipMBAy)
diff --git a/doc/README.md b/doc/README.md
index 00951449c..e72f5fae3 100644
--- a/doc/README.md
+++ b/doc/README.md
@@ -39,6 +39,7 @@ For configuration please read [praefects configuration documentation](doc/config
 - [Serverside Git Usage](serverside_git_usage.md)
 - [Object Pools](object_pools.md)
 - [Sidechannel protocol](sidechannel.md)
+- [Backpressure](backpressure.md)

 #### RFCs
diff --git a/doc/backpressure.md b/doc/backpressure.md
new file mode 100644
index 000000000..8452d4fd9
--- /dev/null
+++ b/doc/backpressure.md
@@ -0,0 +1,100 @@

# Request limiting in Gitaly

In the GitLab ecosystem, Gitaly is the service at the bottom of the stack for
Git data access: when there is a surge of requests to retrieve or change a
piece of Git data, the I/O happens in Gitaly. Because all Git access goes
through Gitaly, such a surge can overwhelm it through system resource
exhaustion.

When traffic exceeds what Gitaly can handle, Gitaly should be able to push
back on the calling client rather than subserviently agreeing to process more
than it can handle.

Several knobs in Gitaly put limits on different kinds of traffic patterns.
## Concurrency queue

The `[[concurrency]]` configuration limits the number of concurrent RPCs in
flight on each Gitaly node, per RPC per repository:

```toml
[[concurrency]]
rpc = "/gitaly.SmartHTTPService/PostUploadPackWithSidechannel"
max_per_repo = 1
```

For example:

- One clone request comes in for repository "A" (a largish repository).
- While this RPC is executing, another request comes in for repository "A".
  Because `max_per_repo` is 1 in this case, the second request blocks until
  the first request is finished.

An in-memory queue of waiting requests can build up in Gitaly. Because this is
a potential vector for unbounded memory growth, two other values in the
`[[concurrency]]` configuration bound the queue:

- `max_queue_wait` is the maximum amount of time a request can wait in the
  concurrency queue. When a request waits longer than this, an error is
  returned to the client.
- `max_queue_size` is the maximum size the concurrency queue can grow to for a
  given RPC on a repository. When a concurrency queue is at its maximum,
  subsequent requests return an error.

For example:

```toml
[[concurrency]]
rpc = "/gitaly.SmartHTTPService/PostUploadPackWithSidechannel"
max_per_repo = 1
max_queue_wait = "1m"
max_queue_size = 5
```

## Rate limiting

To allow Gitaly to put back pressure on its clients, administrators can set a
rate limit per repository for each RPC:

```toml
[[rate_limiting]]
rpc = "/gitaly.RepositoryService/RepackFull"
interval = "1m"
burst = 1
```

The rate limiter is implemented using the concept of a `token bucket`: a
`token bucket` has capacity `burst` and is refilled every `interval`. When a
request comes into Gitaly, a token is taken from the `token bucket` for that
request.
When the `token bucket` is empty, further requests for that RPC on that
repository are rejected until the `token bucket` is refilled. There is one
`token bucket` per RPC per repository.

In the configuration above, the `token bucket` has a capacity of 1 and is
refilled every minute. This means Gitaly accepts only one `RepackFull` request
per repository each minute.

Requests that come in after the `token bucket` has been emptied (and before it
is replenished) are rejected with an error.

## Errors

With both concurrency limiting and rate limiting, Gitaly responds with a
structured gRPC `gitalypb.LimitError` error that carries:

- A `Message` field that describes the error.
- A `BackoffDuration` field that tells the client when it is safe to retry.
  If 0, the client should not retry.

Gitaly clients (`gitlab-shell`, `workhorse`, Rails) must parse this error and
return a sensible error message to the end user, for example:

- Someone cloning over HTTP or SSH.
- The GitLab application.
- Something calling the API.

## Metrics

Metrics provide visibility into how these limits are being applied. See the
[GitLab documentation](https://docs.gitlab.com/ee/administration/gitaly/#monitor-gitaly-and-gitaly-cluster)
for details.