diff options
author | Jacob Vosmaer <jacob@gitlab.com> | 2019-10-02 13:20:50 +0300 |
---|---|---|
committer | Jacob Vosmaer <jacob@gitlab.com> | 2019-10-02 13:20:50 +0300 |
commit | 863c6b516554a6ca2da3763d308bf24ff8c3f79c (patch) | |
tree | 826cb86dc491b8f6be5a8ff0c9d94731e5a81482 | |
parent | 950b34432b20a1059d29b46801b0c9233f9769dd (diff) | |
parent | 02787fab4f65eea6fc6f778cdc2dfd2c239cff7b (diff) |
Merge branch 'po-diskcache-docs' into 'master'
Disk cache documentation
Closes #1919
See merge request gitlab-org/gitaly!1528
-rw-r--r-- | doc/README.md | 1 | ||||
-rw-r--r-- | doc/design_diskcache.md | 119 |
2 files changed, 120 insertions, 0 deletions
diff --git a/doc/README.md b/doc/README.md index 39e60caf8..f0316287b 100644 --- a/doc/README.md +++ b/doc/README.md @@ -27,6 +27,7 @@ For configuration please read [praefects configuration documentation](doc/config #### Technical explanations - [Delta Islands](delta_islands.md) +- [Disk-based Cache](design_diskcache.md) #### Proposals diff --git a/doc/design_diskcache.md b/doc/design_diskcache.md new file mode 100644 index 000000000..66f6723f9 --- /dev/null +++ b/doc/design_diskcache.md @@ -0,0 +1,119 @@ +# Disk Cache Design + +Gitaly utilizes a disk-based cache for efficiently serving some RPC responses +(at time of writing, only the `SmartHTTPService.InfoRefUploadPack` RPC). This +cache is intended to be used for serving large responses not suitable for a RAM +based cache. + +## Cache Invalidation + +The mechanisms that enable the invalidation of the disk cache for a repo depend +on special annotations made to the Gitaly gRPC methods. Each method that has +scope "repository" and is operation type "mutator" will cause the specified +repository to be invalidated. For more information on the annotation system, +see the Gitaly protobuf definition [contributing guide]. + +[contributing guide]: https://gitlab.com/gitlab-org/gitaly/tree/4c27a7f71ba1d91edbc9d321919620887d6a30d3/proto#rpc-annotations + +## Repository State + +For every repository using the disk cache, a special set of files is maintained +to indicate which cached responses are still valid. These files are stored +in a dedicated **state directory** for each repository: + + ${STATE_DIR} = ${STORAGE_PATH}/+gitaly/state/${REPO_RELATIVE_PATH} + +Before a mutating RPC handler is invoked, a gRPC middleware creates a "lease" +file in the state directory that signifies a mutating operation is in-flight. +These lease files reside at the following path: + + ${STATE_DIR}/pending/${RANDOM_FILENAME} + +Upon the completion of the mutating RPC, the lease file will be removed and +the "latest" file will be updated with a random value to reflect the new +"state" of the repository. + + ${STATE_DIR}/latest + +The contents of latest are used along with several other values to form an +aggregate key that addresses a specific request for a specific repository at a +specific repository state: + +``` + ─────┐ + │ + latest (file contents)│ + RPC request (digest) │ ┌──────┐ + Gitaly version (string) ├─────│SHA256│─────▶ Cache key + RPC Method (string) │ └──────┘ + │ + ─────┘ +``` + +## Cache State Machine + +The repository state files are used to determine whether the repository is in +a deterministic state (i.e. no mutating RPCs in-flight) and how to find the +valid cached responses for the current repository state. The state machine +diagram follows: + +```mermaid +graph TD; + A[Are there lease files?]-->|Yes|B; + A-->|No|C; + B[Are any lease files stale?]-->|Yes|D; + B-->|No|E; + C[Does non-stale latest file exist?]-->|Yes|F; + C-->|No|G; + D[Remove stale lease files]-->A; + E[Mutator RPC In-Flight: Cache state indeterministic] + F[No mutator RPCs In-Flight: Cache state deterministic] + G[Create/Truncate latest file]-->F + + classDef nonfinal fill:#ccf,stroke-width; + classDef final fill:#f9f,stroke-dasharray: 5, 5; + + class A,B,C,D,G nonfinal; + class E,F final; +``` + +Note: There are momentary race conditions where an RPC may become in flight +between the time the lease files are checked and the latest file is inspected, +but this is allowed by the cache design in order to avoid distributed locking. +This means that a stale cached response might be served momentarily, but this +slight delay in fresh responses is a small tradeoff necessary to keep the cache +lockless. The lockless quality is highly desired since Gitaly operates on NFS +servers where file locks are not advisable. + +## Cached Responses + +When the repository is determined to be in a deterministic state (i.e. no +in-flight mutator RPCs), it is safe to cache responses and retrieve cached +responses. The aggregate key digest is used to form a hexadecimal path to the +cached response in this format: + + ${STORAGE_PATH}/+gitaly/cache/${DIGEST:0:2}/${DIGEST:2} + +Note: the first two characters of the digest are used as a subdirectory to +allow the random distribution of the digest algorithm (SHA256) to evenly +distribute the response files. This way, the digest files are evenly +distributed across 256 folders. + +## File Cleanup + +Since the disk cache introduces a number of new filesystem constructs, both +state files and cached responses, there needs to be a way to clean up these +files when the normal processes are not adequate. + +Gitaly runs background workers that periodically remove stale (>1 hour old) +state files and cached responses. Additionally, Gitaly will remove the cached +responses on program start to guard against any chance that the cache +invalidator was not working in a previous run. + +## Relevant Code + +- [Disk cache and lease keyer](internal/cache) +- [gRPC Middleware Cache Invalidator](internal/middleware/cache) +- [gRPC annotations](internal/praefect/protoregistry) +- [InfoRef cache](internal/service/smarthttp/cache.go) + |