field | value | date |
---|---|---|
author | Stan Hu <stanhu@gmail.com> | 2021-09-28 22:39:39 +0300 |
committer | Stan Hu <stanhu@gmail.com> | 2021-09-28 22:39:39 +0300 |
commit | e0e5f87af7533d5d0916bf1f8ea9a0a028ca804a | |
tree | cab914c33be7535b9b969b9bfac6d3f95504670a | |
parent | 8d6bd3101ef8d08cfedf50a65307ca58fa97805f | |
Add note on performance impact of checksums
-rw-r--r-- | doc/rfcs/recovery.md | 9 |
1 files changed, 9 insertions, 0 deletions
diff --git a/doc/rfcs/recovery.md b/doc/rfcs/recovery.md
index 80458653b..2fb0d2620 100644
--- a/doc/rfcs/recovery.md
+++ b/doc/rfcs/recovery.md
@@ -91,6 +91,15 @@ by issuing a `SELECT MAX(generation) FROM storage_repositories`. If
 a `GROUP BY(checksum), COUNT(*)` to find the consistent storage by
 majority vote.
 
+#### Performance impact of checksums
+
+For a large repository, doing a [full recalculation of the
+checksum](https://gitlab.com/gitlab-org/gitlab/-/issues/5196#note_73300281)
+can take several seconds and increases linearly with the number of refs
+in the repository. Because of this, we will have to explore how best to
+cache this information on a Gitaly node (e.g. a local database) and
+ensure that old and new values can be XOR'ed out easily.
+
 #### Limitations of checksums
 
 Checksums only ensure the contents of the Git references are correct,
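
The added paragraph relies on XOR being its own inverse: if a repository checksum is the XOR of one digest per ref, a cached value can be updated by XOR'ing out the old digest and XOR'ing in the new one, without rescanning every ref. The following is a minimal sketch of that idea, not Gitaly's implementation. It assumes each ref is hashed as the SHA-1 of `"<object id> <ref name>"`; that input format and the function names `ref_digest`, `full_checksum`, and `update_checksum` are hypothetical and chosen only for illustration.

```python
import hashlib
from typing import Dict, Optional


def ref_digest(name: str, oid: str) -> int:
    """Hash one ref into an integer (assumed scheme: SHA-1 of "<oid> <name>")."""
    data = f"{oid} {name}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def full_checksum(refs: Dict[str, str]) -> int:
    """Full recalculation: cost grows linearly with the number of refs."""
    checksum = 0
    for name, oid in refs.items():
        checksum ^= ref_digest(name, oid)
    return checksum


def update_checksum(cached: int, name: str,
                    old_oid: Optional[str], new_oid: Optional[str]) -> int:
    """Incremental update: constant cost per changed ref.

    old_oid is None when the ref is created; new_oid is None when it is deleted.
    """
    if old_oid is not None:
        cached ^= ref_digest(name, old_oid)  # XOR out the old value
    if new_oid is not None:
        cached ^= ref_digest(name, new_oid)  # XOR in the new value
    return cached


# Example: move refs/heads/main from one commit to another.
refs = {
    "refs/heads/main": "1" * 40,
    "refs/tags/v1.0": "2" * 40,
}
cached = full_checksum(refs)

cached = update_checksum(cached, "refs/heads/main", "1" * 40, "3" * 40)
refs["refs/heads/main"] = "3" * 40

# The incrementally updated value matches a full recalculation.
assert cached == full_checksum(refs)
```

Because XOR is commutative and associative, the order in which ref updates are applied does not matter, which is what makes a locally cached value practical to maintain. The cache still has to be updated for every ref change, or rebuilt with a full recalculation if an update is missed.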