Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitaly.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-03-19command: Log cgroup pathjc-add-cgroup-cmd-loggingJohn Cai
We'd like to be able to tell if a command was run in a cgroup based on the log entry. This change adds a field in the Command struct so whenever the Cmd finishes, a cgroup field gets logged. Changelog: added
2022-03-18Merge branch 'pks-operations-squash-commit-voting' into 'master'Toon Claes
operations: Fix missing votes on squashed commits Closes #4109 See merge request gitlab-org/gitaly!4417
2022-03-18Merge branch 'pks-housekeeping-track-total-count-of-optimizations' into 'master'John Cai
housekeeping: Track total number of optimization calls See merge request gitlab-org/gitaly!4418
2022-03-18operations: Fix missing votes on squashed commitsPatrick Steinhardt
The UserSquash RPC computes a squashed commit by first rebasing a range of commits on top of another commit, and then collapsing these into a single commit. This RPC is notably different from almost all of our other RPCs because it never writes any references to disk, and neither does it ever execute any access checks as the other User RPCs do. This design is quite weird: - There is a known race window where the new objects are not referenced, so they could be pruned by maintenance calls. - We accept objects into the repository which may not be sanctioned by our access checks. - Replication jobs cannot replicate the squashed commit because they aren't referenced. - We never perform transactional voting because no references are updated. Together these problems show that the RPC call is misdesigned, but fixing this design would require a bigger refactoring to make it work alright in Rails. In this commit we fix the last bullet point though: because this RPC never performs transactional voting, we're always creating replication jobs after the call finishes because Praefect didn't observe any transactional votes. As mentioned though, we don't have any reference to vote on, so the best we can do is vote on the commit ID of the newly written squash commit to make sure that it is the same across nodes. This commit does so by introducing a quarantine directory that is used to stage all new objects first before they're migrated to the final repository. We then vote on the object ID of the staged squash commit. Only if this vote is successful will we successfully commit the object to disk. This commit thus solves two things: first it fixes the missing transactional voting. And second it causes us to discard all objects in case the RPC errors. Changelog: fixed
2022-03-18housekeeping: Fix Prometheus metrics' latency bucketspks-housekeeping-track-total-count-of-optimizationsPatrick Steinhardt
The housekeeping manager is hosting a latency metric which tracks how long specific tasks of the housekeeping manager take. But while latency metrics should typically use the GRPC latency buckets as configured in the Gitaly configuration, we accidentally didn't in the context of the housekeeping manager. Fix this bug by passing in the Prometheus configuration.
2022-03-18housekeeping: Track total number of optimization callsPatrick Steinhardt
While we already track how often we perform specific optimization tasks, we do not have any numbers around how often we perform optimizations in total. As a consequence it's harder than necessary to compute the relative frequency of each task. Fix this by adding another counter which tracks the total count of times optimizations were performed.
2022-03-18helper: Add error helpers for `codes.Aborted`Patrick Steinhardt
Add two new helpers `helper.ErrAborted()` and `helper.ErrAbortedf()` to generate gRPC errors with an `codes.Aborted` error code.
2022-03-17Merge branch 'update-danger-config-2' into 'master'Toon Claes
Update Danger config to use new type & subtype labels See merge request gitlab-org/gitaly!4387
2022-03-16Merge branch 'fix-error-handling-in-GetTreeEntries' into 'master'John Cai
Fix error handling in GetTreeEntries See merge request gitlab-org/gitaly!4414
2022-03-16Fix error handling in GetTreeEntriesHordur Freyr Yngvason
2022-03-16Merge branch 'pks-housekeeping-improved-metrics' into 'master'John Cai
housekeeping: Improve visibility into performed maintenance tasks Closes #4103 See merge request gitlab-org/gitaly!4406
2022-03-16Update changelog for 14.8.4GitLab Release Tools Bot
[ci skip]
2022-03-16Update VERSION filesv14.9.0-rc5GitLab Release Tools Bot
[ci skip]
2022-03-16Merge branch 'pks-git-v2.35.1.gl1-more-fetch-speedups' into 'master'Patrick Steinhardt
Makefile: Add more patches to speed up git-fetch(1) to v2.35.1.gl1 Closes #4104 See merge request gitlab-org/gitaly!4408
2022-03-16Merge branch 'brodock/create-repository-from-url-mirror' into 'master'James Fargher
CreateRepositoryFromURL switch to mirror flag See merge request gitlab-org/gitaly!4378
2022-03-15Repository: allow opting for --mirror for CreateRepositoryFromURLGabriel Mazetto
Changelog: added
2022-03-15Update `gitlab-dangerfiles` to 2.11.0update-danger-config-2Rémy Coutable
Signed-off-by: Rémy Coutable <remy@rymai.me>
2022-03-15Remove Danger rule that's part of `gitlab-dangerfiles`Rémy Coutable
This rule was introduced in https://gitlab.com/gitlab-org/ruby/gems/gitlab-dangerfiles/-/releases/v2.1.0 Signed-off-by: Rémy Coutable <remy@rymai.me>
2022-03-15Update Danger config to use new type & subtype labelsRémy Coutable
Signed-off-by: Rémy Coutable <remy@rymai.me>
2022-03-15Use the changelog rule from `gitlab-dangerfiles`Rémy Coutable
The rule was introduced in https://gitlab.com/gitlab-org/ruby/gems/gitlab-dangerfiles/-/releases/v2.9.0 Signed-off-by: Rémy Coutable <remy@rymai.me>
2022-03-15Use latest `gitlab-dangerfiles` featuresRémy Coutable
Use features introduced in https://gitlab.com/gitlab-org/ruby/gems/gitlab-dangerfiles/-/releases/v2.6.0 Signed-off-by: Rémy Coutable <remy@rymai.me>
2022-03-15Use the shared `danger-review` job definitionRémy Coutable
Signed-off-by: Rémy Coutable <remy@rymai.me>
2022-03-15Merge branch 'pks-housekeeping-prune-with-grace-period' into 'master'John Cai
housekeeping: Don't prune recent objects See merge request gitlab-org/gitaly!4410
2022-03-15housekeeping: Don't prune recent objectsPatrick Steinhardt
When running git-gc(1), Git automatically executes git-prune(1) with a grace period of two weeks: only when objects are older than that grace period will they be pruned. This is really important to fix a race condition with concurrent commands, which all will first write new objects into the repository before making them reachable via a ref update. This means that there is a brief period between writing the new objects and making them reachable. If objects would be pruned without a grace period, it could kick in right before updating the references and thus cause repository corruption. The two-weeks grace period bridges this gap by giving enough head room to finish the transaction. With the recent introduction of the heuristics-based OptimizeRepository RPC we have also started to use git-prune(1). The assumption was that the two-weeks grace period was the default of git-prune(1), but as it turns out it is only the default of git-gc(1). So because we are now running git-prune(1) without any parameters, we accidentally started to delete all unreachable objects, even if they had just been created. Curiously, this bug only started to surface as we were migrating Rails to use OptimizeRepository. Seemingly, because we previously only were executing the RPC as part of our nightly job, the logic didn't trigger in the right moment and thus never caused any (known) problems. Still, this is a serious bug. Fix this issue by passing `--expire=two.weeks.ago` to git-prune(1). Changelog: fixed
2022-03-15Merge branch 'pks-repository-creation-fix-indeterministic-voting' into 'master'John Cai
repository: Fix indeterministic voting when creating new repos Closes #4100 See merge request gitlab-org/gitaly!4402
2022-03-15Merge branch 'pks-user-squash-fix-error-code-with-improved-error-handling' ↵Sami Hiltunen
into 'master' operations: Fix wrong error code when UserSquash conflicts See merge request gitlab-org/gitaly!4403
2022-03-15Makefile: Add more patches to speed up git-fetch(1) to v2.35.1.gl1pks-git-v2.35.1.gl1-more-fetch-speedupsPatrick Steinhardt
The rollout of Git v2.35.1.gl1 is currently blocked in production because we have discovered various repositories where broken references cause errors with the new version. So right now, the new Git version is still disabled in all deployments, also because the feature flag is default-disabled, as well. This blockade gives us a last chance to sneak in some patches for Git v2.35.1.gl1 which have landed in `next` meanwhile. Missing this chance would mean that we'd have to wait a few more releases until we can land another revision, or otherwise we'd be running with three different Git versions in production at the same time: the current Git v2.33.1.gl3, then Git v2.35.1.gl1, and finally the new version we'd be about to introduce. So let's take this chance and slightly bend the rollout process of new Git versions. Apply commits from 60aae8731c (Merge branch 'ps/fetch-mirror-optim' into next, 2022-03-08) into Git v2.35.1.gl1. These patches optimize fetches by: - Patch 1/5 starts to look up up "want" lines via the commit-graph, resulting in a 7% speedup. - Patch 2/5 avoids repeated lookups of commits that were only required when writing to FETCH_HEAD, providing a 8% speedup. - Patches 3-5/5 start to skip reading packed-refs when searching for symbolic refs in the context of a fetch. This provides a 13% speedup. All benchmarks have been performed in www-gitlab-com. Changelog: performance
2022-03-15housekeeping: Track what kind of repacks OptimizeRepository doespks-housekeeping-improved-metricsPatrick Steinhardt
With OptimizeRepository becoming our new RPC which performs all repository maintenance in one place it's harder to tell what exactly is going on without proper metrics. One bit of information that gets lost is whether repacks are incremental or full repacks, which is quite an important distinction due to the impact on latency. Expose this information both via logs and via Prometheus to keep better track of it. While at it, alos exxpose information about whether we are writing bitmaps or not. Changelog: added
2022-03-15housekeeping: Expose pruned stale files via Prometheus metricPatrick Steinhardt
Expose the number of stale files we have pruned via a Prometheus metric so that it's easier to see what kinds of garbage we frequently need to clean up. Changelog: added
2022-03-15housekeeping: Generalize counter for pruned empty directoriesPatrick Steinhardt
We expose a Prometheus counter that is counting how many empty directories we have pruned. This is a broader concept in our housekeeping code, where we also prune other kinds of stale files. Generalize the counter into a counter vector such that we can reuse the same counter for all the different types of data we prune. While this breaks the metric in case it was used anywhere, there are no references to this counter across the complete GitLab group. Furthermore, we haven't ever guaranteed backwards compatibility for metrics anyway. Changelog: changed
2022-03-15housekeeping: Move global Prometheus metric into managerPatrick Steinhardt
While we have a central manager component which is supposed to hold all state related to housekeeping, we missed to migrate one of our metrics into it. Migrate it to get rid of one more global variable.
2022-03-15housekeeping: Log number of pruned ref dirs with other cleanupsPatrick Steinhardt
Because we may be deleting files which cause reference directories to become empty, pruning of empty reference directories needs to happen last in our housekeeping tasks. Because of this we also don't log info about pruned references in the log message which reports all the other removals. Refactor the code to return the number of pruned empty reference dirs from `removeRefEmptyDirs()` such that we can log them in the calling function with all the other cleanups.
2022-03-15housekeeping: Inline subtests which test pruning of specific filesPatrick Steinhardt
The test which verifies that we correctly prune specific files is using a subfunction to execute subtests, which makes the test hard to extend. Refactor it by inlining the subtests.
2022-03-15ci: Run Coverage job as unprivileged userPatrick Steinhardt
In 889450266 (ci: Run tests as unprivileged user, 2022-01-13) we have converted tests to run as unprivileged user. Back then we forgot to also adjust the Coverage job though, which is still running as a privileged user. Convert the job to also run tests unprivileged. This fixes an upcoming test failure we're about to introduce where housekeeping tasks remove a file that it shouldn't be able to because of a lack of permissions.
2022-03-15repository: Fix indeterministic voting when creating new repospks-repository-creation-fix-indeterministic-votingPatrick Steinhardt
When creating repositories we use transactional voting to determine that the repositories have been created the same on all nodes part of the transaction. This voting happens after we have seeded the repository, and the vote is computed by walking through the repository's directory and hashing all its files. We need to be careful though to skip files which we know to be indeterministic: - FETCH_HEAD may contain URLs which are different for each of the nodes. - Object packfiles contained in the object database are not deterministic, mostly because it may use multiple threads to compute deltas. Luckily, we do not have to rely on either of both types of files in order to ensure that the user-visible state of the repository is the same, so we can indeed just skip them. While we already have the logic to skip these files, this logic didn't work alright because we embarassingly forgot to actually return `fs.SkipDir` in case we see the object directory. So even though we thought we skipped these files, in reality we didn't. This bug has been manifesting in production in form of CreateFork, which regularly fails to reach quorum at random on a subset of nodes. The root cause here is that we use git-clone(1) to seed repository contents of the fork, which triggers exactly the case of indeterministic packfiles noted above. So any successful CreateFork RPC call really only succeeded by pure luck. Fix this issue by correctly skipping over "object" directories. While at it, fix how we skip over FETCH_HEAD by returning `nil`: it's a file and not a directory, so it doesn't make much sense to return `fs.SkipDir`. Changelog: fixed
2022-03-15Merge branch 'nanmu42-update-head' into 'master'Toon Claes
repository: Add updateHeadFromBundle in CreateRepositoryFromBundle Closes #4086 See merge request gitlab-org/gitaly!4401
2022-03-15repository: Add updateHeadFromBundle in CreateRepositoryFromBundleLI Zhennan
2022-03-14Update changelog for 14.8.3GitLab Release Tools Bot
[ci skip]
2022-03-14Merge branch 'smh-extend-invalid-source-repository' into 'master'James Fargher
Extend invalid metadata deletion logic to repos existin on target Closes #4083 See merge request gitlab-org/gitaly!4396
2022-03-11operations: Fix wrong error code when UserSquash conflictspks-user-squash-fix-error-code-with-improved-error-handlingPatrick Steinhardt
With the new improved error hadnling in UserSquash we're now returning errors in some cases where we previously didn't. One of those cases is when the rebase performed during the squash results in a merge conflict. While it is correct to return an error in this case, we're using an Internal error code for this case, which indicates that Gitaly is to blame instead of the parameters which have been passed by the user. Fix the error code to instead be FailedPrecondition. This error code is special-cased by our monitoring infrastructure to not raise any alerts. Note that this change is only fixing issues with monitoring: Rails handles the error alright by inspecting the error details instead of the error code. Changelog: fixed
2022-03-10Update VERSION filesv14.9.0-rc4GitLab Release Tools Bot
[ci skip]
2022-03-10Merge branch 'jc-user-rebase-structured-errors-proto' into 'master'John Cai
proto: Add structured error types for UserRebaseCofirmable See merge request gitlab-org/gitaly!4382
2022-03-10proto: Add structured error types for UserRebaseCofirmableJohn Cai
This commit introduces these changes by creating a new UserRebaseConfirmableError Protobuf message which contains all potential structured errors we want to return from the UserSquash RPC. Changelog: added
2022-03-10Merge branch 'smh-no-implicit-pool-creation' into 'master'Toon Claes
Disable implicit pool creation on link behind a feature flag See merge request gitlab-org/gitaly!4397
2022-03-09Update changelog for 14.7.5GitLab Release Tools Bot
[ci skip]
2022-03-09Merge branch 'pks-doc-git-execution-environments' into 'master'Patrick Steinhardt
doc: Document supported Git execution envinronments See merge request gitlab-org/gitaly!4392
2022-03-09Merge branch 'pks-user-squash-structured-errors' into 'master'John Cai
operations: Implement structured errors for UserSquash See merge request gitlab-org/gitaly!4374
2022-03-09doc: Document supported ways to access Git installationsPatrick Steinhardt
Document the different ways to access Git installations supported by Gitaly. Most importantly, this also documents the way our new bundled Git binaries work and why they were introduced.
2022-03-09Update VERSION filesv14.9.0-rc3GitLab Release Tools Bot
[ci skip]
2022-03-09Merge branch 'add-git2go-squash' into 'master'Patrick Steinhardt
Add squash parameter to git2go merge See merge request gitlab-org/gitaly!4241