## Gitaly Team Process

### Feature flags

Gitaly uses feature flags to safely roll out features in production. Feature flags are part of the `context.Context` of each RPC. The `featureflag` package will help you with flow control.

Most of this documentation assumes operations on `gitlab.com`. For customers, an [HTTP API is available][ff-api].

In order to roll out feature flags to `gitlab.com`, you should follow the documented rollout process below.

Once you have [developed your feature][feature-development] you [start by creating an issue for the rollout][issue-for-feature-rollout].

The "Feature Flag Roll Out" [template for the issue][feature-issue-template] has a checklist for the rest of the steps.

[ff-api]: https://docs.gitlab.com/ee/api/features.html#features-flags-api
[feature-development]: https://docs.gitlab.com/ee/development/feature_flags/index.html
[issue-for-feature-rollout]: https://gitlab.com/gitlab-org/gitaly/-/issues/new?issuable_template=Feature%20Flag%20Roll%20Out
[feature-issue-template]: https://gitlab.com/gitlab-org/gitaly/-/blob/master/.gitlab/issue_templates/Feature%20Flag%20Roll%20Out.md

#### Use and limitations

Feature flags are [enabled through chatops][enable-flags] (which is just a consumer [of the API][ff-api]). In [`#chat-ops-test`][chan-chat-ops-test] try:

    /chatops run feature list --match gitaly_

If you get a permission error you need to request access first. That can be done [in the `#production` channel][production-request-acl].

For Gitaly, you have to prepend `gitaly_` to your feature flag when enabling or disabling. For example: to check if [`gitaly_go_user_delete_tag`][chan-production] is enabled on staging run:

    /chatops run feature get gitaly_go_user_delete_tag --staging

Note that the full set of chatops features for the Rails environment does not work in Gitaly. E.g. the [`--user` argument does not][bug-user-argument], neither does [enabling by group or project][bug-project-argument].

[enable-flags]: https://docs.gitlab.com/ee/development/feature_flags/controls.html
[chan-chat-ops-test]: https://gitlab.slack.com/archives/CB2S7NNDP
[production-request-acl]: https://gitlab.slack.com/archives/C101F3796
[chan-production]: https://gitlab.com/gitlab-org/gitaly/-/issues/3371
[bug-user-argument]: https://gitlab.com/gitlab-org/gitaly/-/issues/3385
[bug-project-argument]: https://gitlab.com/gitlab-org/gitaly/-/issues/3386

### Feature flags issue checklist

The rest of this section is help for the individual checklist steps in [the issue template][feature-issue-template]. If this is your first time doing this, you might want to first skip ahead to the help below, since you'll likely need to file some access requests.

#### Feature flag labels

The lifecycle of feature flags is monitored via issue labels.

When the issue is created from a template it'll be created with [`featureflag::disabled`][featureflag-disabled]. Then as part of the checklist the person rolling it out will add the [`featureflag::staging`][featureflag-staging] and [`featureflag::production`][featureflag-production] labels to it.

[featureflag-disabled]: https://gitlab.com/gitlab-org/gitaly/-/issues?label_name[]=featureflag%3A%3Adisabled
[featureflag-staging]: https://gitlab.com/gitlab-org/gitaly/-/issues?label_name[]=featureflag%3A%3Astaging
[featureflag-production]: https://gitlab.com/gitlab-org/gitaly/-/issues?label_name[]=featureflag%3A%3Aproduction

#### Is the required code deployed?

A quick way to see if your MR is deployed is to check if [the release bot][release-bot] has deployed it to staging, canary or production by checking if the MR has [a `workflow::staging`][deployed-staging], [`workflow::canary`][deployed-canary] or [`workflow::production`][deployed-production] label.

The [/help action on gitlab.com][help-action] shows the currently deployed hash. Copy that `HASH` and look at `GITALY_SERVER_VERSION` in [gitlab-org/gitlab.git][gitlab-git] to see what the embedded gitaly version is.
Or in [a gitaly.git checkout][gitaly-git] run this to see what commits aren't deployed yet:

    git fetch
    git shortlog $(curl -s https://gitlab.com/gitlab-org/gitlab/-/raw/HASH/GITALY_SERVER_VERSION)..origin/master

See the [documentation on releases below](#gitaly-releases) for more details on the tagging and release process.

[release-bot]: https://gitlab.com/gitlab-release-tools-bot
[deployed-staging]: https://gitlab.com/gitlab-org/gitaly/-/merge_requests?state=merged&label_name=workflow%3A%3Astaging
[deployed-canary]: https://gitlab.com/gitlab-org/gitaly/-/merge_requests?state=merged&label_name=workflow%3A%3Acanary
[deployed-production]: https://gitlab.com/gitlab-org/gitaly/-/merge_requests?state=merged&label_name=workflow%3A%3Aproduction
[help-action]: https://gitlab.com/help
[gitlab-git]: https://gitlab.com/gitlab-org/gitlab/
[gitaly-git]: https://gitlab.com/gitlab-org/gitaly/

#### Do we need a change management issue?

#### Enable on staging

##### Prerequisites

You'll need chatops access. See [above](#use-and-limitations).

##### Steps

Run:

`/chatops run feature set gitaly_X true --staging`

Where `X` is the name of your feature.

#### Test on staging

##### Prerequisites

Access to https://staging.gitlab.com/users is not the same as on gitlab.com (or signing in with Google on the @gitlab.com account). You must [request access to it][staging-access-request].

As of December 2020 clicking "Sign in" on https://about.staging.gitlab.com will redirect to https://gitlab.com, so make sure to use the `/users` link.

As of writing, signing in at [that link][staging-users-link] will land you on the `/users` 404 page once you're logged in. You should then typically manually change the URL to `https://staging.gitlab.com/YOURUSER` (e.g. https://staging.gitlab.com/avar), or find another way to get at a test repository, and manually test from there.

[staging-access-request]: https://gitlab.com/gitlab-com/team-member-epics/access-requests/-/issues/new?issuable_template=Individual_Bulk_Access_Request
[staging-users-link]: https://staging.gitlab.com/users

##### Steps

Manually use the feature in whatever way exercises the code paths being enabled.

Then enable `X` on staging, with:

    /chatops run feature set gitaly_X true --staging

##### Discussion

It's a good idea to run the feature for a full day on staging, because smoke tests run daily in that environment. These are handled by [gitlab-org/gitlab-qa.git][gitlab-qa-git].

[gitlab-qa-git]: https://gitlab.com/gitlab-org/gitlab-qa#how-do-we-use-it

#### Enable in production

##### Prerequisites

Have you waited enough time with the feature running in the staging environment? Good!

##### Steps

To enable your `X` feature at 5/25/50 percent, run:

    /chatops run feature set gitaly_X 5
    /chatops run feature set gitaly_X 25
    /chatops run feature set gitaly_X 50

And then finally when you're happy it works properly do:

    /chatops run feature set gitaly_X 100

Followed by:

    /chatops run feature set gitaly_X true

Note that you need both the `100` and `true` as separate commands. See [the documentation on actor gates][actor-gates].

If the feature is left at `50%` but is also set to `true` by default the `50%` will win, even if `OnByDefault: true` is [set for it](#feature-lifecycle-after-it-is-live). It'll only be 100% live once the feature flag code is deleted. So make sure you don't skip the `100%` step.

[actor-gates]: https://docs.gitlab.com/ee/development/feature_flags/controls.html#process

##### Discussion

What percentages should you pick and how long should you wait?

It makes sense to be aggressive about getting to 50% and then 100% as soon as possible.

You should use lower percentages only as a paranoia check to make sure that it e.g. doesn't spew errors at users unexpectedly at a high rate, or (e.g.
if it invokes a new expensive `git` command) doesn't create runaway load on our servers.

But running at, say, 5% for hours after we already have sufficient data to demonstrate that we won't be spewing errors or taking down the site just means you're delaying getting more data to be certain that it works properly.

Nobody's better off if you wait 10 hours at 1% to get error data you could have waited 1 hour at 10% to get, or just over 10 minutes with close monitoring at 50%.

#### Feature lifecycle after it is live

##### Discussion

After a feature has been running at `100%` for whatever is deemed to be a safe amount of time we should change it to be `OnByDefault: true`. See [this MR for an example][example-on-by-default-mr].

We should add a changelog entry when `OnByDefault: true` is flipped.

That should then be followed up by another MR to remove the pre-feature code from the codebase, and we should add another changelog entry when doing that.

This is because even after setting `OnByDefault: true` users might still have opted to disable the new feature. See [the discussion below](#two-phase-ruby-to-go-rollouts) for possibly needing to do such changes over multiple releases.

[example-on-by-default-mr]: https://gitlab.com/gitlab-org/gitaly/-/merge_requests/3033

##### Two phase Ruby to Go rollouts

Depending on what the feature does it may be bad to remove the `else` branch where we have the feature disabled at this point. E.g. if it's a rewrite of Ruby code in Go.

As we deploy, the Ruby code might be in the middle of auto-restarting, so we could remove its code before the Go code has a chance to update with its default, while the Go code would still want to call it. Therefore you need to do any such removal in two gitlab.com release cycles.

See the example of [MR !3033][example-on-by-default-mr] and [MR !3056][example-post-go-ruby-code-removal-mr] for how to do such a two-phase removal.
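
To make the lifecycle above more concrete, the following is a minimal sketch of why the `else` branch has to outlive the `OnByDefault: true` change. It assumes a simplified model in which an explicit setting always wins over the default; `featureFlag`, `enabled` and `userDeleteTag` are hypothetical names for this illustration and are not the real `featureflag` package API.

```go
package main

import "fmt"

// featureFlag is a hypothetical stand-in for a feature flag definition.
type featureFlag struct {
	Name        string
	OnByDefault bool
}

// enabled resolves the flag: an explicit setting wins, and OnByDefault
// only applies when nothing was set explicitly.
func (f featureFlag) enabled(explicit map[string]bool) bool {
	if v, ok := explicit[f.Name]; ok {
		return v
	}
	return f.OnByDefault
}

// userDeleteTag keeps both code paths until the flag itself is removed.
func userDeleteTag(f featureFlag, explicit map[string]bool) string {
	if f.enabled(explicit) {
		return "go"
	}
	// Pre-feature code path. It has to stay for as long as anything can
	// still disable the flag or still needs the old implementation.
	return "ruby"
}

func main() {
	f := featureFlag{Name: "go_user_delete_tag", OnByDefault: true}

	// Nothing set explicitly: the default decides.
	fmt.Println(userDeleteTag(f, nil))

	// Someone opted out: the explicit setting overrides OnByDefault.
	fmt.Println(userDeleteTag(f, map[string]bool{f.Name: false}))
}
```

In this model, deleting the `"ruby"` branch in the same change that flips the default would break the opted-out case, which is the reason for splitting the removal into a later MR.
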

[example-on-by-default-mr]: https://gitlab.com/gitlab-org/gitaly/-/merge_requests/3033
[example-post-go-ruby-code-removal-mr]: https://gitlab.com/gitlab-org/gitaly/-/merge_requests/3056

##### Remove the feature flag via chatops

After completing the above steps the feature flag should be deleted from the database of available features via `chatops`.

If you don't do this others will continue to see the features with e.g.:

    /chatops run feature list --match=gitaly_

It also incrementally adds to data that needs to be fetched & populated on every request.

To remove the flag, first sanity check that it's the feature you want and that it's at [`100%` and is `true`](#enable-in-production):

    /chatops run feature get gitaly_X

Then delete it if that's the data you're expecting:

    /chatops run feature delete gitaly_X

### Gitaly Releases

Gitaly releases are tagged automatically by [`release-tools`][release-tools] when a Release Manager tags a GitLab version.

[release-tools]: https://gitlab.com/gitlab-org/release-tools

#### Major or minor releases

Once we release GitLab X.Y.0, we also release gitaly X.Y.0 based on the content of `GITALY_SERVER_VERSION`. This version file is automatically updated by `release-tools` during auto-deploy picking.

Because gitaly master is moving we need to take extra care of what we tag.

Let's imagine a situation like this on `master`:

```mermaid
graph LR;
  A-->b0;
  A-->B;
  b0:::branch-->b1;
  b1:::branch-->B;
  B-->C;
  B-->c0;
  c0:::branch-->C;
  classDef branch fill:#f96;
  classDef tag fill:yellow;
```

Commit `C` is picked into auto-deploy and the build is successfully deployed to production.

We are ready to tag `v12.9.0` but there is a new merge commit, `D`, on gitaly `master`.

```mermaid
graph LR;
  A-->b0;
  A-->B;
  b0:::branch-->b1;
  b1:::branch-->B;
  B-->C;
  B-->c0;
  c0:::branch-->C;
  C-->D;
  C-->d0;
  d0:::branch-->D;
  classDef branch fill:#f96;
  classDef tag fill:yellow;
```

We cannot tag on `D` as it never reached production.

`release-tools` follows this algorithm: 1.
create a stable branch from `GITALY_SERVER_VERSION` (commit `C`), 1. bump the version and 1. prepare the changelog (commit `C'`).

Then we tag this commit and we merge back to `master`:

```mermaid
graph LR;
  A-->b0;
  A-->B;
  b0:::branch-->b1;
  b1:::branch-->B;
  B-->C;
  B-->c0;
  c0:::branch-->C;
  C-->D;
  C-->d0;
  d0:::branch-->D;
  C-->C';
  id1>v12.9.0]:::tag-->C';
  D-->E;
  C':::stable-->E;
  classDef branch fill:#f96;
  classDef tag fill:yellow;
  classDef stable fill:green;
```

Legend

```mermaid
graph TD;
  A["master commit"];
  b0["feature branch commit"]:::branch;
  id1>tag]:::tag;
  C["stable branch commit"]:::stable;
  classDef branch fill:#f96;
  classDef tag fill:yellow;
  classDef stable fill:green;
```

With this solution, the team can autonomously tag any RC they like, but the other releases are handled by the GitLab tagging process.

#### Patch releases

The Gitaly team usually works on patch releases in the context of a security release.

The release automation creates the stable branches, and tagging the stable branch is automated in `release-tools` as well. A Gitaly maintainer will only take care of merging the fixes on the stable branch.

For patch releases, we don't merge back to master. But `release-tools` will commit a changelog update to both the patch release and the master branch.

#### Creating a release candidate

A release candidate (RC) can be created with a chatops command. This is the only type of release that a developer can build autonomously.

When working on a GitLab feature that requires a minimum gitaly version, tagging an RC is a good way to make sure the gitlab feature branch has the proper gitaly version.

- Pick the current milestone (i.e.
12.9)
- Pick a release candidate number; you can check `VERSION` to see if we have one already (12.9.0-rc1)
- Run `/chatops run gitaly tag 12.9.0-rc1`
- The release will be published
- The [pipeline of a tag](https://gitlab.com/gitlab-org/gitaly/pipelines?scope=tags&page=1) has a **manual** job, `update-downstream-server-version`, that will create a merge request on the GitLab codebase to bump the Gitaly server version, and this will be assigned to you. Once the build has completed successfully, assign it to a maintainer for review.

### Publishing the ruby gem

If an updated version of the ruby proto gem is needed, it can be published to rubygems.org with the `_support/publish-gem` script.

If the changes needed are not yet released, [create a release candidate](#creating-a-release-candidate) first.

- Check out the tag to publish (vX.Y.Z)
- Run `_support/publish-gem X.Y.Z`

### Publishing the go module

If an [updated version](https://golang.org/doc/modules/release-workflow) of the go module is needed, it can be [published](https://golang.org/doc/modules/publishing) by tag creation.

If a new [major module version update](https://golang.org/doc/modules/major-version) is needed, it can be changed by running the `upgrade-module` `make` task with the desired parameters:

```bash
make upgrade-module FROM_MODULE=v14 TO_MODULE=v15
```

It replaces old imports with the new version in the go source files, updates `*.proto` files and modifies the `go.mod` file to use the new target version of the module.

##### Security release

Security releases involve additional processes to ensure that recent releases of GitLab are properly patched while avoiding the leaking of the security details to the public until appropriate.

Before beginning work on a security fix, open a new Gitaly issue with the template `Security Release` and follow the instructions at the top of the page for following the template.

### Experimental builds

Push the release tag to dev.gitlab.org/gitlab/gitaly.
After passing the test suite, the tag will automatically be built and published in https://packages.gitlab.com/gitlab/unstable.

### Patching git

The Gitaly project is the single source of truth for the Git distribution across all of GitLab: all downstream distributions use the `make git` target to build and install the git version used at runtime. Given that there is only one central location where we define the git version and its features, this grants us the possibility to easily apply custom patches to git.

In order for a custom patch to be accepted into the Gitaly project, it must meet the high bar of being at least in the upstream's `next` branch. The mechanism is thus intended as a process to ensure that we can test upstreamed patches faster than having to wait for the next release, not to add patches which would never be accepted upstream. Patches which were not upstreamed yet will not be accepted: at no point in time do we want to start maintaining a friendly fork of git.

In order to add a patch, you can simply add it to the `GIT_PATCHES` array in our `Makefile`.

Note: while there is only a single git distribution which is distributed across all of GitLab's official distributions, there may be unofficial ones which use a different version of git (most importantly source-based installations). So even if you add patches to Gitaly's Makefile, you cannot assume that installations will always have these patches. As a result, all code which makes use of patched-in features must have fallback code to support the [minimum required Git version](../README.md#installation).