gitlab.com/gitlab-org/gitaly.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2021-12-16	Merge branch 'pks-limithandler-remove-globals' into 'master'	John Cai
	limithandler: Refactor package to not use global state See merge request gitlab-org/gitaly!4197
2021-12-16	Merge branch 'pks-gittest-bin-path' into 'master'	Sami Hiltunen
	gittest: Accept Gitaly configuration instead of binary paths See merge request gitlab-org/gitaly!4203
2021-12-16	testdb: Fix stuck PgBouncer connections by killing via admin command	Patrick Steinhardt
	The PgBouncer-based test job is really flaky, where the symptom of the flakiness is that we fail to close all connections to the database on test cleanup. Seemingly, PgBouncer will in some situations keep around the connection without us being able to close them in the 20 second timeframe we allow the conection termination to take. While we could try to increase the timeout even further, this is not a good way to guarantee that terminations do indeed connect. Actually, we probably cannot require PgBouncer to really terminate connections only because we kill them in the Postgres backend itself: PgBouncer may reconnect automatically or anything else, so we should treat it as a black box which just does some automagic things we cannot control by issuing commands against Postgres. Luckily, PgBouncer has an admin console which allows us to take control over what it does directly. Instead of trying to terminate connections via Postgres, we can thus ask PgBouncer to terminate connections by issuing a `KILL` command, which will immediately terminate all client and server connections. Convert the test cleanup logic to do so. Given that we have test setups where the same database configuration (and thus database) is reused, we also need to issue a `RESUME` command such that subsequent test cases will be able to reconnect to the database.
2021-12-16	testdb: Rename functions to avoid stuttering	Patrick Steinhardt
	The functions `testdb.NewDB()` and `testdb.GetDBConfig()` both stutter. Rename them to `testdb.New()` and `testdb.GetConfig()`, respectively.
2021-12-16	glsql: Move test database implementation into testdb package	Patrick Steinhardt
	The glsql package hosts both the implementation used to connect to a "normal" production database as well as the logic to connect to a test database. Mixing production and test code like this is bad practice though. Move test-related functions into the "testdb" package to fix this. This change requires the glsql tests to use a separate package to break a cyclic dependency between the "glsql" and "testdb" package. And furthermore, the linter kicks in after this move and (legitly) complains about missing error checks and closes for rows.
2021-12-16	testdb: Rename file hosting logic to update health checks	Patrick Steinhardt
	The helper function to set up healthy nodes in the test database is hosted in a file "db.go". This has been fine until now given that it was the only function in this package anyway. But we're about to move the logic to set up a test database and connect to it into this package, too, at which point it becomes a bit awkward that it's hosted in such a generic file. Rename the file to "health.go" to prepare for the move.
2021-12-16	glsql: Delete unused mock querier	Patrick Steinhardt
	The `MockQuerier` structure is not used anywhere. Delete it.
2021-12-15	gittest: Convert object existence hook to accept config	Patrick Steinhardt
	The helper function which writes a Git hook which checks for object existence accepts a Git binary path as parameter. Convert it to instead accept a Gitaly configuration.
2021-12-15	gittest: Drop CreateCommitInAlternateObjectDirectory helper	Patrick Steinhardt
	The `CreateCommitInAlternateObjectDirectory()` helper function has been replaced by a new option for `WriteCommit()` and is now unused. Drop it.
2021-12-15	commit: Rewrite tests which write commits into alternate ODBs	Patrick Steinhardt
	Rewrite tests which write commits into alternate ODBs to use `gittest.WriteCommit()`. Note that the tests are converted to use bare Git repositories because the helper doesn't need a working tree. This is closer to how repositories would look like in production.
2021-12-15	repository: Rewrite tests which write commits into alternate ODBs	Patrick Steinhardt
	Rewrite tests which write commits into alternate ODBs to use `gittest.WriteCommit()`. Note that the tests are converted to use bare Git repositories because the helper doesn't need a working tree. This is closer to how repositories would look like in production.
2021-12-15	gittest: Add commit option to write objects into alternate ODB	Patrick Steinhardt
	We have several tests which write commit objects into the alternate object database. While some hand-code this, most use the helper function `CreateCommitInAlternateObjectDirectory()`. This helper requires manual assemlby of the command and is thus unnecessarily verbose to be used. Add a new option to the `WriteCommit()` helper which allows callers to specify an alternate ODB such that it can be used instead of the old helper.
2021-12-15	gittest: Refactor object existence helpers	Patrick Steinhardt
	The object existence helpers exposed by the gittest package have various problems: - They stutter due to their "GitObject" prefix. - They accept the Git binary path instead of a configuration. - The object ID is an untyped string. Refactor them to better match our current coding best practices. They're renamed to `RequireObject{,Not}Exists()`, accept the Gitaly config as parameter and the object ID is now of type `git.ObjectID`.
2021-12-15	limithandler: Run tests in parallel	Patrick Steinhardt
	Run tests in parallel, which is an easy thing to do now that the limithandler package doesn't rely on global state anymore.
2021-12-15	limithandler: Refactor package to not use global state	Patrick Steinhardt
	The limithandler package uses global variables to both track Prometheus metrics and to handle its configuration. Especially the latter is really fragile, where we need to set up state of the limithandler in various places by calling global functions. Fix this by making the `LimiterMiddleware` self-contained, where it hosts all configuration as well as the Prometheus metrics.
2021-12-15	gitaly: Inject the limit handler	Patrick Steinhardt
	The limit handler is currently created ad-hoc by Gitaly's server code. We're about to change the way it hosts Prometheus metrics though such that we can get rid of a set of global variables, and this will require us to make sure that there's only a single instance of the limiter. Inject the limit handler as a dependency when creating the server to prepare for this.
2021-12-15	limithandler: Move per-repository concurrency keyer	Patrick Steinhardt
	The limithandler supports use of different keys to limit concurrency by via a keyer function. The main function used by production code is hosted in an unrelated package, mostly because it's where the limiter is currently initialized. Move it into the limithandler package to make it available at a central place.
2021-12-15	limithandler: Split out monitor logic	Patrick Steinhardt
	The monitor logic is hosted in both the concurrency limiting code and in the metrics code. Move it into a separate file to make it more self-contained.
2021-12-15	limithandler: Reorder functions for better readability	Patrick Steinhardt
	The functions in the limithandler's middleware implementation are seemingly ordered at random. Reorder them to group structures and their receiver functions together. Furthermore, order constructors right after their struct.
2021-12-15	limithandler: Rename file hosting middleware implementation	Patrick Steinhardt
	The limithandler package implements both the concurrency-limiting primitives as well as the middleware used by gRPC. The files are confusingly named though, so it's not clear which file hosts which logic. Rename the "limithandler" file to "middleware" to clarify.
2021-12-14	Merge branch 'jc-repositories-missing-primary' into 'master'	John Cai
	praefect: add missing primaries check See merge request gitlab-org/gitaly!4176
2021-12-14	cmd/praefect: Add missing primaries check	John Cai
	Add a praefect startup check to get the number of repositories that have a missing primary. This check fails if there are any repositories are deemed unavailable. Changelog: added
2021-12-14	datastore: Export query that counts unavailable repos	John Cai
	Export the function that queries for unavailable repositories in preparation for a praefect startup check to utilizie it.
2021-12-14	lint: Disallow use of "normal" contexts	Patrick Steinhardt
	Our testhelper context has recently gained the ability to verify that feature flags are exercised as expected. This is an important safety guard which can help us to avoid releasing untested and thus potentially buggy code. Disallow usage of "normal" contexts which don't have this ability such that we do not introduce their usage by accident.
2021-12-14	tests: Convert to use testhelper contexts	Patrick Steinhardt
	We're about to disallow use of "normal" contexts created via the `context` package given that they do not perform sanity checks for feature flags. Instead, contexts should be created via the testhelper package. Convert tests to use the testhelper context.
2021-12-14	cache: Rework keyer tests to be independent	Patrick Steinhardt
	The keyer tests are all dependent on each other, and furthermore they're written in terms of one long test. This makes it hard to see the different parts we're trying to test, and furthermore it makes it harder than necessary to refactor those tests. Split up these tests into several subtests and have each of them use a separate cache such that there can be no interdependencies.
2021-12-14	testhelper: Add ability to create uncancellable contexts	Patrick Steinhardt
	In some contexts, we want to create test contexts which are not cancellable to exercise certain behaviour in case it is missing. It's still useful to inject feature flags in these cases though, so one shouldn't just reach for `context.Background()` in such a case. Introduce a new function `ContextWithoutCancel()` to address this usecase.
2021-12-14	testhelper: Drop cancel function from context options	Patrick Steinhardt
	There are no context options anymore which would set up a separate way to cancel the context given that deadlines and similar things are forbidden nowadays. As a result, the `func()` return value of context options isn't used at all anymore. Simplify context creation by dropping the cancel function from context options.
2021-12-14	testhelper: Simplify feature set test	Patrick Steinhardt
	The test for `FeatureSet`s is needlessly complex, doing a lot of voodoo with converting contexts. Simplify the test to instead just assert that incoming and outgoing metadata are exactly the same such that we can then proceed to only whether feature flags are set in one of both tests.
2021-12-14	featureflag: Fix setting incoming feature flags modifying parent context	Patrick Steinhardt
	When setting feature flags on a context, then the expectation is that it will only be set for the child context which is returned from the function. This doesn't work as expected though: the metadata we retrieve from the current context is not a copy, so setting any new values on it will modify the parent context, as well. Fix this bug by copying the context before setting the key. Somehow, this bug only seemed to impact incoming contexts: I wasn't able to demonstrate that this is broken for outgoing contexts, even though they should in theory suffer from the exact same problem. This commit fixes the issue for both incoming and outgoing contexts regardless. Changelog: fixed
2021-12-14	featureflag: Demonstrate feature flags modifying parent context	Patrick Steinhardt
	When appending feature flags to a context, the expectation is that the child context will have the feature flag set, but the parent context will remain unmodified. This expectation is broken for incoming contexts though, where setting a feature flag will set it in both contexts. Add a test to demonstrate that this is broken.
2021-12-14	Merge branch 'wc-ssh-err-log' into 'master'	James Fargher
	ssh: Log error on git command failure Closes #3865 See merge request gitlab-org/gitaly!4173
2021-12-13	Merge branch 'pks-bootstrap-rewrite-tests' into 'master'	John Cai
	bootstrap: Rewrite tests to not use timeouts See merge request gitlab-org/gitaly!4188
2021-12-13	ssh: Log error on git command failure	Will Chandler
	If the `git upload-pack` process forked by `SSHUploadPack` or `git upload-archive` from `SSHUploadArchive` exits with a non-zero return code, the result of executing `stream.Send` to respond to the client will be returned, causing the request to be seen as successful by Gitaly. Clients will correctly receive the non-zero return code, but a status of `OK` will be logged and the `grpc_server_handled_total` metric will increment for `grpc_code=OK`. This commit corrects `SSHUploadPack` and `SSHArchivePack` to return an error when git fails. Changelog: fixed
2021-12-13	Merge branch 'jc-replicate-immediately' into 'master'	John Cai
	praefect: replicate immediately after track-repository Closes #3892 and #3909 See merge request gitlab-org/gitaly!4183
2021-12-13	cmd/praefect: replicate immediately after track-repository	John Cai
	track-repository is a subcommand that tracks a repository in the praefect database. This is generally used by an admin when a repository is on disk but not tracked in the praefect db. Once tracked, replication to the other nodes takes about 5 minutes and requires the Praefect process to be running. This MR adds a flag that, if set, will immediately process a relication job and wait for it to finish. Changelog: added
2021-12-13	Merge branch 'pks-enable-atomic-rename-repository' into 'master'	Pavlo Strokov
	repository: Always enable locking RenameRepository RPC Closes #3950 See merge request gitlab-org/gitaly!4194
2021-12-13	bootstrap: Use ticker to get rid of test timeouts	Patrick Steinhardt
	The bootstrapping tests still need to use timeouts because of the grace period, which is the duration we give the server to exit before we force the upgrade. Convert the interface to instead accept tickers such that we can get rid of using timeouts in our tests.
2021-12-13	bootstrap: Rewrite tests to not use HTTP server	Patrick Steinhardt
	Our tests for the bootstrap package are quite complex and really hard to understand. Furthermore, they're using timeouts everywhere, which slows down execution of the tests, futher obscures what we actually want to test, and has the potential for flakiness. Rewrite the tests to not rely on the HTTP server anymore. This server was used to simulate slow requests which caused us to exceed the grace period in some cases, but doing so via an HTTP server is roundabout: we can just block test execution via channels in a much easier way. Instead, we now use a set of channels which allow us control when the upgrader is doing various things, like when the upgrade process starts or when the parent process supposedly exits. This allows us to directly exercise what we want to in a much more direct way. Furthermore, tests are refactored to not use real listeners anymore. We do not care about whether the listeners work as expected: lifetime wouldn't be controlled by the bootstrapper anyway, but instead by the shutdown function that's provided. Using a mock listener thus allows us to test the effects we want to see on the listeners directly.
2021-12-13	Merge branch 'pks-praefect-remove-repository-locking-semantics' into 'master'	Pavlo Strokov
	praefect: Support new locking semantics in RemoveRepository handler See merge request gitlab-org/gitaly!4187
2021-12-13	repository: Always enable locking RenameRepository RPC	Patrick Steinhardt
	We have been pushing our RPCs which manage repository locations to always use locking semantics on the source repository and, if exists, on the target repository. Like this, we cannot ever have two RPCs which try to create and/or remove a repository location at the same point in time. This allows us to properly use transactional voting for these RPCs. In 0118edab0 (repository: Implement locking for RenameRepository, 2021-12-06), we have introduced locking support for the RenameRepository RPC. This change has been successfull rolled out to production without any issues on December 10th, so the change seems to work as intended. Remove the feature flag and always enable locking semantics for RenameRepository. Changelog: changed
2021-12-13	Merge branch 'pks-ff-tx-atomic-repository-creation-default-enable' into 'master'	Patrick Steinhardt
	featureflag: Enable atomic repository creation by default See merge request gitlab-org/gitaly!4181
2021-12-13	featureflag: Enable atomic repository creation by default	Patrick Steinhardt
	With f87bc1e98 (Merge branch 'pks-create-repository-atomic' into 'master', 2021-11-24), we have implemented support for atomic repository creation. With this mode, creation of repositories will use locking and meaningful transactional voting to create repos in a race-free manner. The feature flag guarding the new behaviour has been rolled out on December 6th, and no issues have surfaced since then. But the change is rather high-risk given that it changes semantics of some of the RPCs to return errors in case the target repository exists already, which wasn't the case previously. To be on the safe side, we thus want to keep this feature flag for at least one release such that we have an escape hatch in case there are issues with this change which we haven't yet seen. Default-enable the flag. Changelog: changed
2021-12-13	transaction: Wire up transactional voting phases	Patrick Steinhardt
	Convert all callsites which perform transactional voting to specify the phase the vote is being cast in. This information is not yet used by Praefect yet, but serves as the baseline to make more informed replication decisions in Praefect's at some point. Ideally, all sites would either use "prepared" or "committed" as their phase. But we have some callsites which don't use proper two-phase voting, which thus cast their votes with "unknown" phase. These will need to get converted eventually.
2021-12-13	tests: Convert to use `transaction.TrackingManager`	Patrick Steinhardt
	Convert tests to use the new `transaction.TrackingManager`, which simply records all votes which have been cast.
2021-12-13	voting: Add function to create vote from hex-encoded string	Patrick Steinhardt
	We'll need to create votes from hex-encoded strings. Add a helper function which does this for us.
2021-12-13	transaction: Introduce tracking manager	Patrick Steinhardt
	A lot of tests need to create a mock transaction manager to track the transactional votes and assert that we've got the votes we expected. This is quite repetitive and ultimately requires us to change a lot of function signatures if the `transaction.Manager` interface changes. Improve the situation by introducing a new `TrackingManager` which records all votes.
2021-12-11	praefect: change replication manager to use field logger	John Cai
	Passing in an interface makes for a more flexible signature and allows for easier testing with other code.
2021-12-10	backup: Replace use of standard contexts with testhelper ones	Patrick Steinhardt
	Tests should always use the testhelper package to create contexts such that we can benefit from the feature flag sanity checks. Convert backup tests to do so.
2021-12-10	backup: Parallelize tests	Patrick Steinhardt
	Use better parallelization for tests by adding `t.Parallel()` calls as required. Like this, tests drop from 28 seconds of execution time to only 8 seconds.