Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitaly.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-12-16Merge branch 'pks-limithandler-remove-globals' into 'master'John Cai
limithandler: Refactor package to not use global state See merge request gitlab-org/gitaly!4197
2021-12-16Merge branch 'pks-gittest-bin-path' into 'master'Sami Hiltunen
gittest: Accept Gitaly configuration instead of binary paths See merge request gitlab-org/gitaly!4203
2021-12-16testdb: Fix stuck PgBouncer connections by killing via admin commandPatrick Steinhardt
The PgBouncer-based test job is really flaky, where the symptom of the flakiness is that we fail to close all connections to the database on test cleanup. Seemingly, PgBouncer will in some situations keep around the connection without us being able to close them in the 20 second timeframe we allow the conection termination to take. While we could try to increase the timeout even further, this is not a good way to guarantee that terminations do indeed connect. Actually, we probably cannot require PgBouncer to really terminate connections only because we kill them in the Postgres backend itself: PgBouncer may reconnect automatically or anything else, so we should treat it as a black box which just does some automagic things we cannot control by issuing commands against Postgres. Luckily, PgBouncer has an admin console which allows us to take control over what it does directly. Instead of trying to terminate connections via Postgres, we can thus ask PgBouncer to terminate connections by issuing a `KILL` command, which will immediately terminate all client and server connections. Convert the test cleanup logic to do so. Given that we have test setups where the same database configuration (and thus database) is reused, we also need to issue a `RESUME` command such that subsequent test cases will be able to reconnect to the database.
2021-12-16testdb: Rename functions to avoid stutteringPatrick Steinhardt
The functions `testdb.NewDB()` and `testdb.GetDBConfig()` both stutter. Rename them to `testdb.New()` and `testdb.GetConfig()`, respectively.
2021-12-16glsql: Move test database implementation into testdb packagePatrick Steinhardt
The glsql package hosts both the implementation used to connect to a "normal" production database as well as the logic to connect to a test database. Mixing production and test code like this is bad practice though. Move test-related functions into the "testdb" package to fix this. This change requires the glsql tests to use a separate package to break a cyclic dependency between the "glsql" and "testdb" package. And furthermore, the linter kicks in after this move and (legitly) complains about missing error checks and closes for rows.
2021-12-16testdb: Rename file hosting logic to update health checksPatrick Steinhardt
The helper function to set up healthy nodes in the test database is hosted in a file "db.go". This has been fine until now given that it was the only function in this package anyway. But we're about to move the logic to set up a test database and connect to it into this package, too, at which point it becomes a bit awkward that it's hosted in such a generic file. Rename the file to "health.go" to prepare for the move.
2021-12-16glsql: Delete unused mock querierPatrick Steinhardt
The `MockQuerier` structure is not used anywhere. Delete it.
2021-12-15gittest: Convert object existence hook to accept configPatrick Steinhardt
The helper function which writes a Git hook which checks for object existence accepts a Git binary path as parameter. Convert it to instead accept a Gitaly configuration.
2021-12-15gittest: Drop CreateCommitInAlternateObjectDirectory helperPatrick Steinhardt
The `CreateCommitInAlternateObjectDirectory()` helper function has been replaced by a new option for `WriteCommit()` and is now unused. Drop it.
2021-12-15commit: Rewrite tests which write commits into alternate ODBsPatrick Steinhardt
Rewrite tests which write commits into alternate ODBs to use `gittest.WriteCommit()`. Note that the tests are converted to use bare Git repositories because the helper doesn't need a working tree. This is closer to how repositories would look like in production.
2021-12-15repository: Rewrite tests which write commits into alternate ODBsPatrick Steinhardt
Rewrite tests which write commits into alternate ODBs to use `gittest.WriteCommit()`. Note that the tests are converted to use bare Git repositories because the helper doesn't need a working tree. This is closer to how repositories would look like in production.
2021-12-15gittest: Add commit option to write objects into alternate ODBPatrick Steinhardt
We have several tests which write commit objects into the alternate object database. While some hand-code this, most use the helper function `CreateCommitInAlternateObjectDirectory()`. This helper requires manual assemlby of the command and is thus unnecessarily verbose to be used. Add a new option to the `WriteCommit()` helper which allows callers to specify an alternate ODB such that it can be used instead of the old helper.
2021-12-15gittest: Refactor object existence helpersPatrick Steinhardt
The object existence helpers exposed by the gittest package have various problems: - They stutter due to their "GitObject" prefix. - They accept the Git binary path instead of a configuration. - The object ID is an untyped string. Refactor them to better match our current coding best practices. They're renamed to `RequireObject{,Not}Exists()`, accept the Gitaly config as parameter and the object ID is now of type `git.ObjectID`.
2021-12-15limithandler: Run tests in parallelPatrick Steinhardt
Run tests in parallel, which is an easy thing to do now that the limithandler package doesn't rely on global state anymore.
2021-12-15limithandler: Refactor package to not use global statePatrick Steinhardt
The limithandler package uses global variables to both track Prometheus metrics and to handle its configuration. Especially the latter is really fragile, where we need to set up state of the limithandler in various places by calling global functions. Fix this by making the `LimiterMiddleware` self-contained, where it hosts all configuration as well as the Prometheus metrics.
2021-12-15gitaly: Inject the limit handlerPatrick Steinhardt
The limit handler is currently created ad-hoc by Gitaly's server code. We're about to change the way it hosts Prometheus metrics though such that we can get rid of a set of global variables, and this will require us to make sure that there's only a single instance of the limiter. Inject the limit handler as a dependency when creating the server to prepare for this.
2021-12-15limithandler: Move per-repository concurrency keyerPatrick Steinhardt
The limithandler supports use of different keys to limit concurrency by via a keyer function. The main function used by production code is hosted in an unrelated package, mostly because it's where the limiter is currently initialized. Move it into the limithandler package to make it available at a central place.
2021-12-15limithandler: Split out monitor logicPatrick Steinhardt
The monitor logic is hosted in both the concurrency limiting code and in the metrics code. Move it into a separate file to make it more self-contained.
2021-12-15limithandler: Reorder functions for better readabilityPatrick Steinhardt
The functions in the limithandler's middleware implementation are seemingly ordered at random. Reorder them to group structures and their receiver functions together. Furthermore, order constructors right after their struct.
2021-12-15limithandler: Rename file hosting middleware implementationPatrick Steinhardt
The limithandler package implements both the concurrency-limiting primitives as well as the middleware used by gRPC. The files are confusingly named though, so it's not clear which file hosts which logic. Rename the "limithandler" file to "middleware" to clarify.
2021-12-14Merge branch 'jc-repositories-missing-primary' into 'master'John Cai
praefect: add missing primaries check See merge request gitlab-org/gitaly!4176
2021-12-14cmd/praefect: Add missing primaries checkJohn Cai
Add a praefect startup check to get the number of repositories that have a missing primary. This check fails if there are any repositories are deemed unavailable. Changelog: added
2021-12-14datastore: Export query that counts unavailable reposJohn Cai
Export the function that queries for unavailable repositories in preparation for a praefect startup check to utilizie it.
2021-12-14lint: Disallow use of "normal" contextsPatrick Steinhardt
Our testhelper context has recently gained the ability to verify that feature flags are exercised as expected. This is an important safety guard which can help us to avoid releasing untested and thus potentially buggy code. Disallow usage of "normal" contexts which don't have this ability such that we do not introduce their usage by accident.
2021-12-14tests: Convert to use testhelper contextsPatrick Steinhardt
We're about to disallow use of "normal" contexts created via the `context` package given that they do not perform sanity checks for feature flags. Instead, contexts should be created via the testhelper package. Convert tests to use the testhelper context.
2021-12-14cache: Rework keyer tests to be independentPatrick Steinhardt
The keyer tests are all dependent on each other, and furthermore they're written in terms of one long test. This makes it hard to see the different parts we're trying to test, and furthermore it makes it harder than necessary to refactor those tests. Split up these tests into several subtests and have each of them use a separate cache such that there can be no interdependencies.
2021-12-14testhelper: Add ability to create uncancellable contextsPatrick Steinhardt
In some contexts, we want to create test contexts which are not cancellable to exercise certain behaviour in case it is missing. It's still useful to inject feature flags in these cases though, so one shouldn't just reach for `context.Background()` in such a case. Introduce a new function `ContextWithoutCancel()` to address this usecase.
2021-12-14testhelper: Drop cancel function from context optionsPatrick Steinhardt
There are no context options anymore which would set up a separate way to cancel the context given that deadlines and similar things are forbidden nowadays. As a result, the `func()` return value of context options isn't used at all anymore. Simplify context creation by dropping the cancel function from context options.
2021-12-14testhelper: Simplify feature set testPatrick Steinhardt
The test for `FeatureSet`s is needlessly complex, doing a lot of voodoo with converting contexts. Simplify the test to instead just assert that incoming and outgoing metadata are exactly the same such that we can then proceed to only whether feature flags are set in one of both tests.
2021-12-14featureflag: Fix setting incoming feature flags modifying parent contextPatrick Steinhardt
When setting feature flags on a context, then the expectation is that it will only be set for the child context which is returned from the function. This doesn't work as expected though: the metadata we retrieve from the current context is not a copy, so setting any new values on it will modify the parent context, as well. Fix this bug by copying the context before setting the key. Somehow, this bug only seemed to impact incoming contexts: I wasn't able to demonstrate that this is broken for outgoing contexts, even though they should in theory suffer from the exact same problem. This commit fixes the issue for both incoming and outgoing contexts regardless. Changelog: fixed
2021-12-14featureflag: Demonstrate feature flags modifying parent contextPatrick Steinhardt
When appending feature flags to a context, the expectation is that the child context will have the feature flag set, but the parent context will remain unmodified. This expectation is broken for incoming contexts though, where setting a feature flag will set it in both contexts. Add a test to demonstrate that this is broken.
2021-12-14Merge branch 'wc-ssh-err-log' into 'master'James Fargher
ssh: Log error on git command failure Closes #3865 See merge request gitlab-org/gitaly!4173
2021-12-13Merge branch 'pks-bootstrap-rewrite-tests' into 'master'John Cai
bootstrap: Rewrite tests to not use timeouts See merge request gitlab-org/gitaly!4188
2021-12-13ssh: Log error on git command failureWill Chandler
If the `git upload-pack` process forked by `SSHUploadPack` or `git upload-archive` from `SSHUploadArchive` exits with a non-zero return code, the result of executing `stream.Send` to respond to the client will be returned, causing the request to be seen as successful by Gitaly. Clients will correctly receive the non-zero return code, but a status of `OK` will be logged and the `grpc_server_handled_total` metric will increment for `grpc_code=OK`. This commit corrects `SSHUploadPack` and `SSHArchivePack` to return an error when git fails. Changelog: fixed
2021-12-13Merge branch 'jc-replicate-immediately' into 'master'John Cai
praefect: replicate immediately after track-repository Closes #3892 and #3909 See merge request gitlab-org/gitaly!4183
2021-12-13cmd/praefect: replicate immediately after track-repositoryJohn Cai
track-repository is a subcommand that tracks a repository in the praefect database. This is generally used by an admin when a repository is on disk but not tracked in the praefect db. Once tracked, replication to the other nodes takes about 5 minutes and requires the Praefect process to be running. This MR adds a flag that, if set, will immediately process a relication job and wait for it to finish. Changelog: added
2021-12-13Merge branch 'pks-enable-atomic-rename-repository' into 'master'Pavlo Strokov
repository: Always enable locking RenameRepository RPC Closes #3950 See merge request gitlab-org/gitaly!4194
2021-12-13bootstrap: Use ticker to get rid of test timeoutsPatrick Steinhardt
The bootstrapping tests still need to use timeouts because of the grace period, which is the duration we give the server to exit before we force the upgrade. Convert the interface to instead accept tickers such that we can get rid of using timeouts in our tests.
2021-12-13bootstrap: Rewrite tests to not use HTTP serverPatrick Steinhardt
Our tests for the bootstrap package are quite complex and really hard to understand. Furthermore, they're using timeouts everywhere, which slows down execution of the tests, futher obscures what we actually want to test, and has the potential for flakiness. Rewrite the tests to not rely on the HTTP server anymore. This server was used to simulate slow requests which caused us to exceed the grace period in some cases, but doing so via an HTTP server is roundabout: we can just block test execution via channels in a much easier way. Instead, we now use a set of channels which allow us control when the upgrader is doing various things, like when the upgrade process starts or when the parent process supposedly exits. This allows us to directly exercise what we want to in a much more direct way. Furthermore, tests are refactored to not use real listeners anymore. We do not care about whether the listeners work as expected: lifetime wouldn't be controlled by the bootstrapper anyway, but instead by the shutdown function that's provided. Using a mock listener thus allows us to test the effects we want to see on the listeners directly.
2021-12-13Merge branch 'pks-praefect-remove-repository-locking-semantics' into 'master'Pavlo Strokov
praefect: Support new locking semantics in RemoveRepository handler See merge request gitlab-org/gitaly!4187
2021-12-13repository: Always enable locking RenameRepository RPCPatrick Steinhardt
We have been pushing our RPCs which manage repository locations to always use locking semantics on the source repository and, if exists, on the target repository. Like this, we cannot ever have two RPCs which try to create and/or remove a repository location at the same point in time. This allows us to properly use transactional voting for these RPCs. In 0118edab0 (repository: Implement locking for RenameRepository, 2021-12-06), we have introduced locking support for the RenameRepository RPC. This change has been successfull rolled out to production without any issues on December 10th, so the change seems to work as intended. Remove the feature flag and always enable locking semantics for RenameRepository. Changelog: changed
2021-12-13Merge branch 'pks-ff-tx-atomic-repository-creation-default-enable' into 'master'Patrick Steinhardt
featureflag: Enable atomic repository creation by default See merge request gitlab-org/gitaly!4181
2021-12-13featureflag: Enable atomic repository creation by defaultPatrick Steinhardt
With f87bc1e98 (Merge branch 'pks-create-repository-atomic' into 'master', 2021-11-24), we have implemented support for atomic repository creation. With this mode, creation of repositories will use locking and meaningful transactional voting to create repos in a race-free manner. The feature flag guarding the new behaviour has been rolled out on December 6th, and no issues have surfaced since then. But the change is rather high-risk given that it changes semantics of some of the RPCs to return errors in case the target repository exists already, which wasn't the case previously. To be on the safe side, we thus want to keep this feature flag for at least one release such that we have an escape hatch in case there are issues with this change which we haven't yet seen. Default-enable the flag. Changelog: changed
2021-12-13transaction: Wire up transactional voting phasesPatrick Steinhardt
Convert all callsites which perform transactional voting to specify the phase the vote is being cast in. This information is not yet used by Praefect yet, but serves as the baseline to make more informed replication decisions in Praefect's at some point. Ideally, all sites would either use "prepared" or "committed" as their phase. But we have some callsites which don't use proper two-phase voting, which thus cast their votes with "unknown" phase. These will need to get converted eventually.
2021-12-13tests: Convert to use `transaction.TrackingManager`Patrick Steinhardt
Convert tests to use the new `transaction.TrackingManager`, which simply records all votes which have been cast.
2021-12-13voting: Add function to create vote from hex-encoded stringPatrick Steinhardt
We'll need to create votes from hex-encoded strings. Add a helper function which does this for us.
2021-12-13transaction: Introduce tracking managerPatrick Steinhardt
A lot of tests need to create a mock transaction manager to track the transactional votes and assert that we've got the votes we expected. This is quite repetitive and ultimately requires us to change a lot of function signatures if the `transaction.Manager` interface changes. Improve the situation by introducing a new `TrackingManager` which records all votes.
2021-12-11praefect: change replication manager to use field loggerJohn Cai
Passing in an interface makes for a more flexible signature and allows for easier testing with other code.
2021-12-10backup: Replace use of standard contexts with testhelper onesPatrick Steinhardt
Tests should always use the testhelper package to create contexts such that we can benefit from the feature flag sanity checks. Convert backup tests to do so.
2021-12-10backup: Parallelize testsPatrick Steinhardt
Use better parallelization for tests by adding `t.Parallel()` calls as required. Like this, tests drop from 28 seconds of execution time to only 8 seconds.