Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitaly.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-10-11WIP: Demonstrate failure in quarantined testtoon-quarantine-testsToon Claes
This change should not be merged.
2021-10-11datastore: Quarantine test in TestStorageCleanupToon Claes
The test case TestStorageCleanup_AcquireNextStorage is flaky. To stop it from blocking deploys, put it in quarantine. Issue: https://gitlab.com/gitlab-org/gitaly/-/issues/3828
2021-10-11ci: Remove lint jobToon Claes
The verify job runs, amongst other things, the lint target in the Makefile. So the job to run `make lint` is redundant, therefore this change removes the lint job.
2021-10-11ci: Add job to run quarantined testsToon Claes
This change adds a jobs that also runs the quarantined tests, but the job is marked to allow failures so it doesn't block deploys when the pipeline on master is red due to flaky tests.
2021-10-11ci: Move junit report artifact to test_templateToon Claes
Specify the location of the junit report in the test_template anchor so every job using the anchor has it.
2021-10-11ci: Default image uses default versionsToon Claes
The CI yaml file specifies default versions for tools like Ruby and Golang. But the default image still uses older versions of these tools. This change uses the default versions of these tools in the default image. This removes the need the need to specify the image in many jobs.
2021-10-11ci: Remove junit report from artifactsToon Claes
The junit report is mentioned as normal artifact *and* report. This is not needed, so remove it from the regular artifacts.
2021-10-11testhelper: Add function to quarantine a testToon Claes
We are having issues with some tests being flaky and blocking deploys because the pipeline on master is failing. This annoying, but often it's not easy to quickly fix the flaky test. This change allows us to put flaky tests in quarantine. The quarantined tests are run in a separate job, but that job is marked to allow to fail. This allows us to make the pipeline green again quickly, but still have the tests running.
2021-10-11Merge branch 'pks-git-mandatory-v2.33.0' into 'master'Patrick Steinhardt
git: Bump minimum required Git version to v2.33.0 Closes #3815 See merge request gitlab-org/gitaly!3931
2021-10-11Merge branch 'pks-testhelper-gitaly-health-checks' into 'master'Toon Claes
testserver: Fix Praefect resource exhaustion and assert Gitaly's healthiness See merge request gitlab-org/gitaly!3945
2021-10-11git: Always assume we can handle object type filtersPatrick Steinhardt
Now that the minimum required Git version is v2.33.0 and newer, we can assume that git-rev-list(1) can always handle object type filters. Drop `SupportsObjectTypeFilter()` and adjust callsites accordingly. Given that we now always set up those filters unconditionally, this commit also gets rid of any additional pipeline steps which verify that the current object's type is indeed the expected type. Note that in some cases, we can get rid of the `CatfileInfo()` step completely because we already know the type of the object, which we could thus directly pass to the `CatfileObject()` pipeline step. This refactoring is deferred to a later point in time though such that pending refactorings for the catfile cache can land first.
2021-10-11git: Bump minimum required Git version to v2.33.0Patrick Steinhardt
Recently, we have seen a lot of issues with git-cat-file(1) due to inefficiencies. In many cases, we can improve the code to elide additional filtering by using new object type filters which have been introduced for git-rev-list(1) in Git v2.33.0, which in some cases may even give us the ability to drop some pipeline steps altogether. Bump the minimum required Git version to v2.33.0 such that we can always make use of these object type filters. Changelog: deprecated
2021-10-11git: Fix test for FlushesUpdaterefStatus exercising wrong functionPatrick Steinhardt
Tests for `FlushesUpdaterefStatus()` exercise the wrong function by accident. Fix them to test the correct function.
2021-10-11test-boot: Set up Git binary pathPatrick Steinhardt
We use the test-boot script to determine whether the Gitaly server successfully boots up with a given Gitaly configuration. This config currently not contain the path to the Git binary though, which is why it will instead pick up the system-provided Git binary. This wouldn't typically be the correct version though, given that we'd likely want to boot Gitaly with the Git executable produced by `make git`. The result is that Gitaly would refuse to boot if the system-provided version of Git does not meet the minimum required version. Fix the issue by setting the Git binary path.
2021-10-11ci: Drop tests for older Git versionsPatrick Steinhardt
We're about to drop support for Git versions older than v2.33.0. Drop tests for these versions from the CI definitions.
2021-10-11Merge branch 'qmnguyen0711/enable-praefect-sidechannel-test' into 'master'James Fargher
Enable Praefect in PostUploadPackWithSidechannel tests See merge request gitlab-org/gitaly!3938
2021-10-11Merge branch 'ps-repocleaner-test-enhance' into 'master'James Fargher
test: Refactor for better readability See merge request gitlab-org/gitaly!3932
2021-10-08testserver: Merge functions which start Gitaly serverPatrick Steinhardt
Now that we always assert that Gitaly is healthy before returning from both `RunGitalyServer()` and `StartGitalyServer()`, the implementations of both functions are exactly the same, except that one returns only the address of the started server while the other returns a structure. Merge both implementations.
2021-10-08testserver: Always assert that Gitaly can serve requestsPatrick Steinhardt
When starting Gitaly with Praefect, then we always wait until Gitaly becomes healthy before we return from the function such that Praefect can correctly route information to it. It does make sense to wait for Gitaly to be healthy in the general case though such that we don't ever run with Gitaly not yet being able to server requests. Improve the situation by always asserting that Gitaly is healthy when `RunGitalyServer()` or `StartGitalyServer()` returns. This is likely to fix some flaky tests where we have seen Gitaly to not be able to serve requests.
2021-10-08testserver: Move `StartGitalyServer()` next to `RunGitalyServer()`Patrick Steinhardt
Both `StartGitalyServer()` and `RunGitalyServer()` essentially do the same thing -- the only thing that's differen tis that one returns the address of the started server, while the other one returns a `GitalyServer` structure. Move them next to each other to group them together. This is a preparatory patch towards unifying both implementations.
2021-10-08testserver: Clean up Praefect proxy if it fails to become healthyPatrick Steinhardt
The Praefect proxy is started via a separate executable which we'll automatically inject when a certain environment variable is set. If the Praefect proxy fails to come up and become healthy though, we leak this executable because we don't kill it if the test fails. As a result, there is a misleading panic at the end of the test run hinting at a leaked process, while the real issue is that Praefect failed to become healthy at all. Fix this by queueing shutdown of the process before checking for Praefect to become healthy.
2021-10-08gitaly-ssh: Inject test certificate into system storePatrick Steinhardt
One of the connectivity tests for gitaly-ssh uses a TLS-based setup to assert that we're able to connect to them. The self-signed TLS certificate we're using for this setup keeps us from connecting to the Gitaly server directly. While we don't do that right now in the first place, it will break as soon as we start to do a health check for test Gitaly servers because we cannot connect to it in the first place. Fix this issue by injecting self-signed certificates into the system store. While at it, generate certs on the fly instead of using pre-seeded test data.
2021-10-08testserver: Use "client" package to dial health serverPatrick Steinhardt
We're using the grpc package directly to dial to the health server when waiting for it to become healthy. This dialer doesn't know to set up various things for us: it doesn't understand the "tcp://" and "tls://" schemas, it won't handle setup of transport credentials and doesn't set up interceptors. This has worked fine until now because we only used it to check the Praefect server's health, but it will fail as soon as we want to also check the Gitaly server's health. Use our own "client" package to dial the health service such that we can extend support for `waitHealthy()` to Gitaly servers.
2021-10-08testserver: Drop retry when waiting for server to become healthyPatrick Steinhardt
When executing `waitHealthy()`, then we retry connecting to the health client, where each retry is limited to 1 second. This shouldn't be necessary though: - `grpc.WaitForReady()` ensures that we retry connecting to the service until we're either successful or until the context times out. It's thus useless to re-dial. - When instantiating the health server, it will always be initialized in "serving" state for the default service. Furthermore, we only ever check for the default service and never modify the serving status of the server. As such, if we ever receive an answer from the server, then it must be that it is serving. It's thus useless to retry the RPC call. In combination, it means that we do not need to loop at all in `waitHealthy()`. Simplify the function to not loop, but instead use a 3 second timeout both for dialing the server and executing the health check.
2021-10-08testserver: Drop parameters for `waitHealthy()`Patrick Steinhardt
We always pass the same timeout parameters to `waitHealthy()`, so there is not much of a point to keep them. Hardcode timeouts in `waitHealthy()` itself and get rid of these parameters.
2021-10-08testserver: Fix resource exhaustion with too many Praefect instancesPatrick Steinhardt
When using the "test-with-praefect" target, then most tests will fork an extra Praefect instance which proxies connections to Gitaly. With our recent push to use `t.Parallel()`, this means that there are now potentially hundreds of concurrent Praefect instances created which both consume a lot of memory and exhaust the connection pool of the Postgres database. The result is out-of-memory situations and tests failing because of too many clients connected to Postgres. Fix this by limiting concurrency to at most 16 instances of Praefect.
2021-10-08Merge branch 'pks-supervisor-synchronize-kill' into 'master'Toon Claes
supervisor: Close writers before signalling process is done See merge request gitlab-org/gitaly!3940
2021-10-08supervisor: Close writers before signalling process is donePatrick Steinhardt
When a supervised process is going down, then we need to close its associated stdout and stderr loggers. This is done after we signal to the caller that the process is done though by closing the wait channel, which in turn means that any Goroutines spawned by the writers may still be active at the time we're exiting. Fix this by instead deferring the close. While this fixes the issue, it also avoids any deadlocks due to an unclosed wait channel in case any of the invoked function calls panics.
2021-10-08supervisor: Drop global test statePatrick Steinhardt
Supervisor tests require a test server executable to have been built, which is done in the global test suite configuration. This results in global state we don't really want to have and leaves behind temporary directories. Refactor the code to build the test server on demand using `BuildBinary()`. The end result is both cleaner test setup and faster tests given that we can now run tests in parallel.
2021-10-08testhelper: Expose helper to build Go executablesPatrick Steinhardt
The `buildBinary()` helper handles building of Go executables, where it makes sure to build each executable once only. This function is used to build a set of Gitaly executables which we expect to be present. But in fact, it's more generally useful to also build for example Go test servers. Make the function public such that it can be reused in other packages.
2021-10-08testhelper: Reject trying to build same binary multiple timesPatrick Steinhardt
We allow for the case where a test tries to build a binary with the same target path multiple times. This is bound to be problematic though: if we ever were to blindly parallelize these tests, then we'd potentially copy over the target path while it's still being executed, which would likely lead to ETXTBSY errors. It's furthermore an antipattern to reuse the same configuration for multiple test runs due to potentially shared state. Fix all tests which do this. Next to a few tests which accidentally build gitaly-hooks multiple times, the most common case is the "praefect" binary which gets built automatically whenever we run a new Gitaly server. Add a safeguard which fails the test as soon as it tries to build the same binary twice.
2021-10-08testhelper: Refactor building of binaries for improved readabilityPatrick Steinhardt
The testhelper function to build binaries is a bit hard to read. One of the sources is the variable names which point to the shared binary paths, which are rather generically named and thus don't tell what these really are about. And the other confusing part is the deferred function call, which isn't really necessary at all. Improve readability by renaming the variables and inlining the deferred call.
2021-10-08Merge branch 'ps-cmd-list-missing' into 'master'Pavlo Strokov
list-untracked-repositories: Praefect sub-command to show untracked repositories Closes #3792 See merge request gitlab-org/gitaly!3926
2021-10-08test: Database name extraction is not neededPavlo Strokov
Code simplified to use a field of the glsql.DB struct instead of extracting database name directly with SQL.
2021-10-08repocleaner: Improve test codePavlo Strokov
The test code is hard to read and understand. The change simplifies things and provides additional comments into the test code with explanations of what and why is done.
2021-10-08prometheus: Avoid duplicated metrics registrationPavlo Strokov
Praefect uses prometheus to export metrics from inside. It relies on the defaults from the prometheus library to gather set of metrics and register a new metrics. Because of it the new metrics got registered on the DefaultRegisterer - a global pre-configured registerer. Because of that we can't call 'run' function multiple times (for testing purposes) as it results to the metrics registration error. To omit that problem the 'run' function extended with prometheus.Registerer parameter that is used to register praefect custom metrics. The production code still uses the same DefaultRegisterer as it was before. And the test code creates a new instance of the registerer for each 'run' invocation, so there are no more duplicates.
2021-10-08bootstrap: Abstract bootstrapper for testingPavlo Strokov
The old implementation of the bootstrapper initialization does not allow calling the 'run' function to start a service because the tableflip library doesn't support multiple instances to be created for one process. Starting the Praefect service is required in tests to verify sub-command execution. The bootstrapper initialization extracted out of 'run' function. It allows using a new Noop bootstrapper to run service without tableflip support.
2021-10-08remove-repository: Reduce code duplicationPavlo Strokov
Let's reuse the repository creation function introduced with the list-untracked-repositories Praefect sub-command in a previous commit. This function uses the test name in the repository relative path. This can avoid possible side effects from one test to affect other tests.
2021-10-08list-untracked-repositories: New praefect sub-commandPavlo Strokov
Praefect extended with a new sub-command 'list-untracked-repositories'. It walks over all repositories of all accessible gitaly nodes and checks if repositories exist in the praefect database. If repository doesn't exist in the database the information about its location will be printed on STDOUT in JSON format terminated with a newline (it is configurable with --delimiter flag). STDERR is used to write log messages and errors if any. Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3792 Changelog: added
2021-10-08list-untracked-repositories: Change callback of ExecOnRepositories to return ↵Pavlo Strokov
an error The ExecOnRepositories method accepts a callback function that is called once the batch is ready for the processing. The signature of the callback function changed to return an error. If the error occurs the ExecOnRepositories stops iteration over the repositories and returns an error back to the caller. That change is a preparation for the new 'list-untracked-repositories' sub-command.
2021-10-08list-untracked-repositories: Extract ExecOnRepositories method for re-usePavlo Strokov
The ExecOnRepositories method is extracted into a new Walker type. It will be re-used for the new 'list-untracked-repositories' sub-command.
2021-10-08testhelper: Simplify infra to build commands oncePatrick Steinhardt
In order to not have to repeatedly rebuild the same executables during test setups, the testhelper package provides a bunch of functions which build executables via `sync.Once` and then copy them into place. One shortcoming though is that we have a separate `sync.Once` for each of the binaries though, which is easy to get wrong. Refactor the code to use a `sync.Map` containing `sync.Once`s keyed by source path. Like this, we can create them ad-hoc in a raceless way and don't need to manage them globally.
2021-10-08Merge branch 'jc-fix-cache-test' into 'master'Patrick Steinhardt
Fail Read if objectReader is closed Closes #3823 See merge request gitlab-org/gitaly!3944
2021-10-08Reset buffered reader upon closeJohn Cai
A flaky test exposed the issue whereby an object reader can still be read from after it is closed. This is because we are using a buffered reader for reading the data from the command, so there could be some buffered data that is still readable even though the objectReader is closed. The fix is to check if the objectReader has been closed in the Read function, and if so to return an error immediately.
2021-10-07Merge branch 'ps-fix-flaky-cache-clean' into 'master'John Cai
streamcache: Fix flaky TestCache_diskCleanup Closes #3822 See merge request gitlab-org/gitaly!3942
2021-10-07streamcache: Fix flaky TestCache_diskCleanupPavlo Strokov
Because implementation was used single sleep function for both filestore cleanup and clean function invocation the close of the channel to trigger "single" invocation of the clean was broken. This is due to close will make channel to return default results immediately. But for the test we need to run it only once. We can't send two elements on the single channel as we can't be sure both won't be consumed by one of those functions. That is why we add separate sleep function for each of those functions. By sending on each channel we are sure each function will be triggered exactly once.
2021-10-07Merge branch 'pks-testhelper-goroutine-leakage' into 'master'Pavlo Strokov
testhelper: Enable Goroutine leak checks by default Closes #3790 See merge request gitlab-org/gitaly!3909
2021-10-07Merge branch 'ps-fix-migration' into 'master'Toon Claes
sql-migrate: Update storage_repositories table Closes #3806 See merge request gitlab-org/gitaly!3927
2021-10-07testhelper: Enable Goroutine leak checks by defaultPatrick Steinhardt
In a recent incident, a refactoring of the catfile cache has caused an incident due to leaking Goroutines. While we did already check for leaking Goroutines in a subset of our test suite before via the goleak check, this was opt-in the same way as `mustHaveNoChildProcess()` was. It also was used a lot less often given that we did have hundreds of actual Goroutine leaks in our test suite. Introduce leak checking by default now that almost all of the Goroutine leaks have been fixed. There is only two packages remaining which still leak Goroutines: the catfile package, which will get fixed via an MR which is currently in flight. And the backup package, which is leaking Goroutines in a package external to us. To the best of my knowledge, there is no way to plug those. In any case, having on-by-default leak checking should put us in a much better position to detect such regressions in the future.
2021-10-07global: Close gRPC connectionsPatrick Steinhardt
When creating gRPC connections, then we spawn a set of Goroutines which listen on these connections. As a result, if they are never closed, those Goroutines are leaked. Fix this by closing connections.