Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitaly.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorPatrick Steinhardt <psteinhardt@gitlab.com>2021-01-18 12:54:14 +0300
committerPatrick Steinhardt <psteinhardt@gitlab.com>2021-01-18 13:20:01 +0300
commitd023af5566b798c227114c14f1012a4beb868a9c (patch)
treea8edc35e20cbbfb77d54f7ad14fe73e0f4620900
parentcd606a2951187043282701036135cd72193f88d3 (diff)
repository: Fix ReplicateRepository returning before RPCs have finished
When calling `ReplicateRepository()`, we try to both synchronize the gitattributes file as well as the actual repository contents. This is being done in parallel via the `errgroup` package, which allows us to run multiple functions in parallel and then retrieve the first error that was returned by any of them. One feature that is not obvious is that `errgroup.WithContext()` returns a cancellable context which gets cancelled as soon as any of the invoked functions returns an error. We're using this feature right now, which means that if either of the synchronizing functions fails, the other one will get cancelled. While it sounds like a logical thing to do, it does have problems in combination with the gRPC calls. If the context of a gRPC call gets cancelled, then the call will get aborted. But while there is no official documentation proving this, it seems like cancellation happens asynchronously. So while we're cancelling the remote call, we'll return even though the remote side hasn't yet been cancelled. Which effectively means that it may still be running on the remote server after `ReplicateRepository()` has returned on the local client. There doesn't seem to be any way to synchronize on the remote cancellation, so what this commit instead does is to simply not pass the cancellable context of the `errgroup` to those functions. This has the downside of not cancelling RPCs even if we know that they'll fail, or that we don't care for their result. But on the other hand, it fixes above race where we may modify the repository even after the RPC has returned. This also fixes a test race in "TestReplicate/BadRepository".
-rw-r--r--changelogs/unreleased/pks-replicate-repository-cancellation.yml5
-rw-r--r--internal/gitaly/service/repository/replicate.go8
2 files changed, 12 insertions, 1 deletions
diff --git a/changelogs/unreleased/pks-replicate-repository-cancellation.yml b/changelogs/unreleased/pks-replicate-repository-cancellation.yml
new file mode 100644
index 000000000..be729fd24
--- /dev/null
+++ b/changelogs/unreleased/pks-replicate-repository-cancellation.yml
@@ -0,0 +1,5 @@
+---
+title: 'repository: Fix ReplicateRepository returning before RPCs have finished'
+merge_request: 3011
+author:
+type: fixed
diff --git a/internal/gitaly/service/repository/replicate.go b/internal/gitaly/service/repository/replicate.go
index fd0b09de2..20d3ffa74 100644
--- a/internal/gitaly/service/repository/replicate.go
+++ b/internal/gitaly/service/repository/replicate.go
@@ -47,7 +47,13 @@ func (s *server) ReplicateRepository(ctx context.Context, in *gitalypb.Replicate
}
}
- g, ctx := errgroup.WithContext(ctx)
+ // We're not using the context of the errgroup here, as an error
+ // returned by either of the called functions would cancel the
+ // respective other function. Given that we're doing RPC calls in
+ // them, cancellation of the calls would mean that the remote side
+ // may still modify the repository even though the local side has
+ // returned already.
+ g, _ := errgroup.WithContext(ctx)
outgoingCtx := helper.IncomingToOutgoing(ctx)
syncFuncs := []func(context.Context, *gitalypb.ReplicateRepositoryRequest) error{