diff options
author | Patrick Steinhardt <psteinhardt@gitlab.com> | 2021-03-02 12:03:32 +0300 |
---|---|---|
committer | Patrick Steinhardt <psteinhardt@gitlab.com> | 2021-03-03 15:22:24 +0300 |
commit | d668ac3a39fa5d517bec2c51cc3f356f66ad9d78 (patch) | |
tree | f3c88d627a539312de518cf8228d504fdd4f26da /internal/praefect/coordinator_test.go | |
parent | 901764b1c11973ce303879c6d28fd5de5299280f (diff) |
coordinator: Allow force-routing specific accessors to primary
Until now, we've been completely agnostic about which accessor RPC we're
routing: they simply all use reads distribution. This works just fine
for most of the RPCs, but leads to inconsistent results for some others
where it's really hard to fix.
Two such RPCs are `RepositorySize` and `GetObjectDiretcorySize`: both of
them depend on the on-disk state of the repository. This is nothing we
have full control of: first, we do not assure that replicas always get
packed at the same point in time. And second, even if we did, we cannot
guarantee that git always ends up with the same set of packfiles. As a
result, the likelihood that replicas would report different sizes is
exceedingly high.
We could probably solve this by implementing some custom logic which
simply routes the RPC to all replicas and then takes the mean of all
returned sizes. But this is needlessly complex, and would require us to
put a lot of RPC-specific behaviour into Praefect. This commit instead
introduces a best-effort strategy of simply always routing both RPCs to
the primary. This may still lead to discrepancies when the primary node
is flapping. But that shouldn't happen too frequently, and when it does
we probably have other problems than repo sizes.
Diffstat (limited to 'internal/praefect/coordinator_test.go')
-rw-r--r-- | internal/praefect/coordinator_test.go | 49 |
1 files changed, 49 insertions, 0 deletions
diff --git a/internal/praefect/coordinator_test.go b/internal/praefect/coordinator_test.go index 5ab5cfe03..5cd7d218b 100644 --- a/internal/praefect/coordinator_test.go +++ b/internal/praefect/coordinator_test.go @@ -578,6 +578,55 @@ func TestCoordinatorStreamDirector_distributesReads(t *testing.T) { require.Zero(t, secondaryChosen, "secondary should never have been chosen") }) + t.Run("forwards accessor to primary for primary-only RPCs", func(t *testing.T) { + var primaryChosen int + var secondaryChosen int + + for i := 0; i < 16; i++ { + frame, err := proto.Marshal(&gitalypb.GetObjectDirectorySizeRequest{Repository: &targetRepo}) + require.NoError(t, err) + + fullMethod := "/gitaly.RepositoryService/GetObjectDirectorySize" + + peeker := &mockPeeker{frame: frame} + + ctx, cancel := testhelper.Context() + defer cancel() + + streamParams, err := coordinator.StreamDirector(ctx, fullMethod, peeker) + require.NoError(t, err) + require.Contains(t, []string{primaryNodeConf.Address, secondaryNodeConf.Address}, streamParams.Primary().Conn.Target(), "must be redirected to primary or secondary") + + var nodeConf config.Node + switch streamParams.Primary().Conn.Target() { + case primaryNodeConf.Address: + nodeConf = primaryNodeConf + primaryChosen++ + case secondaryNodeConf.Address: + nodeConf = secondaryNodeConf + secondaryChosen++ + } + + md, ok := metadata.FromOutgoingContext(streamParams.Primary().Ctx) + require.True(t, ok) + require.Contains(t, md, praefect_metadata.PraefectMetadataKey) + + mi, err := coordinator.registry.LookupMethod(fullMethod) + require.NoError(t, err) + require.Equal(t, protoregistry.OpAccessor, mi.Operation, "method must be an accessor") + + m, err := protoMessage(mi, streamParams.Primary().Msg) + require.NoError(t, err) + + rewrittenTargetRepo, err := mi.TargetRepo(m) + require.NoError(t, err) + require.Equal(t, nodeConf.Storage, rewrittenTargetRepo.GetStorageName(), "stream director must rewrite the storage name") + } + + require.Equal(t, 16, primaryChosen, "primary should have always been chosen") + require.Zero(t, secondaryChosen, "secondary should never have been chosen") + }) + t.Run("forwards accessor operations only to healthy nodes", func(t *testing.T) { healthSrv.SetServingStatus("", grpc_health_v1.HealthCheckResponse_NOT_SERVING) |