gitlab.com/gitlab-org/gitaly.git
author Justin Tobler <jtobler@gitlab.com> 2022-10-12 23:20:20 +0300
committer Justin Tobler <jtobler@gitlab.com> 2022-10-13 02:01:34 +0300
commit 714e7c99b0cb930ba08ecd0660834144cebeeabe
tree 52f261e5ada750e236ecb765f2f26caf3ceac772
parent c4efe30d38cc916682aa405228a06c3d7c91caff
Praefect: Update voter state on failed node RPC
branch: jt-praefect-transaction-error-handler
Currently, when a secondary node RPC fails, the error is ignored and the transaction keeps waiting for additional votes, even if there are not enough outstanding votes to reach quorum. If the failed node makes quorum impossible, the transaction hangs until the context is canceled. This is undesirable: once it is established that the outstanding votes cannot reach the threshold required by the transaction, the transaction should be canceled.

This change adapts the `ErrHandler` function of the secondary nodes to cancel the voter in the transaction associated with the failed RPC. In the process, the voter's result state is updated to `VoteCanceled` and the subtransaction is checked to see whether quorum can still be achieved. If quorum is impossible, the vote is failed and the voters are unblocked. If quorum is still possible, the voters remain blocked, waiting for further votes to decide the outcome.
-rw-r--r-- internal/praefect/coordinator.go 15
1 file changed, 11 insertions, 4 deletions
diff --git a/internal/praefect/coordinator.go b/internal/praefect/coordinator.go
index e000c24cf..680ba7eb5 100644
--- a/internal/praefect/coordinator.go
+++ b/internal/praefect/coordinator.go
@@ -467,10 +467,17 @@ func (c *Coordinator) mutatorStreamParameters(ctx context.Context, call grpcCall
 						ctxlogrus.Extract(ctx).WithError(err).
 							Error("proxying to secondary failed")

-						// For now, any errors returned by secondaries are ignored.
-						// This is mostly so that we do not abort transactions which
-						// are ongoing and may succeed even with a subset of
-						// secondaries bailing out.
+						// Cancels failed node's voter in its current subtransaction.
+						// Also updates internal state of subtransaction to fail and
+						// release blocked voters if quorum becomes impossible.
+						if err := c.txMgr.CancelTransactionNodeVoter(transaction.ID(), secondary.Storage); err != nil {
+							ctxlogrus.Extract(ctx).WithError(err).
+								Error("canceling secondary voter failed")
+						}
+
+						// The error is ignored, so we do not abort transactions
+						// which are ongoing and may succeed even with a subset
+						// of secondaries bailing out.
 						return nil
 					},
 				})