diff options
author | Sami Hiltunen <shiltunen@gitlab.com> | 2023-05-24 15:23:18 +0300 |
---|---|---|
committer | Sami Hiltunen <shiltunen@gitlab.com> | 2023-05-24 15:38:46 +0300 |
commit | e9a888be7035afa5d5854110801cdb42b1411e3f (patch) | |
tree | c28a2abf308b9658628e4d698f6f78fc0bb23cf1 /internal/gitaly/transaction_manager.go | |
parent | f7c4eb67723463626eb51176da743359cefe9012 (diff) |
Ensure TransactionManager has applied all log entries before assertions
There's a race currently in the TransactionManager tests where we're
not always guaranteed to have all state applied on the disk before
asserting it if the repository is being deleted by a concurrent
transaction.
Generally this is guaranteed due to transactions being processed one
by one. The `checkManagerError` is queuing a failing transaction to
see if the manager picks it for processing. When it does, the manager
would have already applied all preceding log entries.
Repository deletions are different as we can't delete the repository
before all transactions with a pre-deletion snapshot are done as
otherwise we'd delete the data they access. To avoid a dead lock where
the manager is waiting for the transactions to finish and the transactions
are waiting for the manager to admit them while it is stuck applying the
deletion, the applyRepositoryDeletion is actively rejecting transactions
that are waiting to commit when it is applying a deletion. Once all have
been rejected, the deletion is applied.
The above logic is the source of the race in the tests. The test transaction
queued in `checkManagerError` is rejected before the deletion is applied.
This unblocks the main test goroutine which then proceeds to run the
assertions on the disk state before the deletion is fully applied.
Fix this race by always beginning a new transaction if the manager is
running in `checkManagerError`. Transaction's must wait for all committed
data to be applied before they can begin, beginning and rolling back a
transaction works to ensure all committed data has been applied.
Transactions beginning in a non-existent repository were previously early
exiting without waiting for the logged data to be applied. This was fine
as if the repository doesn't exist, there's no data to be applied and
returning a not found error early was fine. This commit removes the special
casing so we also wait for the deletion to be applied before returning from
Begin. This leads to no external behavior change but allows us to ensure
the log has been applied by starting a transaction.
Diffstat (limited to 'internal/gitaly/transaction_manager.go')
-rw-r--r-- | internal/gitaly/transaction_manager.go | 9 |
1 files changed, 5 insertions, 4 deletions
diff --git a/internal/gitaly/transaction_manager.go b/internal/gitaly/transaction_manager.go index 4e2060dcf..5862cd5d2 100644 --- a/internal/gitaly/transaction_manager.go +++ b/internal/gitaly/transaction_manager.go @@ -199,10 +199,6 @@ func (mgr *TransactionManager) Begin(ctx context.Context) (_ *Transaction, retur } mgr.mutex.Lock() - if !mgr.repositoryExists { - mgr.mutex.Unlock() - return nil, ErrRepositoryNotFound - } txn := &Transaction{ commit: mgr.commit, @@ -224,6 +220,7 @@ func (mgr *TransactionManager) Begin(ctx context.Context) (_ *Transaction, retur openTransactionElement := mgr.openTransactions.PushBack(txn) readReady := mgr.applyNotifications[txn.snapshot.ReadIndex] + repositoryExists := mgr.repositoryExists mgr.mutex.Unlock() if readReady == nil { // The snapshot log entry is already applied if there is no notification channel for it. @@ -272,6 +269,10 @@ func (mgr *TransactionManager) Begin(ctx context.Context) (_ *Transaction, retur case <-mgr.ctx.Done(): return nil, ErrTransactionProcessingStopped case <-readReady: + if !repositoryExists { + return nil, ErrRepositoryNotFound + } + return txn, nil } } |