gitlab.com/gitlab-org/gitaly.git
author Patrick Steinhardt <psteinhardt@gitlab.com> 2022-06-08 16:33:12 +0300
committer Patrick Steinhardt <psteinhardt@gitlab.com> 2022-06-08 16:50:59 +0300
commit 8e94bb2baee1634525a4648342a9d20efb4ca608
tree b195276aef4837fe4966ca22357920865e8aca2c
parent f099614e635d05483055ba6fbebc74d961bf2ce5
datastore: Fix migration that prunes maintenance-style replication jobs
branch: pks-praefect-fix-remove-maintenance-jobs-migration
In 41ca8534e (praefect: Prune stale maintenance-style replication events, 2022-05-20), we introduced a migration that prunes old maintenance-style replication jobs from our replication queue. That migration was a bit too naive though: it assumed that we had already mostly drained the replication queue of such jobs and that there weren't any stale locks referencing them.

Naturally, the real world shows that such stale locks exist. And because these locks have a foreign-key constraint on the job without cascading deletes, we fail to prune any maintenance-style job that is referenced by a lock.

Fix this issue by deleting job locks before deleting the jobs. Also, handle the case where there are acquired job locks that should be unlocked because all of their jobs have been removed.

Changelog: fixed
-rw-r--r-- internal/praefect/datastore/migrations/20220520083313_remove_maintenance_replication_events.go | 63
1 files changed, 52 insertions, 11 deletions
diff --git a/internal/praefect/datastore/migrations/20220520083313_remove_maintenance_replication_events.go b/internal/praefect/datastore/migrations/20220520083313_remove_maintenance_replication_events.go
index 09330c942..21a95dd49 100644
--- a/internal/praefect/datastore/migrations/20220520083313_remove_maintenance_replication_events.go
+++ b/internal/praefect/datastore/migrations/20220520083313_remove_maintenance_replication_events.go
@@ -6,17 +6,58 @@ func init() {
m := &migrate.Migration{
Id: "20220520083313_remove_maintenance_replication_events",
Up: []string{
- `DELETE FROM replication_queue WHERE job->>'change' IN (
- 'gc',
- 'repack_full',
- 'repack_incremental',
- 'cleanup',
- 'pack_refs',
- 'write_commit_graph',
- 'midx_repack',
- 'optimize_repository',
- 'prune_unreachable_objects'
- )`,
+ `
+-- Find all jobs which are maintenance-style jobs first.
+WITH maintenance_job AS (
+ SELECT id FROM replication_queue WHERE job->>'change' IN (
+ 'gc',
+ 'repack_full',
+ 'repack_incremental',
+ 'cleanup',
+ 'pack_refs',
+ 'write_commit_graph',
+ 'midx_repack',
+ 'optimize_repository',
+ 'prune_unreachable_objects'
+ )
+),
+
+-- Now we have to prune the job locks before deleting the maintenance job
+-- itself because the lock has a reference on the job.
+deleted_maintenance_job_lock AS (
+ DELETE FROM replication_queue_job_lock
+ WHERE job_id IN (SELECT id FROM maintenance_job)
+ RETURNING lock_id
+),
+
+-- With job locks having been deleted we can now delete the maintenance jobs.
+deleted_maintenance_job AS (
+ DELETE FROM replication_queue
+ WHERE id IN (SELECT id FROM maintenance_job)
+)
+
+-- Finally, we need to release replication queue locks in case we have removed
+-- all jobs which kept the lock.
+UPDATE replication_queue_lock
+SET acquired = FALSE
+WHERE id IN (
+ SELECT existing.lock_id
+ -- We do so by unlocking all locks where the count of deleted job locks is
+ -- the same as the count of existing job locks. If there happen to be any
+ -- other jobs we haven't deleted then the lock stays acquired. This is the
+ -- same logic as used in AcknowledgeStale.
+ FROM (
+ SELECT lock_id, COUNT(*) AS amount
+ FROM deleted_maintenance_job_lock
+ GROUP BY lock_id
+ ) AS removed
+ JOIN (
+ SELECT lock_id, COUNT(*) AS amount
+ FROM replication_queue_job_lock
+ WHERE lock_id IN (select lock_id FROM deleted_maintenance_job_lock)
+ GROUP BY lock_id
+ ) AS existing ON removed.lock_id = existing.lock_id AND removed.amount = existing.amount
+)`,
},
Down: []string{
// We cannot get this data back anymore.