gitlab.com/gitlab-org/gitaly.git
author Sami Hiltunen <shiltunen@gitlab.com> 2021-09-10 10:09:42 +0300
committer Sami Hiltunen <shiltunen@gitlab.com> 2021-10-12 11:19:54 +0300
commit d82c25045eb43700063f6b2c3e2967af3778835c
tree f8eaf3b73780b3def194ad72a3f8b9460575b72a /cmd
parent 03de94f491d7e01e161294e159a5bc3c2f84d9eb
Remove replica records when deleting a repository
Praefect currently identifies stale replicas of deleted repositories by the replicas having entries in 'storage_repositories' when there's no corresponding entry for the repository in the 'repositories' table. Prior to having database records for every repository with the 'per_repository' election strategy, this was necessary to differentiate replicas of deleted repositories from repositories without database records. This then allowed Praefect to retry deleting stale replicas by rescheduling the deletion jobs in the reconciler.

Nowadays, Praefect expects to have database records for every repository that should be present on the cluster. Given that, there's no longer any problem differentiating repositories without database records from stale replicas of deleted repositories. If a repository doesn't have a database entry, it's eligible for deletion.

There can also be stale repositories on the disks of the Gitalys due to failed operations. For example, repository creation could succeed far enough to create the repository but fail to acknowledge the creation to Praefect. Such a repository would not be recorded as existing, and should be deleted from the disks. In these cases, Praefect wouldn't have database records for the stale replicas.

Both of these cases can be handled by a clean up crawler that is being implemented in Praefect. The crawler will walk the disks of the Gitaly nodes and identify repositories which do not have database records.

This allows us to simplify our database model. When a repository is deleted, all of its replica records can be deleted in the same transaction. This commit implements this by cascading the delete from the 'repositories' table to the 'storage_repositories' table. 'DeleteRepository' is updated to delete just the records in 'repositories', which then cascades to the 'storage_repositories' table. The method is updated to return the path where the replicas are stored and the storages which are known to hold a replica. This allows a later commit to remove replicas from the storages immediately so the cluster remains clean even between the clean up crawler's runs.

The schema still allows orphaned records in 'storage_repositories'. In a later release, we can remove the remaining orphaned records and enforce that all 'storage_repositories' records have a repository_id set.
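
The cascading behaviour described in the message can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not Praefect's implementation: `memoryStore`, `createRepository`, `errRepositoryNotFound`, and the record types are hypothetical names, the real store is backed by Postgres and takes a context, and the replica path is assumed here to equal the relative path. What it shows is the shape of the new contract: deleting the 'repositories' record removes the matching 'storage_repositories' records in the same step, and the call returns the replica path together with the storages that held a replica.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// errRepositoryNotFound is a hypothetical sentinel error for this sketch.
var errRepositoryNotFound = errors.New("repository not found")

// repositoryKey identifies a repository within a virtual storage.
type repositoryKey struct {
	virtualStorage string
	relativePath   string
}

// repositoryRecord stands in for a row in the 'repositories' table.
type repositoryRecord struct {
	id          int64
	replicaPath string // assumed equal to the relative path in this sketch
}

// replicaRecord stands in for a row in the 'storage_repositories' table.
type replicaRecord struct {
	repositoryID int64
	storage      string
}

// memoryStore is an in-memory stand-in for the two tables.
type memoryStore struct {
	nextID       int64
	repositories map[repositoryKey]repositoryRecord
	replicas     []replicaRecord
}

func newMemoryStore() *memoryStore {
	return &memoryStore{repositories: map[repositoryKey]repositoryRecord{}}
}

// createRepository records a repository and one replica per given storage.
func (s *memoryStore) createRepository(virtualStorage, relativePath string, storages ...string) {
	s.nextID++
	key := repositoryKey{virtualStorage, relativePath}
	s.repositories[key] = repositoryRecord{id: s.nextID, replicaPath: relativePath}
	for _, storage := range storages {
		s.replicas = append(s.replicas, replicaRecord{repositoryID: s.nextID, storage: storage})
	}
}

// DeleteRepository removes the repository record and cascades the delete to
// its replica records. It returns the replica path and the sorted names of
// the storages that were known to hold a replica.
func (s *memoryStore) DeleteRepository(virtualStorage, relativePath string) (string, []string, error) {
	key := repositoryKey{virtualStorage, relativePath}
	repo, ok := s.repositories[key]
	if !ok {
		return "", nil, errRepositoryNotFound
	}
	delete(s.repositories, key)

	// Cascade: drop every replica record that references the deleted repository.
	var storages []string
	kept := s.replicas[:0]
	for _, r := range s.replicas {
		if r.repositoryID == repo.id {
			storages = append(storages, r.storage)
			continue
		}
		kept = append(kept, r)
	}
	s.replicas = kept

	sort.Strings(storages)
	return repo.replicaPath, storages, nil
}

func main() {
	s := newMemoryStore()
	s.createRepository("default", "repo-a.git", "gitaly-2", "gitaly-1")
	s.createRepository("default", "repo-b.git", "gitaly-1")

	path, storages, err := s.DeleteRepository("default", "repo-a.git")
	fmt.Println(path, storages, err) // repo-a.git [gitaly-1 gitaly-2] <nil>
	fmt.Println(len(s.replicas))     // 1: the replica of repo-b.git is untouched
}
```

A caller that only cares about the records being gone can discard the first two return values, which is what the updated test in this commit does.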
Diffstat (limited to 'cmd')
-rw-r--r-- cmd/praefect/subcmd_remove_repository_test.go 5
1 files changed, 2 insertions, 3 deletions
diff --git a/cmd/praefect/subcmd_remove_repository_test.go b/cmd/praefect/subcmd_remove_repository_test.go
index 1953c90a0..519ab4764 100644
--- a/cmd/praefect/subcmd_remove_repository_test.go
+++ b/cmd/praefect/subcmd_remove_repository_test.go
@@ -144,9 +144,8 @@ func TestRemoveRepository_Exec(t *testing.T) {
 	t.Run("no info about repository on praefect", func(t *testing.T) {
 		repo := createRepo(t, ctx, repoClient, praefectStorage, t.Name())
 		repoStore := datastore.NewPostgresRepositoryStore(db.DB, nil)
-		require.NoError(t, repoStore.DeleteRepository(
-			ctx, repo.StorageName, repo.RelativePath, []string{g1Cfg.Storages[0].Name, g2Cfg.Storages[0].Name},
-		))
+		_, _, err := repoStore.DeleteRepository(ctx, repo.StorageName, repo.RelativePath)
+		require.NoError(t, err)
 		logger := testhelper.NewTestLogger(t)
 		loggerHook := test.NewLocal(logger)