Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitaly.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorSami Hiltunen <shiltunen@gitlab.com>2021-11-17 16:17:22 +0300
committerSami Hiltunen <shiltunen@gitlab.com>2021-11-17 16:17:22 +0300
commitf505f80f3548d8e6e06f97125a18fb49a41b848f (patch)
treed101981d77546f487bef5b1898226ab1b304efa3
parentaa413c7afbc868f21c0c97da9d46ac040dfdc4c6 (diff)
Materialize valid_primaries view in dataloss query
The dataloss query is extremely slow for bigger datasets. The problem is that for each row that the data loss query is returning, Postgres computes the full result of the valid_primaries view only to filter down to the correct record. This results in an o(n2) complexity which kills the performance as soon as the dataset size increases. It's not clear why the join parameters are not pushed down in to the view in the query. This commit optimizes the query by materializing the valid_primaries view. This ensures Postgres computes the full view only once and joins with the pre-computed result. Changelog: performance
-rw-r--r--internal/praefect/datastore/repository_store.go4
1 files changed, 4 insertions, 0 deletions
diff --git a/internal/praefect/datastore/repository_store.go b/internal/praefect/datastore/repository_store.go
index 278f4e3ea..48d4814bc 100644
--- a/internal/praefect/datastore/repository_store.go
+++ b/internal/praefect/datastore/repository_store.go
@@ -589,6 +589,10 @@ func (rs *PostgresRepositoryStore) GetPartiallyAvailableRepositories(ctx context
// than the assigned ones.
//
rows, err := rs.db.QueryContext(ctx, `
+WITH valid_primaries AS MATERIALIZED (
+ SELECT * FROM valid_primaries
+)
+
SELECT
json_build_object (
'RelativePath', relative_path,