gitlab.com/gitlab-org/gitaly.git
author: Sami Hiltunen <shiltunen@gitlab.com> 2021-11-17 16:17:22 +0300
committer: Sami Hiltunen <shiltunen@gitlab.com> 2021-11-17 17:05:51 +0300
commit: 6d569bb696187268b086fecdb5d45440eda7ebc2
tree: 43f95550377b0b7d79d01d80f32f2f910c22db4e
parent: 00071e4ab87eaae7c4f68705613191046cd023be

Materialize valid_primaries view in dataloss query

The dataloss query is extremely slow for bigger datasets. The problem is that for each row the dataloss query returns, Postgres computes the full result of the valid_primaries view only to filter it down to the correct record. This results in O(n^2) complexity, which kills the performance as soon as the dataset size increases. It's not clear why the join parameters are not pushed down into the view in the query.

This commit optimizes the query by materializing the valid_primaries view. This ensures Postgres computes the full view only once and joins with the pre-computed result.

Changelog: performance

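The cost difference described in the commit message can be illustrated with a small Go model. This is only a sketch under simplifying assumptions: `viewRow`, `computeView`, `joinRecomputing`, and `joinMaterialized` are hypothetical names that do not exist in Gitaly, and the loops stand in for what the Postgres planner might do rather than reproducing its actual plans. Note also that the `AS MATERIALIZED` syntax used in the diff requires PostgreSQL 12 or later.

```go
package main

import "fmt"

// viewRow is a hypothetical stand-in for one row of the valid_primaries view.
type viewRow struct {
	repositoryID int
	storage      string
}

// computeView simulates evaluating the whole view. It increments
// rowsComputed once for every view row it produces.
func computeView(n int, rowsComputed *int) []viewRow {
	rows := make([]viewRow, 0, n)
	for id := 0; id < n; id++ {
		rows = append(rows, viewRow{repositoryID: id, storage: fmt.Sprintf("storage-%d", id%3)})
		*rowsComputed++
	}
	return rows
}

// joinRecomputing models the slow behavior: the full view is re-evaluated
// for every outer row and then filtered down to the one matching record.
func joinRecomputing(n int) (matches, rowsComputed int) {
	for id := 0; id < n; id++ {
		for _, row := range computeView(n, &rowsComputed) {
			if row.repositoryID == id {
				matches++
			}
		}
	}
	return matches, rowsComputed
}

// joinMaterialized models the optimized behavior: the view is evaluated
// once, indexed by repository ID, and each outer row probes that result.
func joinMaterialized(n int) (matches, rowsComputed int) {
	byID := make(map[int]viewRow, n)
	for _, row := range computeView(n, &rowsComputed) {
		byID[row.repositoryID] = row
	}
	for id := 0; id < n; id++ {
		if _, ok := byID[id]; ok {
			matches++
		}
	}
	return matches, rowsComputed
}

func main() {
	for _, n := range []int{10, 100, 1000} {
		_, slow := joinRecomputing(n)
		_, fast := joinMaterialized(n)
		fmt.Printf("n=%d view rows computed: recomputing=%d materialized=%d\n", n, slow, fast)
	}
}
```

In this model the recomputing variant produces n*n view rows while the materialized variant produces n, which matches the quadratic versus linear behavior the commit message describes.
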
-rw-r--r-- internal/praefect/datastore/repository_store.go 4
1 file changed, 4 insertions, 0 deletions
diff --git a/internal/praefect/datastore/repository_store.go b/internal/praefect/datastore/repository_store.go
index 9c8dcf30e..270c0b278 100644
--- a/internal/praefect/datastore/repository_store.go
+++ b/internal/praefect/datastore/repository_store.go
@@ -651,6 +651,10 @@ func (rs *PostgresRepositoryStore) GetPartiallyAvailableRepositories(ctx context
// than the assigned ones.
//
rows, err := rs.db.QueryContext(ctx, `
+WITH valid_primaries AS MATERIALIZED (
+ SELECT * FROM valid_primaries
+)
+
SELECT
json_build_object (
'RelativePath', relative_path,