Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitlab-foss.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
Diffstat (limited to 'doc/development/namespaces_storage_statistics.md')
-rw-r--r--doc/development/namespaces_storage_statistics.md18
1 files changed, 9 insertions, 9 deletions
diff --git a/doc/development/namespaces_storage_statistics.md b/doc/development/namespaces_storage_statistics.md
index 2c7e5935435..f8aea10097d 100644
--- a/doc/development/namespaces_storage_statistics.md
+++ b/doc/development/namespaces_storage_statistics.md
@@ -14,18 +14,18 @@ storage consumed by a group, and allow easy management.
## Problem
In GitLab, we update the project storage statistics through a
-[callback](https://gitlab.com/gitlab-org/gitlab-ce/blob/v12.2.0.pre/app/models/project.rb#L90)
+[callback](https://gitlab.com/gitlab-org/gitlab-foss/blob/v12.2.0.pre/app/models/project.rb#L90)
every time the project is saved.
The summary of those statistics per namespace is then retrieved
-by [`Namespaces#with_statistics`](https://gitlab.com/gitlab-org/gitlab-ce/blob/v12.2.0.pre/app/models/namespace.rb#L70) scope. Analyzing this query we noticed that:
+by [`Namespaces#with_statistics`](https://gitlab.com/gitlab-org/gitlab-foss/blob/v12.2.0.pre/app/models/namespace.rb#L70) scope. Analyzing this query we noticed that:
- It takes up to `1.2` seconds for namespaces with over `15k` projects.
- It can't be analyzed with [ChatOps](chatops_on_gitlabcom.md), as it times out.
Additionally, the pattern that is currently used to update the project statistics
(the callback) doesn't scale adequately. It is currently one of the largest
-[database queries transactions on production](https://gitlab.com/gitlab-org/gitlab-ce/issues/62488)
+[database queries transactions on production](https://gitlab.com/gitlab-org/gitlab-foss/issues/62488)
that takes the most time overall. We can't add one more query to it as
it will increase the transaction's length.
@@ -131,7 +131,7 @@ WHERE namespace_id IN (
Even though this approach would make aggregating much easier, it has some major downsides:
-- We'd have to migrate **all namespaces** by adding and filling a new column. Because of the size of the table, dealing with time/cost will not be great. The background migration will take approximately `153h`, see <https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/29772>.
+- We'd have to migrate **all namespaces** by adding and filling a new column. Because of the size of the table, dealing with time/cost will not be great. The background migration will take approximately `153h`, see <https://gitlab.com/gitlab-org/gitlab-foss/merge_requests/29772>.
- Background migration has to be shipped one release before, delaying the functionality by another milestone.
### Attempt E (final): Update the namespace storage statistics in async way
@@ -142,7 +142,7 @@ but we refresh them through Sidekiq jobs and in different transactions:
1. Create a second table (`namespace_aggregation_schedules`) with two columns `id` and `namespace_id`.
1. Whenever the statistics of a project changes, insert a row into `namespace_aggregation_schedules`
- We don't insert a new row if there's already one related to the root namespace.
- - Keeping in mind the length of the transaction that involves updating `project_statistics`(<https://gitlab.com/gitlab-org/gitlab-ce/issues/62488>), the insertion should be done in a different transaction and through a Sidekiq Job.
+ - Keeping in mind the length of the transaction that involves updating `project_statistics`(<https://gitlab.com/gitlab-org/gitlab-foss/issues/62488>), the insertion should be done in a different transaction and through a Sidekiq Job.
1. After inserting the row, we schedule another worker to be executed async at two different moments:
- One enqueued for immediate execution and another one scheduled in `1.5h` hours.
- We only schedule the jobs, if we can obtain a `1.5h` lease on Redis on a key based on the root namespace ID.
@@ -162,7 +162,7 @@ This implementation has the following benefits:
The only downside of this approach is that namespaces' statistics are updated up to `1.5` hours after the change is done,
which means there's a time window in which the statistics are inaccurate. Because we're still not
-[enforcing storage limits](https://gitlab.com/gitlab-org/gitlab-ce/issues/30421), this is not a major problem.
+[enforcing storage limits](https://gitlab.com/gitlab-org/gitlab-foss/issues/30421), this is not a major problem.
## Conclusion
@@ -171,8 +171,8 @@ performant approach of aggregating the root namespaces.
All the details regarding this use case can be found on:
-- <https://gitlab.com/gitlab-org/gitlab-ce/issues/62214>
-- Merge Request with the implementation: <https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/28996>
+- <https://gitlab.com/gitlab-org/gitlab-foss/issues/62214>
+- Merge Request with the implementation: <https://gitlab.com/gitlab-org/gitlab-foss/merge_requests/28996>
Performance of the namespace storage statistics were measured in staging and production (GitLab.com). All results were posted
-on <https://gitlab.com/gitlab-org/gitlab-ce/issues/64092>: No problem has been reported so far.
+on <https://gitlab.com/gitlab-org/gitlab-foss/issues/64092>: No problem has been reported so far.