gitlab.com/gitlab-org/gitlab-foss.git
author     GitLab Bot <gitlab-bot@gitlab.com>  2021-08-18 12:10:26 +0300
committer  GitLab Bot <gitlab-bot@gitlab.com>  2021-08-18 12:10:26 +0300
commit     3b4c0d27d5ad32fecdcc95e86bf919fc13830c5b (patch)
tree       98001b846bd52e10a3aa0fd9adff82b19ae89847 /doc/architecture
parent     514ace363222f19595375f59b123b5e27c2b9b8a (diff)

Add latest changes from gitlab-org/gitlab@master
Diffstat (limited to 'doc/architecture')
-rw-r--r--  doc/architecture/blueprints/database/scalability/patterns/read_mostly.md  2
-rw-r--r--  doc/architecture/blueprints/database/scalability/patterns/time_decay.md   2
-rw-r--r--  doc/architecture/blueprints/database_scaling/size-limits.md               4
3 files changed, 4 insertions, 4 deletions
diff --git a/doc/architecture/blueprints/database/scalability/patterns/read_mostly.md b/doc/architecture/blueprints/database/scalability/patterns/read_mostly.md
index bd87573a88e..02b56841507 100644
--- a/doc/architecture/blueprints/database/scalability/patterns/read_mostly.md
+++ b/doc/architecture/blueprints/database/scalability/patterns/read_mostly.md
@@ -1,6 +1,6 @@
---
stage: Enablement
-group: database
+group: Database
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
comments: false
description: 'Learn how to scale operating on read-mostly data at scale'
diff --git a/doc/architecture/blueprints/database/scalability/patterns/time_decay.md b/doc/architecture/blueprints/database/scalability/patterns/time_decay.md
index 6e0187a8d74..9309c581d54 100644
--- a/doc/architecture/blueprints/database/scalability/patterns/time_decay.md
+++ b/doc/architecture/blueprints/database/scalability/patterns/time_decay.md
@@ -1,6 +1,6 @@
---
stage: Enablement
-group: database
+group: Database
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
comments: false
description: 'Learn how to operate on large time-decay data'
diff --git a/doc/architecture/blueprints/database_scaling/size-limits.md b/doc/architecture/blueprints/database_scaling/size-limits.md
index 107cf0bb248..d63aa3bd4e8 100644
--- a/doc/architecture/blueprints/database_scaling/size-limits.md
+++ b/doc/architecture/blueprints/database_scaling/size-limits.md
@@ -33,7 +33,7 @@ graph LR
Large tables on GitLab.com are a major problem - for both operations and development. They cause a variety of problems:
1. **Query timings** and hence overall application performance suffers
-1. **Table maintenance** becomes much more costly. Vacuum activity has become a significant concern on GitLab.com - with large tables only seeing infrequent (e.g. once per day) and vacuum runs taking many hours to complete. This has various negative consequences and a very large table has potential to impact seemingly unrelated parts of the database and hence overall application performance suffers.
+1. **Table maintenance** becomes much more costly. Vacuum activity has become a significant concern on GitLab.com - with large tables only seeing infrequent (once per day) processing and vacuum runs taking many hours to complete. This has various negative consequences and a very large table has potential to impact seemingly unrelated parts of the database and hence overall application performance suffers.
1. **Data migrations** on large tables are significantly more complex to implement and incur development overhead. They have potential to cause stability problems on GitLab.com and take a long time to execute on large datasets.
1. **Indexes size** is significant. This directly impacts performance as smaller parts of the index are kept in memory and also makes the indexes harder to maintain (think repacking).
1. **Index creation times** go up significantly - in 2021, we see btree creation take up to 6 hours for a single btree index. This impacts our ability to deploy frequently and leads to vacuum-related problems (delayed cleanup).
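(Editorial aside, not part of the commit: the vacuum and index-size pressures described in this hunk can be observed directly in PostgreSQL. A minimal sketch using the standard `pg_stat_user_tables` statistics view and size functions; the `LIMIT 10` cutoffs are illustrative, not from the source.)

```sql
-- When each table was last vacuumed, and how often autovacuum has run.
-- Infrequent vacuum runs on very large tables are the concern raised above.
SELECT relname,
       last_vacuum,
       last_autovacuum,
       vacuum_count,
       autovacuum_count
FROM pg_stat_user_tables
ORDER BY autovacuum_count DESC
LIMIT 10;

-- Table versus index footprint for the largest relations; a large index
-- share means less of the index fits in memory and repacking costs more.
SELECT relname,
       pg_size_pretty(pg_table_size(relid))   AS table_size,
       pg_size_pretty(pg_indexes_size(relid)) AS index_size
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;
```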
@@ -141,7 +141,7 @@ There is no standard solution to reduce table sizes - there are many!
1. **Partitioning**: Apply a partitioning scheme if there is a common access dimension.
1. **Normalization**: Review relational modeling and apply normalization techniques to remove duplicate data
1. **Vertical table splits**: Review column usage and split table vertically.
-1. **Externalize**: Move large data types out of the database entirely. For example, JSON documents, especially when not used for filtering, may be better stored outside the database, e.g. in object storage.
+1. **Externalize**: Move large data types out of the database entirely. For example, JSON documents, especially when not used for filtering, may be better stored outside the database, for example, in object storage.
NOTE:
While we're targeting to limit physical table sizes, we consider retaining or improving performance a goal, too.
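(Editorial aside, not part of the commit: of the size-reduction options listed in this hunk, partitioning is the most structural. Below is a minimal sketch of PostgreSQL declarative range partitioning; the `audit_events` table and the monthly bounds are hypothetical, chosen only to illustrate splitting along a common access dimension.)

```sql
-- Parent table partitioned along a time-based access dimension.
CREATE TABLE audit_events (
    id         bigserial,
    payload    jsonb,
    created_at timestamptz NOT NULL,
    PRIMARY KEY (id, created_at)  -- the partition key must be part of the PK
) PARTITION BY RANGE (created_at);

-- One partition per month. Vacuum runs and index builds now operate on
-- each (much smaller) partition independently, addressing the maintenance
-- and index-creation-time problems called out earlier in the diff.
CREATE TABLE audit_events_2021_08 PARTITION OF audit_events
    FOR VALUES FROM ('2021-08-01') TO ('2021-09-01');
```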