Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitlab-foss.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
path: root/doc
diff options
context:
space:
mode:
authorGitLab Bot <gitlab-bot@gitlab.com>2022-11-23 03:08:19 +0300
committerGitLab Bot <gitlab-bot@gitlab.com>2022-11-23 03:08:19 +0300
commit24aaf5f076a73261e242a8f82e90bfb645712597 (patch)
treea35f6153ddfea85e3d59a32596b15ae6d50f85bb /doc
parent18869e31e629f7897451f26800f9123fa412f956 (diff)
Add latest changes from gitlab-org/gitlab@master
Diffstat (limited to 'doc')
-rw-r--r--doc/architecture/blueprints/pods/index.md2
-rw-r--r--doc/architecture/blueprints/pods/pods-feature-data-migration.md48
-rw-r--r--doc/architecture/blueprints/pods/pods-feature-global-search.md47
-rw-r--r--doc/architecture/blueprints/pods/pods-feature-schema-changes.md55
-rw-r--r--doc/user/clusters/agent/install/index.md2
-rw-r--r--doc/user/packages/terraform_module_registry/index.md6
6 files changed, 156 insertions, 4 deletions
diff --git a/doc/architecture/blueprints/pods/index.md b/doc/architecture/blueprints/pods/index.md
index 3ba319d169b..e06072cbe74 100644
--- a/doc/architecture/blueprints/pods/index.md
+++ b/doc/architecture/blueprints/pods/index.md
@@ -324,6 +324,8 @@ This is the list of known affected features with the proposed solutions.
- [Pods: GraphQL](pods-feature-graphql.md)
- [Pods: Organizations](pods-feature-organizations.md)
- [Pods: Router Endpoints Classification](pods-feature-router-endpoints-classification.md)
+- [Pods: Schema changes (Postgres and Elasticsearch migrations)](pods-feature-schema-changes.md)
+- [Pods: Global Search](pods-feature-global-search.md)
## Links
diff --git a/doc/architecture/blueprints/pods/pods-feature-data-migration.md b/doc/architecture/blueprints/pods/pods-feature-data-migration.md
index fad6bca45fa..edd3e24a7ba 100644
--- a/doc/architecture/blueprints/pods/pods-feature-data-migration.md
+++ b/doc/architecture/blueprints/pods/pods-feature-data-migration.md
@@ -26,6 +26,24 @@ we can document the reasons for not choosing this approach.
It is essential for Pods architecture to provide a way to migrate data out of big Pods
into smaller ones. This describes various approaches to provide this type of split.
+We also need to handle for cases where data is already violating the expected
+isolation constraints of Pods (ie. references cannot span multiple
+organizations). We know that existing features like linked issues allowed users
+to link issues across any projects regardless of their hierachy. There are many
+similar features. All of this data will need to be migrated in some way before
+it can be split across different pods. This may mean some data needs to be
+deleted, or the feature changed and modelled slightly differently before we can
+properly split or migrate the organizations between pods.
+
+Having schema deviations across different Pods, which is a necessary
+consequence of different databases, will also impact our ability to migrate
+data between pods. Different schemas impact our ability to reliably replicate
+data across pods and especially impact our ability to validate that the data is
+correctly replicated. It might force us to only be able to move data between
+pods when the schemas are all in sync (slowing down deployments and the
+rebalancing process) or possibly only migrate from newer to older schemas which
+would be complex.
+
## 1. Definition
## 2. Data flow
@@ -55,6 +73,28 @@ physical replication, etc.
1. All accesses to `gitlab-org` on a given Pod are validated about `pod_id` of `routes`
to ensure that given Pod is authoritative to handle the data.
+#### More challenges of this proposal
+
+1. There is no streaming replication capability for Elasticsearch, but you could
+ snapshot the whole Elasticsearch index and recreate, but this takes hours.
+ It could be handled by pausing Elasticsearch indexing on the initial pod during
+ the migration as indexing downtime is not a big issue, but this still needs
+ to be coordinated with the migration process
+1. Syncing Redis, Gitaly, CI Postgres, Main Postgres, registry Postgres, other
+ new data stores snapshots in an online system would likely lead to gaps
+ without a long downtime. You need to choose a sync point and at the sync
+ point you need to stop writes to perform the migration. The more data stores
+ there are to migrate at the same time the longer the write downtime for the
+ failover. We would also need to find a reliable place in the application to
+ actually block updates to all these systems with a high degree of
+ confidence. In the past we've only been confident by shutting down all rails
+ services because any rails process could write directly to any of these at
+ any time due to async workloads or other surprising code paths.
+1. How to efficiently delete all the orphaned data. Locating all `ci_builds`
+ associated with half the organizations would be very expensive if we have to
+ do joins. We haven't yet determined if we'd want to store an `organization_id`
+ column on every table, but this is the kind of thing it would be helpful for.
+
### 3.2. Migrate organization from an existing Pod
This is different to split, as we intend to perform logical and selective replication
@@ -75,6 +115,14 @@ which Pod is authoritative for this organization.
1. It likely will require a full database structure analysis (more robust than project import/export)
to perform selective PostgreSQL logical replication.
+#### More challenges of this proposal
+
+1. Logical replication is still not performant enough to keep up with our
+ scale. Even if we could use logical replication we still don't have an
+ efficient way to filter data related to a single organization without
+ joining all the way to the `organizations` table which will slow down
+ logical replication dramatically.
+
## 4. Evaluation
## 4.1. Pros
diff --git a/doc/architecture/blueprints/pods/pods-feature-global-search.md b/doc/architecture/blueprints/pods/pods-feature-global-search.md
new file mode 100644
index 00000000000..5ea863ac646
--- /dev/null
+++ b/doc/architecture/blueprints/pods/pods-feature-global-search.md
@@ -0,0 +1,47 @@
+---
+stage: enablement
+group: pods
+comments: false
+description: 'Pods: Global search'
+---
+
+DISCLAIMER:
+This page may contain information related to upcoming products, features and
+functionality. It is important to note that the information presented is for
+informational purposes only, so please do not rely on the information for
+purchasing or planning purposes. Just like with all projects, the items
+mentioned on the page are subject to change or delay, and the development,
+release, and timing of any products, features, or functionality remain at the
+sole discretion of GitLab Inc.
+
+This document is a work-in-progress and represents a very early state of the
+Pods design. Significant aspects are not documented, though we expect to add
+them in the future. This is one possible architecture for Pods, and we intend to
+contrast this with alternatives before deciding which approach to implement.
+This documentation will be kept even if we decide not to implement this so that
+we can document the reasons for not choosing this approach.
+
+# Pods: Global search
+
+When we introduce multiple Pods we intend to isolate all services related to
+those Pods. This will include Elasticsearch which means our current global
+search functionality will not work. It may be possible to implement aggregated
+search across all pods, but it is unlikely to be performant to do fan-out
+searches across all pods especially once you start to do pagination which
+requires setting the correct offset and page number for each search.
+
+## 1. Definition
+
+## 2. Data flow
+
+## 3. Proposal
+
+Likely first versions of Pods will simply not support global searches and then
+we may later consider if building global searches to support popular use cases
+is worthwhile.
+
+## 4. Evaluation
+
+## 4.1. Pros
+
+## 4.2. Cons
diff --git a/doc/architecture/blueprints/pods/pods-feature-schema-changes.md b/doc/architecture/blueprints/pods/pods-feature-schema-changes.md
new file mode 100644
index 00000000000..ae7c288028d
--- /dev/null
+++ b/doc/architecture/blueprints/pods/pods-feature-schema-changes.md
@@ -0,0 +1,55 @@
+---
+stage: enablement
+group: pods
+comments: false
+description: 'Pods: Schema changes'
+---
+
+DISCLAIMER:
+This page may contain information related to upcoming products, features and
+functionality. It is important to note that the information presented is for
+informational purposes only, so please do not rely on the information for
+purchasing or planning purposes. Just like with all projects, the items
+mentioned on the page are subject to change or delay, and the development,
+release, and timing of any products, features, or functionality remain at the
+sole discretion of GitLab Inc.
+
+This document is a work-in-progress and represents a very early state of the
+Pods design. Significant aspects are not documented, though we expect to add
+them in the future. This is one possible architecture for Pods, and we intend to
+contrast this with alternatives before deciding which approach to implement.
+This documentation will be kept even if we decide not to implement this so that
+we can document the reasons for not choosing this approach.
+
+# Pods: Schema changes
+
+When we introduce multiple Pods that own their own databases this will
+complicate the process of making schema changes to Postgres and Elasticsearch.
+Today we already need to be careful to make changes comply with our zero
+downtime deployments. For example,
+[when removing a column we need to make changes over 3 separate deployments](../../../development/database/avoiding_downtime_in_migrations.md#dropping-columns).
+We have tooling like `post_migrate` that helps with these kinds of changes to
+reduce the number of merge requests needed, but these will be complicated when
+we are dealing with deploying multiple rails applications that will be at
+different versions at any one time. This problem will be particularly tricky to
+solve for shared databases like our plan to share the `users` related tables
+among all Pods.
+
+A key benefit of Pods may be that it allows us to run different
+customers on different versions of GitLab. We may choose to update our own pod
+before all our customers giving us even more flexibility than our current
+canary architecture. But doing this means that schema changes need to have even
+more versions of backward compatibility support which could slow down
+development as we need extra steps to make schema changes.
+
+## 1. Definition
+
+## 2. Data flow
+
+## 3. Proposal
+
+## 4. Evaluation
+
+## 4.1. Pros
+
+## 4.2. Cons
diff --git a/doc/user/clusters/agent/install/index.md b/doc/user/clusters/agent/install/index.md
index 2030052e3b0..62767f1dfd9 100644
--- a/doc/user/clusters/agent/install/index.md
+++ b/doc/user/clusters/agent/install/index.md
@@ -123,7 +123,7 @@ To install the agent on your cluster using Helm:
1. In your computer, open a terminal and [connect to your cluster](https://kubernetes.io/docs/tasks/access-application-cluster/access-cluster/).
1. Run the command you copied when you [registered your agent with GitLab](#register-the-agent-with-gitlab).
-Optionally, you can [customize the Helm installation](#customize-the-helm-installation).
+Optionally, you can [customize the Helm installation](#customize-the-helm-installation). If you install the agent on a production system, you should customize the Helm installation to skip creating the service account.
##### Customize the Helm installation
diff --git a/doc/user/packages/terraform_module_registry/index.md b/doc/user/packages/terraform_module_registry/index.md
index 2b99ff807ec..d3e67175ca2 100644
--- a/doc/user/packages/terraform_module_registry/index.md
+++ b/doc/user/packages/terraform_module_registry/index.md
@@ -50,7 +50,7 @@ error `{"error":"404 Not Found"}`.
Example request using a personal access token:
```shell
-curl --header "PRIVATE-TOKEN: <your_access_token>" \
+curl --fail-with-body --header "PRIVATE-TOKEN: <your_access_token>" \
--upload-file path/to/file.tgz \
"https://gitlab.example.com/api/v4/projects/<your_project_id>/packages/terraform/modules/my-module/my-system/0.0.1/file"
```
@@ -66,7 +66,7 @@ Example response:
Example request using a deploy token:
```shell
-curl --header "DEPLOY-TOKEN: <deploy_token>" \
+curl --fail-with-body --header "DEPLOY-TOKEN: <deploy_token>" \
--upload-file path/to/file.tgz \
"https://gitlab.example.com/api/v4/projects/<your_project_id>/packages/terraform/modules/my-module/my-system/0.0.1/file"
```
@@ -127,7 +127,7 @@ upload:
script:
- TERRAFORM_MODULE_NAME=$(echo "${TERRAFORM_MODULE_NAME}" | tr " _" -) # module-name must not have spaces or underscores, so translate them to hyphens
- tar -vczf ${TERRAFORM_MODULE_NAME}-${TERRAFORM_MODULE_SYSTEM}-${TERRAFORM_MODULE_VERSION}.tgz -C ${TERRAFORM_MODULE_DIR} --exclude=./.git .
- - 'curl --location --header "JOB-TOKEN: ${CI_JOB_TOKEN}"
+ - 'curl --fail-with-body --location --header "JOB-TOKEN: ${CI_JOB_TOKEN}"
--upload-file ${TERRAFORM_MODULE_NAME}-${TERRAFORM_MODULE_SYSTEM}-${TERRAFORM_MODULE_VERSION}.tgz
${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/packages/terraform/modules/${TERRAFORM_MODULE_NAME}/${TERRAFORM_MODULE_SYSTEM}/${TERRAFORM_MODULE_VERSION}/file'
rules: