gitlab.com/gitlab-org/gitlab-foss.git
author    GitLab Bot <gitlab-bot@gitlab.com> 2023-02-20 16:49:51 +0300
committer GitLab Bot <gitlab-bot@gitlab.com> 2023-02-20 16:49:51 +0300
commit    71786ddc8e28fbd3cb3fcc4b3ff15e5962a1c82e (patch)
tree      6a2d93ef3fb2d353bb7739e4b57e6541f51cdd71 /doc/architecture
parent    a7253423e3403b8c08f8a161e5937e1488f5f407 (diff)
Add latest changes from gitlab-org/gitlab@15-9-stable-ee (v15.9.0-rc42)
Diffstat (limited to 'doc/architecture')
-rw-r--r--doc/architecture/blueprints/_template.md9
-rw-r--r--doc/architecture/blueprints/ci_data_decay/pipeline_partitioning.md42
-rw-r--r--doc/architecture/blueprints/ci_pipeline_components/index.md171
-rw-r--r--doc/architecture/blueprints/consolidating_groups_and_projects/index.md2
-rw-r--r--doc/architecture/blueprints/gitlab_agent_deployments/index.md403
-rw-r--r--doc/architecture/blueprints/object_storage/index.md2
-rw-r--r--doc/architecture/blueprints/rate_limiting/index.md2
-rw-r--r--doc/architecture/blueprints/remote_development/index.md2
-rw-r--r--doc/architecture/blueprints/runner_tokens/index.md208
-rw-r--r--doc/architecture/blueprints/search/code_search_with_zoekt.md305
-rw-r--r--doc/architecture/blueprints/work_items/index.md19
-rw-r--r--doc/architecture/index.md11
12 files changed, 1062 insertions, 114 deletions
diff --git a/doc/architecture/blueprints/_template.md b/doc/architecture/blueprints/_template.md
index e39c2b51a5b..f7dea60e9b7 100644
--- a/doc/architecture/blueprints/_template.md
+++ b/doc/architecture/blueprints/_template.md
@@ -43,6 +43,15 @@ a feature has become "implemented", major changes should get new blueprints.
The canonical place for the latest set of instructions (and the likely source
of this file) is [here](/doc/architecture/blueprints/_template.md).
+
+Blueprint statuses you can use:
+
+- "proposed"
+- "accepted"
+- "ongoing"
+- "implemented"
+- "rejected"
+
-->
# {+ Title of Blueprint +}
diff --git a/doc/architecture/blueprints/ci_data_decay/pipeline_partitioning.md b/doc/architecture/blueprints/ci_data_decay/pipeline_partitioning.md
index 261390d1d14..ebe3c72adfc 100644
--- a/doc/architecture/blueprints/ci_data_decay/pipeline_partitioning.md
+++ b/doc/architecture/blueprints/ci_data_decay/pipeline_partitioning.md
@@ -758,25 +758,25 @@ gantt
section Phase 0
Build data partitioning strategy :done, 0_1, 2022-06-01, 90d
section Phase 1
- Partition biggest CI tables :1_1, after 0_1, 140d
- Biggest table partitioned :milestone, metadata, 2022-12-01, 1min
+ Partition biggest CI tables :1_1, after 0_1, 200d
+ Biggest table partitioned :milestone, metadata, 2023-03-01, 1min
Tables larger than 100GB partitioned :milestone, 100gb, after 1_1, 1min
section Phase 2
- Add paritioning keys to SQL queries :2_1, after 1_1, 120d
+ Add partitioning keys to SQL queries :2_1, 2023-01-01, 120d
Emergency partition detachment possible :milestone, detachment, 2023-04-01, 1min
All SQL queries are routed to partitions :milestone, routing, after 2_1, 1min
section Phase 3
- Build new data access patterns :3_1, 2023-03-01, 120d
- New API endpoint created for inactive data :milestone, api1, 2023-05-01, 1min
- Filtering added to existing API endpoints :milestone, api2, 2023-07-01, 1min
+ Build new data access patterns :3_1, 2023-05-01, 120d
+ New API endpoint created for inactive data :milestone, api1, 2023-07-01, 1min
+ Filtering added to existing API endpoints :milestone, api2, 2023-09-01, 1min
section Phase 4
- Introduce time-decay mechanisms :4_1, 2023-06-01, 120d
- Inactive partitions are not being read :milestone, part1, 2023-08-01, 1min
- Performance of the database cluster improves :milestone, part2, 2023-09-01, 1min
+ Introduce time-decay mechanisms :4_1, 2023-08-01, 120d
+ Inactive partitions are not being read :milestone, part1, 2023-10-01, 1min
+ Performance of the database cluster improves :milestone, part2, 2023-11-01, 1min
section Phase 5
- Introduce auto-partitioning mechanisms :5_1, 2023-07-01, 120d
- New partitions are being created automatically :milestone, part3, 2023-10-01, 1min
- Partitioning is made available on self-managed :milestone, part4, 2023-11-01, 1min
+ Introduce auto-partitioning mechanisms :5_1, 2023-09-01, 120d
+ New partitions are being created automatically :milestone, part3, 2023-12-01, 1min
+ Partitioning is made available on self-managed :milestone, part4, 2024-01-01, 1min
```
## Conclusions
@@ -796,18 +796,16 @@ strategy here to share knowledge and solicit feedback from other team members.
## Who
-Authors:
+DRIs:
<!-- vale gitlab.Spelling = NO -->
-| Role | Who |
-|--------|----------------|
-| Author | Grzegorz Bizon |
-
-Recommenders:
-
-| Role | Who |
-|-------------------------------|-----------------|
-| Senior Distingiushed Engineer | Kamil Trzciński |
+| Role | Who |
+|---------------------|------------------------------------------------|
+| Author | Grzegorz Bizon, Principal Engineer |
+| Recommender | Kamil Trzciński, Senior Distinguished Engineer |
+| Product Manager | James Heimbuck, Senior Product Manager |
+| Engineering Manager | Scott Hampton, Engineering Manager |
+| Lead Engineer | Marius Bobin, Senior Backend Engineer |
<!-- vale gitlab.Spelling = YES -->
diff --git a/doc/architecture/blueprints/ci_pipeline_components/index.md b/doc/architecture/blueprints/ci_pipeline_components/index.md
index 5bff794c683..b1aee7c4217 100644
--- a/doc/architecture/blueprints/ci_pipeline_components/index.md
+++ b/doc/architecture/blueprints/ci_pipeline_components/index.md
@@ -1,5 +1,5 @@
---
-status: proposed
+status: ongoing
creation-date: "2022-09-14"
authors: [ "@ayufan", "@fabiopitino", "@grzesiek" ]
coach: [ "@ayufan", "@grzesiek" ]
@@ -8,20 +8,22 @@ owning-stage: "~devops::verify"
participating-stages: []
---
-# CI/CD pipeline components catalog
+# CI/CD Catalog
## Summary
## Goals
-The goal of the CI/CD pipeline components catalog is to make the reusing pipeline configurations
-easier and more efficient.
-Providing a way to discover, understand and learn how to reuse pipeline constructs allows for a more streamlined experience.
-Having a CI/CD pipeline components catalog also sets a framework for users to collaborate on pipeline constructs so that they can be evolved
-and improved over time.
+The goal of the CI/CD pipeline components catalog is to make reusing
+pipeline configurations easier and more efficient. Providing a way to
+discover, understand and learn how to reuse pipeline constructs allows for a
+more streamlined experience. Having a CI/CD pipeline components catalog also
+sets a framework for users to collaborate on pipeline constructs so that they
+can be evolved and improved over time.
-This blueprint defines the architectural guidelines on how to build a CI/CD catalog of pipeline components.
-This blueprint also defines the long-term direction for iterations and improvements to the solution.
+This blueprint defines the architectural guidelines on how to build a CI/CD
+catalog of pipeline components. This blueprint also defines the long-term
+direction for iterations and improvements to the solution.
## Challenges
@@ -55,7 +57,7 @@ This blueprint also defines the long-term direction for iterations and improveme
- The user should be able to import the job inside a given stage or pass the stage names as input parameter
when using the component.
- Failures in mapping the correct stage can result in confusing errors.
-- Some templates are designed to work with AutoDevops but are not generic enough
+- Some templates are designed to work with AutoDevOps but are not generic enough
([example](https://gitlab.com/gitlab-org/gitlab/-/blob/2c0e8e4470001442e999391df81e19732b3439e6/lib/gitlab/ci/templates/AWS/Deploy-ECS.gitlab-ci.yml)).
- Many CI templates, especially those [language specific](https://gitlab.com/gitlab-org/gitlab/-/tree/2c0e8e4470001442e999391df81e19732b3439e6/lib/gitlab/ci/templates)
are tutorial/scaffolding-style templates.
@@ -82,21 +84,7 @@ This blueprint also defines the long-term direction for iterations and improveme
- Competitive landscape is showing the need for such feature
- [R2DevOps](https://r2devops.io) implements a catalog of CI templates for GitLab pipelines.
- [GitHub Actions](https://github.com/features/actions) provides an extensive catalog of reusable job steps.
-
-## Implementation guidelines
-
-- Start with the smallest user base. Dogfood the feature for `gitlab-org` and `gitlab-com` groups.
- Involve the Engineering Productivity and other groups authoring pipeline configurations to test
- and validate our solutions.
-- Ensure we can integrate all the feedback gathered, even if that means changing the technical design or
- UX. Until we make the feature GA we should have clear expectations with early adopters.
-- Reuse existing functionality as much as possible. Don't reinvent the wheel on the initial iterations.
- For example: reuse project features like title, description, avatar to build a catalog.
-- Leverage GitLab features for the development lifecycle of the components (testing via `.gitlab-ci.yml`,
- release management, Pipeline Editor, etc.).
-- Design the catalog with self-managed support in mind.
-- Allow the catalog an the workflow to support future types of pipeline constructs and new ways of using them.
-- Design components and catalog following industry best practice related to building deterministic package managers.
+ - [CircleCI Orbs](https://circleci.com/orbs/) provide reusable YAML configuration packages.
## Glossary
@@ -158,38 +146,50 @@ unable to achieve it early iterations.
## Structure of a component
-A pipeline component is identified by the path to a repository or directory that defines it
-and a specific version: `<component-path>@<version>`.
+A pipeline component is identified by a unique address in the form `<fqdn>/<component-path>@<version>` containing:
+
+- FQDN (Fully Qualified Domain Name).
+- The path to a repository or directory that defines it.
+- A specific version.
+
+For example: `gitlab.com/gitlab-org/dast@1.0`.
+
+### The FQDN
-For example: `gitlab-org/dast@1.0`.
+Initially we support only component addresses that point to the same GitLab instance, meaning that the FQDN matches
+the GitLab host.
### The component path
-A component path must contain at least the component YAML and optionally a
+The directory identified by the component path must contain at least the component YAML and optionally a
related `README.md` documentation file.
The component path can be:
-- A path to a project: `gitlab-org/dast`. The default component is processed.
-- A path to an explicit component: `gitlab-org/dast/api-scan`. In this case the explicit `api-scan` component is processed.
-- A path to a local directory: `/path/to/component`. This path must contain the component YAML that defines the component.
- The path must start with `/` to indicate a full path in the repository.
+- A path to a project: `gitlab.com/gitlab-org/dast`. The default component is processed.
+- A path to an explicit component: `gitlab.com/gitlab-org/dast/api-scan`. In this case the explicit `api-scan` component is processed.
+- A relative path to a local directory: `./path/to/component`. This path must contain the component YAML that defines the component.
+ The path must start with `./` or `../` to indicate a path relative to the current file's path.
+
+Relative local paths are an abbreviated form of the full component address, meaning that `./path/to/component` called from
+a file `mydir/file.yml` in `gitlab-org/dast` project would be expanded to:
+
+```plaintext
+gitlab.com/gitlab-org/dast/mydir/path/to/component@<CURRENT_SHA>
+```
The component YAML file follows the filename convention `<type>.yml` where component type is one of:
| Component type | Context |
| -------------- | ------- |
| `template` | For components used under `include:` keyword |
-| `step` | For components used under `steps:` keyword |
Based on the context where the component is used we fetch the correct YAML file.
For example:
-- if we are including a component `gitlab-org/dast@1.0` we expect a YAML file named `template.yml` in the
+- if we are including a component `gitlab.com/gitlab-org/dast@1.0` we expect a YAML file named `template.yml` in the
root directory of `gitlab-org/dast` repository.
-- if we are including a component `gitlab-org/dast/api-scan@1.0` we expect a YAML file named `template.yml` inside a
- directory `api-scan` of `gitlab-org/dast` repository.
-- if we are using a step component `gitlab-org/dast/api-scan@1.0` we expect a YAML file named `step.yml` inside a
+- if we are including a component `gitlab.com/gitlab-org/dast/api-scan@1.0` we expect a YAML file named `template.yml` inside a
directory `api-scan` of `gitlab-org/dast` repository.
A component YAML file:
@@ -225,11 +225,11 @@ even when not releasing versions in the catalog.
The version of the component can be (in order of highest priority first):
-1. A commit SHA - For example: `gitlab-org/dast@e3262fdd0914fa823210cdb79a8c421e2cef79d8`
-1. A released tag - For example: `gitlab-org/dast@1.0`
-1. A special moving target version that points to the most recent released tag - For example: `gitlab-org/dast@~latest`
-1. An unreleased tag - For example: `gitlab-org/dast@rc-1.0`
-1. A branch name - For example: `gitlab-org/dast@master`
+1. A commit SHA - For example: `gitlab.com/gitlab-org/dast@e3262fdd0914fa823210cdb79a8c421e2cef79d8`
+1. A released tag - For example: `gitlab.com/gitlab-org/dast@1.0`
+1. A special moving target version that points to the most recent released tag - For example: `gitlab.com/gitlab-org/dast@~latest`
+1. An unreleased tag - For example: `gitlab.com/gitlab-org/dast@rc-1.0`
+1. A branch name - For example: `gitlab.com/gitlab-org/dast@master`
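+
+As an illustrative sketch (using the `include:component` syntax shown later in
+this document), each version type maps to a different pin:
+
+```yaml
+include:
+  # Fully reproducible: pin to an exact commit SHA.
+  - component: gitlab.com/gitlab-org/dast@e3262fdd0914fa823210cdb79a8c421e2cef79d8
+  # Stable: pin to a released tag.
+  - component: gitlab.com/gitlab-org/dast@1.0
+  # Moving target: track the most recent released tag.
+  - component: gitlab.com/gitlab-org/dast@~latest
+```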
If a tag and branch exist with the same name, the tag takes precedence over the branch.
Similarly, if a tag is named `e3262fdd0914fa823210cdb79a8c421e2cef79d8`, a commit SHA (if exists)
@@ -267,7 +267,7 @@ The following directory structure would support 1 component per project:
The `.gitlab-ci.yml` is recommended for the project to ensure changes are verified accordingly.
-The component is now identified by the path `myorg/rails-rspec` and we expect a `template.yml` file
+The component is now identified by the path `gitlab.com/myorg/rails-rspec` and we expect a `template.yml` file
and `README.md` located in the root directory of the repository.
The following directory structure would support multiple components per project:
@@ -287,8 +287,8 @@ The following directory structure would support multiple components per project:
In this example we are defining multiple test profiles that are executed with RSpec.
The user could choose to use one or more of these.
-Each of these components are identified by their path `myorg/rails-rspec/unit`, `myorg/rails-rspec/integration`
-and `myorg/rails-rspec/feature`.
+Each of these components are identified by their path `gitlab.com/myorg/rails-rspec/unit`, `gitlab.com/myorg/rails-rspec/integration`
+and `gitlab.com/myorg/rails-rspec/feature`.
This directory structure could also support both strategies:
@@ -302,12 +302,8 @@ This directory structure could also support both strategies:
│ └── template.yml # myorg/rails-rspec/unit
├── integration/
│ └── template.yml # myorg/rails-rspec/integration
-├── feature/
-│ └── template.yml # myorg/rails-rspec/feature
-└── report/
- ├── step.yml # myorg/rails-rspec/report
- ├── Dockerfile
- └── ... other files
+└── feature/
+ └── template.yml # myorg/rails-rspec/feature
```
With the above structure we could have a top-level component that can be used as the
@@ -329,6 +325,8 @@ spec:
website: # by default all declared inputs are mandatory.
environment:
default: test # apply default if not provided. This makes the input optional.
+ flags:
+ default: null # make an input entirely optional with no value by default.
test_run:
options: # a choice must be made from the list since there is no default value.
- unit
@@ -347,7 +345,7 @@ When using the component we pass the input parameters as follows:
```yaml
include:
- - component: org/my-component@1.0
+ - component: gitlab.com/org/my-component@1.0
with:
website: ${MY_WEBSITE} # variables expansion
test_run: system
@@ -359,7 +357,7 @@ possible [inputs provided upstream](#input-parameters-for-pipelines).
Input parameters are validated as soon as possible:
-1. Read the file `gitlab-template.yml` inside `org/my-component`.
+1. Read the file `gitlab-template.yml` inside the `org/my-component` project.
1. Parse `spec:inputs` from the specifications and validate the parameters against this schema.
1. If successfully validated, proceed with parsing the content. Return an error otherwise.
1. Interpolate input parameters inside the component's content.
@@ -383,6 +381,32 @@ scan-website:
With `$[[ inputs.XXX ]]` inputs are interpolated immediately after parsing the content.
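+
+For example, a minimal component file combining `spec:inputs` with an
+interpolation block might look as follows. This is a sketch: the `---`
+separator between the specs section and the component content is an
+assumption, and the job itself is illustrative.
+
+```yaml
+spec:
+  inputs:
+    environment:
+      default: test
+---
+scan-website:
+  script: ./scan --environment "$[[ inputs.environment ]]"
+```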
+### CI configuration interpolation perspectives and limitations
+
+With `spec:` users will be able to define input arguments for CI configuration.
+With `with:` keywords, they will pass these arguments to CI components.
+
+`inputs` in `$[[ inputs.something ]]` is going to be an initial "object" or
+"container" that we will provide, to allow users to access their arguments in
+the interpolation block. This, however, can evolve in many directions, for example:
+
+1. We could provide a `variables` or `env` object, so that users can access their environment variables more easily.
+1. We can extend the block evaluation to more easily navigate JSON or YAML objects passed from elsewhere.
+1. We can provide access to the repository files, snippets or issues from there too.
+
+The CI configuration interpolation is a relatively compute-intensive technology,
+especially because we foresee this mechanism being used frequently on
+GitLab.com. In order to ensure that users are using this responsibly, we have
+introduced various limits, required to keep our production system safe. The
+limits should not impact users, because there are application limits available
+on a different level (maximum YAML size supported, timeout on parsing YAML
+files, etc.); the interpolation limits we've introduced are typically much higher
+than these. Some of them are:
+
+1. An interpolation block should not be larger than 1 kilobyte.
+1. A YAML value with interpolation in it can't be larger than 1 megabyte.
+1. YAML configuration can't consist of more than half a million entries.
+
### Why input parameters and not environment variables?
Until today we have been leveraging environment variables to pass information around.
@@ -407,7 +431,7 @@ extend it to other `include:` types support inputs via `with:` syntax:
```yaml
include:
- - component: org/my-component@1.0
+ - component: gitlab.com/org/my-component@1.0
with:
foo: bar
- local: path/to/file.yml
@@ -492,7 +516,7 @@ spec:
# rest of the pipeline config
```
-## Limits
+### Limits
Any MVC that exposes a feature should be added with limitations from the beginning.
It's safer to add new features with restrictions than trying to limit a feature after it's being used.
@@ -506,6 +530,41 @@ Some limits we could consider adding:
- max level of nested imports
- max length of the exported component name
+## Publishing components
+
+Users will be able to publish CI Components into a CI Catalog. This can happen
+in a CI pipeline job, similar to how software is deployed following
+Continuous Delivery principles. This will allow us to safeguard the quality of
+components being deployed. To ensure that the CI Components meet quality
+standards users will be able to test them before publishing new versions in the
+CI Catalog.
+
+Once a project containing components gets published, we will index components'
+metadata. We want to initially index as much metadata as possible, to gain more
+flexibility in how we design CI Catalog's main page. We don't want to be
+constrained by the lack of data available to properly visualize CI Components
+in CI Catalog. In order to do that, we may need to find all components that are
+being published, read their `spec` metadata and index what we find there.
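+
+The exact publishing mechanism is not defined by this blueprint, but as a
+hypothetical sketch it could be a pipeline that tests components on every
+change and publishes them only when a release tag is created (the job names
+and scripts below are illustrative, not a proposed interface):
+
+```yaml
+test-components:
+  stage: test
+  script:
+    - ./test-components.sh   # hypothetical: run the component test suite
+
+publish-components:
+  stage: deploy
+  rules:
+    - if: $CI_COMMIT_TAG     # publish only on release tags
+  script:
+    - ./publish-to-catalog.sh "$CI_COMMIT_TAG"   # hypothetical publish step
+```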
+
+## Implementation guidelines
+
+- Start with the smallest user base. Dogfood the feature for `gitlab-org` and
+ `gitlab-com` groups. Involve the Engineering Productivity and other groups
+ authoring pipeline configurations to test and validate our solutions.
+- Ensure we can integrate all the feedback gathered, even if that means
+ changing the technical design or UX. Until we make the feature GA we should
+ have clear expectations with early adopters.
+- Reuse existing functionality as much as possible. Don't reinvent the wheel on
+ the initial iterations. For example: reuse project features like title,
+ description, avatar to build a catalog.
+- Leverage GitLab features for the development lifecycle of the components
+ (testing via `.gitlab-ci.yml`, release management, Pipeline Editor, etc.).
+- Design the catalog with self-managed support in mind.
+- Allow the catalog and the workflow to support future types of pipeline
+ constructs and new ways of using them.
+- Design components and catalog following industry best practice related to
+ building deterministic package managers.
+
## Iterations
1. Experimentation phase
diff --git a/doc/architecture/blueprints/consolidating_groups_and_projects/index.md b/doc/architecture/blueprints/consolidating_groups_and_projects/index.md
index 0818d9b973d..97853075607 100644
--- a/doc/architecture/blueprints/consolidating_groups_and_projects/index.md
+++ b/doc/architecture/blueprints/consolidating_groups_and_projects/index.md
@@ -1,5 +1,5 @@
---
-status: proposed
+status: ongoing
creation-date: "2021-02-07"
authors: [ "@alexpooley", "@ifarkas" ]
coach: "@grzesiek"
diff --git a/doc/architecture/blueprints/gitlab_agent_deployments/index.md b/doc/architecture/blueprints/gitlab_agent_deployments/index.md
new file mode 100644
index 00000000000..96e361d7531
--- /dev/null
+++ b/doc/architecture/blueprints/gitlab_agent_deployments/index.md
@@ -0,0 +1,403 @@
+---
+status: proposed
+creation-date: "2022-11-23"
+authors: [ "@shinya.maeda" ]
+coach: "@DylanGriffith"
+approvers: [ "@nagyv-gitlab", "@cbalane", "@hustewart", "@hfyngvason" ]
+owning-stage: "~devops::release"
+participating-stages: [Configure, Release]
+---
+
+# View and manage resources deployed by GitLab Agent For Kubernetes
+
+## Summary
+
+As part of the [GitLab Kubernetes Dashboard](https://gitlab.com/groups/gitlab-org/-/epics/2493) epic,
+users want to view and manage their resources deployed by GitLab Agent For Kubernetes.
+Users should be able to interact with the resources through GitLab UI, such as Environment Index/Details page.
+
+This blueprint describes how the association is established and how these domain models interact with each other.
+
+## Motivation
+
+### Goals
+
+- The proposed architecture can be used in [GitLab Kubernetes Dashboard](https://gitlab.com/groups/gitlab-org/-/epics/2493).
+- The proposed architecture can be used in [Organization-level Environment dashboard](https://gitlab.com/gitlab-org/gitlab/-/issues/241506).
+- The cluster resources and events can be visualized per [GitLab Environment](../../../ci/environments/index.md).
+ An environment-specific view scoped to the resources managed either directly or indirectly by a deployment commit.
+- Support both [GitOps mode](../../../user/clusters/agent/gitops.md#gitops-configuration-reference) and [CI Access mode](../../../user/clusters/agent/ci_cd_workflow.md#authorize-the-agent).
+ - NOTE: At the moment, we focus on the solution for CI Access mode. GitOps mode will have significant architectural changes _outside of_ this blueprint,
+ such as [Flux switching](https://gitlab.com/gitlab-org/gitlab/-/issues/357947) and [Manifest projects outside of the Agent configuration project](https://gitlab.com/groups/gitlab-org/-/epics/7704). In order to derisk potential rework, we'll revisit the GitOps mode after these upstream changes have been settled.
+
+### Non-Goals
+
+- The design details of [GitLab Kubernetes Dashboard](https://gitlab.com/groups/gitlab-org/-/epics/2493) and [Organization-level Environment dashboard](https://gitlab.com/gitlab-org/gitlab/-/issues/241506).
+- Support Environment/Deployment features that rely on GitLab CI/CD pipelines, such as [Protected Environments](../../../ci/environments/protected_environments.md), [Deployment Approvals](../../../ci/environments/deployment_approvals.md), [Deployment safety](../../../ci/environments/deployment_safety.md), and [Environment rollback](../../../ci/environments/index.md#environment-rollback). These features are already available in CI Access mode, however, it's not available in GitOps mode.
+
+## Proposal
+
+### Overview
+
+- GitLab Environment and Agent-managed Resource Group have 1-to-1 relationship.
+- Agent-managed Resource Group tracks all resources produced by the connected [agent](../../../user/clusters/agent/index.md). This includes not only resources written in manifest files but also subsequently generated resources (e.g. `Pod`s created by a `Deployment` manifest file).
+- Agent-managed Resource Group renders a dependency graph, such as `Deployment` => `ReplicaSet` => `Pod`. This provides an ArgoCD-style resource view.
+- Agent-managed Resource Group has the Resource Health status that represents a summary of resource statuses, such as `Healthy`, `Progressing` or `Degraded`.
+
+```mermaid
+flowchart LR
+ subgraph Kubernetes["Kubernetes"]
+ subgraph ResourceGroupProduction["ResourceGroup"]
+ direction LR
+ ResourceGroupProductionService(["Service"])
+ ResourceGroupProductionDeployment(["Deployment"])
+ ResourceGroupProductionPod1(["Pod1"])
+ ResourceGroupProductionPod2(["Pod2"])
+ end
+ subgraph ResourceGroupStaging["ResourceGroup"]
+ direction LR
+ ResourceGroupStagingService(["Service"])
+ ResourceGroupStagingDeployment(["Deployment"])
+ ResourceGroupStagingPod1(["Pod1"])
+ ResourceGroupStagingPod2(["Pod2"])
+ end
+ end
+
+ subgraph GitLab
+ subgraph Organization
+ subgraph Project
+ environment1["production environment"]
+ environment2["staging environment"]
+ end
+ end
+ end
+
+ environment1 --- ResourceGroupProduction
+ environment2 --- ResourceGroupStaging
+ ResourceGroupProductionService -.- ResourceGroupProductionDeployment
+ ResourceGroupProductionDeployment -.- ResourceGroupProductionPod1
+ ResourceGroupProductionDeployment -.- ResourceGroupProductionPod2
+ ResourceGroupStagingService -.- ResourceGroupStagingDeployment
+ ResourceGroupStagingDeployment -.- ResourceGroupStagingPod1
+ ResourceGroupStagingDeployment -.- ResourceGroupStagingPod2
+```
+
+### Existing components and relationships
+
+- [GitLab Project](../../../user/project/working_with_projects.md) and GitLab Environment have 1-to-many relationship.
+- GitLab Project and Agent have 1-to-many _direct_ relationship. Only one project can own a specific agent.
+- [GitOps mode](../../../user/clusters/agent/gitops.md#gitops-configuration-reference)
+ - GitLab Project and Agent do _NOT_ have many-to-many _indirect_ relationship yet. This will be supported in [Manifest projects outside of the Agent configuration project](https://gitlab.com/groups/gitlab-org/-/epics/7704).
+ - Agent and Agent-managed Resource Group have 1-to-1 relationship. Inventory IDs are used to group Kubernetes resources. This might be changed in [Flux switching](https://gitlab.com/gitlab-org/gitlab/-/issues/357947).
+- [CI Access mode](../../../user/clusters/agent/ci_cd_workflow.md#authorize-the-agent)
+ - GitLab Project and Agent have many-to-many _indirect_ relationship. The project owning the agent can [share the access with other projects](../../../user/clusters/agent/ci_cd_workflow.md#authorize-the-agent-to-access-projects-in-your-groups). (NOTE: Technically, only jobs running inside the project are allowed to access the cluster due to job-token authentication.)
+ - Agent and Agent-managed Resource Group do _NOT_ have relationships yet.
+
+### Issues
+
+- Agent-managed Resource Group should have environment ID as the foreign key, which must be unique across resource groups.
+- Agent-managed Resource Group should have parameters that define how to group resources in the associated cluster. For example, `namespace`, `label` and `inventory-id` (GitOps mode only) can be passed as parameters.
+- Agent-managed Resource Group should be able to fetch all relevant resources, including both default resource kinds and other [Custom Resources](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/).
+- Agent-managed Resource Group should be aware of the dependency graph.
+- Agent-managed Resource Group should be able to compute Resource Health status from the associated resources.
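+
+To make these parameters concrete, a hypothetical Agent-managed Resource Group
+record could carry data along these lines (all field names are illustrative,
+not a proposed schema):
+
+```yaml
+resource_group:
+  environment_id: 42        # unique foreign key to a GitLab Environment
+  namespace: production     # group resources by Kubernetes namespace
+  label: app=my-backend     # and/or by label selector
+  inventory_id: null        # GitOps mode only
+```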
+
+### Example: Pull-based deployment (GitOps mode)
+
+NOTE:
+At the moment, we focus on the solution for CI Access mode. GitOps mode will have significant architectural changes _outside of_ this blueprint,
+such as [Flux switching](https://gitlab.com/gitlab-org/gitlab/-/issues/357947) and [Manifest projects outside of the Agent configuration project](https://gitlab.com/groups/gitlab-org/-/epics/7704). In order to derisk potential rework, we'll revisit the GitOps mode after these upstream changes have been settled.
+
+### Example: Push-based deployment (CI access mode)
+
+This is an example of how the architecture works in push-based deployment.
+The feature is documented [here](../../../user/clusters/agent/ci_cd_workflow.md) as CI access mode.
+
+```mermaid
+flowchart LR
+ subgraph ProductionKubernetes["Production Kubernetes"]
+ subgraph ResourceGroupProductionFrontend["ResourceGroup"]
+ direction LR
+ ResourceGroupProductionFrontendService(["Service"])
+ ResourceGroupProductionFrontendDeployment(["Deployment"])
+ ResourceGroupProductionFrontendPod1(["Pod1"])
+ ResourceGroupProductionFrontendPod2(["Pod2"])
+ end
+ subgraph ResourceGroupProductionBackend["ResourceGroup"]
+ direction LR
+ ResourceGroupProductionBackendService(["Service"])
+ ResourceGroupProductionBackendDeployment(["Deployment"])
+ ResourceGroupProductionBackendPod1(["Pod1"])
+ ResourceGroupProductionBackendPod2(["Pod2"])
+ end
+ subgraph ResourceGroupProductionPrometheus["ResourceGroup"]
+ direction LR
+ ResourceGroupProductionPrometheusService(["Service"])
+ ResourceGroupProductionPrometheusDeployment(["Deployment"])
+ ResourceGroupProductionPrometheusPod1(["Pod1"])
+ ResourceGroupProductionPrometheusPod2(["Pod2"])
+ end
+ end
+
+ subgraph GitLab
+ subgraph Organization
+ subgraph OperationGroup
+ subgraph AgentManagementProject
+ AgentManagementAgentProduction["Production agent"]
+ AgentManagementManifestFiles["Kubernetes Manifest Files"]
+ AgentManagementEnvironmentProductionPrometheus["production prometheus environment"]
+ AgentManagementPipelines["CI/CD pipelines"]
+ end
+ end
+ subgraph DevelopmentGroup
+ subgraph FrontendAppProject
+ FrontendAppCode["VueJS"]
+ FrontendDockerfile["Dockerfile"]
+ end
+ subgraph BackendAppProject
+ BackendAppCode["Golang"]
+ BackendDockerfile["Dockerfile"]
+ end
+ subgraph DeploymentProject
+ DeploymentManifestFiles["Kubernetes Manifest Files"]
+ DeploymentPipelines["CI/CD pipelines"]
+ DeploymentEnvironmentProductionFrontend["production frontend environment"]
+ DeploymentEnvironmentProductionBackend["production backend environment"]
+ end
+ end
+ end
+ end
+
+ DeploymentEnvironmentProductionFrontend --- ResourceGroupProductionFrontend
+ DeploymentEnvironmentProductionBackend --- ResourceGroupProductionBackend
+ AgentManagementEnvironmentProductionPrometheus --- ResourceGroupProductionPrometheus
+ ResourceGroupProductionFrontendService -.- ResourceGroupProductionFrontendDeployment
+ ResourceGroupProductionFrontendDeployment -.- ResourceGroupProductionFrontendPod1
+ ResourceGroupProductionFrontendDeployment -.- ResourceGroupProductionFrontendPod2
+ ResourceGroupProductionBackendService -.- ResourceGroupProductionBackendDeployment
+ ResourceGroupProductionBackendDeployment -.- ResourceGroupProductionBackendPod1
+ ResourceGroupProductionBackendDeployment -.- ResourceGroupProductionBackendPod2
+ ResourceGroupProductionPrometheusService -.- ResourceGroupProductionPrometheusDeployment
+ ResourceGroupProductionPrometheusDeployment -.- ResourceGroupProductionPrometheusPod1
+ ResourceGroupProductionPrometheusDeployment -.- ResourceGroupProductionPrometheusPod2
+ AgentManagementAgentProduction -- Shared with --- DeploymentProject
+ DeploymentPipelines -- "Deploy" --> ResourceGroupProductionFrontend
+ DeploymentPipelines -- "Deploy" --> ResourceGroupProductionBackend
+ AgentManagementPipelines -- "Deploy" --> ResourceGroupProductionPrometheus
+```
+
+### Further details
+
+#### Multi-Project Deployment Pipelines
+
+The microservice project setup can be improved by [Multi-Project Deployment Pipelines](https://gitlab.com/groups/gitlab-org/-/epics/8483):
+
+- The Deployment Project can behave as a shared deployment engine for any upstream application projects and environments.
+- Environments can be created within the application projects, which gives developers more visibility of their environments.
+- The Deployment Project can be managed under the Operator group, providing better segregation of duties.
+- Users don't need to set up [RBAC to restrict CI/CD jobs](../../../user/clusters/agent/ci_cd_workflow.md#restrict-project-and-group-access-by-using-impersonation).
+- This is especially helpful for [dynamic environments](../../../ci/environments/index.md#create-a-dynamic-environment), such as Review Apps.
+
+```mermaid
+flowchart LR
+ subgraph ProductionKubernetes["Production Kubernetes"]
+ subgraph ResourceGroupProductionFrontend["ResourceGroup"]
+ direction LR
+ ResourceGroupProductionFrontendService(["Service"])
+ ResourceGroupProductionFrontendDeployment(["Deployment"])
+ ResourceGroupProductionFrontendPod1(["Pod1"])
+ ResourceGroupProductionFrontendPod2(["Pod2"])
+ end
+ subgraph ResourceGroupProductionBackend["ResourceGroup"]
+ direction LR
+ ResourceGroupProductionBackendService(["Service"])
+ ResourceGroupProductionBackendDeployment(["Deployment"])
+ ResourceGroupProductionBackendPod1(["Pod1"])
+ ResourceGroupProductionBackendPod2(["Pod2"])
+ end
+ subgraph ResourceGroupProductionPrometheus["ResourceGroup"]
+ direction LR
+ ResourceGroupProductionPrometheusService(["Service"])
+ ResourceGroupProductionPrometheusDeployment(["Deployment"])
+ ResourceGroupProductionPrometheusPod1(["Pod1"])
+ ResourceGroupProductionPrometheusPod2(["Pod2"])
+ end
+ end
+
+ subgraph GitLab
+ subgraph Organization
+ subgraph OperationGroup
+ subgraph DeploymentProject
+ DeploymentAgentProduction["Production agent"]
+ DeploymentManifestFiles["Kubernetes Manifest Files"]
+ DeploymentEnvironmentProductionPrometheus["production prometheus environment"]
+ DeploymentPipelines["CI/CD pipelines"]
+ end
+ end
+ subgraph DevelopmentGroup
+ subgraph FrontendAppProject
+ FrontendDeploymentPipelines["CI/CD pipelines"]
+ FrontendEnvironmentProduction["production environment"]
+ end
+ subgraph BackendAppProject
+ BackendDeploymentPipelines["CI/CD pipelines"]
+ BackendEnvironmentProduction["production environment"]
+ end
+ end
+ end
+ end
+
+ FrontendEnvironmentProduction --- ResourceGroupProductionFrontend
+ BackendEnvironmentProduction --- ResourceGroupProductionBackend
+ DeploymentEnvironmentProductionPrometheus --- ResourceGroupProductionPrometheus
+ ResourceGroupProductionFrontendService -.- ResourceGroupProductionFrontendDeployment
+ ResourceGroupProductionFrontendDeployment -.- ResourceGroupProductionFrontendPod1
+ ResourceGroupProductionFrontendDeployment -.- ResourceGroupProductionFrontendPod2
+ ResourceGroupProductionBackendService -.- ResourceGroupProductionBackendDeployment
+ ResourceGroupProductionBackendDeployment -.- ResourceGroupProductionBackendPod1
+ ResourceGroupProductionBackendDeployment -.- ResourceGroupProductionBackendPod2
+ ResourceGroupProductionPrometheusService -.- ResourceGroupProductionPrometheusDeployment
+ ResourceGroupProductionPrometheusDeployment -.- ResourceGroupProductionPrometheusPod1
+ ResourceGroupProductionPrometheusDeployment -.- ResourceGroupProductionPrometheusPod2
+ FrontendDeploymentPipelines -- "Trigger downstream pipeline" --> DeploymentProject
+ BackendDeploymentPipelines -- "Trigger downstream pipeline" --> DeploymentProject
+ DeploymentPipelines -- "Deploy" --> ResourceGroupProductionFrontend
+ DeploymentPipelines -- "Deploy" --> ResourceGroupProductionBackend
+```
+
+#### View all Agent-managed Resource Groups on production environment
+
+At the group level, we can accumulate all environments that match a specific tier, for example,
+listing all environments with the `production` tier from the group's projects.
+This is useful for viewing all of the Agent-managed Resource Groups in the production environment.
+The following diagram exemplifies the relationship between a GitLab group and Kubernetes resources:
+
+```mermaid
+flowchart LR
+ subgraph Kubernetes["Kubernetes"]
+ subgraph ResourceGroupProduction["ResourceGroup"]
+ direction LR
+ ResourceGroupProductionService(["Service"])
+ ResourceGroupProductionDeployment(["Deployment"])
+ ResourceGroupProductionPod1(["Pod1"])
+ ResourceGroupProductionPod2(["Pod2"])
+ end
+ subgraph ResourceGroupStaging["ResourceGroup"]
+ direction LR
+ ResourceGroupStagingService(["Service"])
+ ResourceGroupStagingDeployment(["Deployment"])
+ ResourceGroupStagingPod1(["Pod1"])
+ ResourceGroupStagingPod2(["Pod2"])
+ end
+ end
+
+ subgraph GitLab
+ subgraph Organization
+ OrganizationProduction["All resources on production"]
+ subgraph Frontend project
+ FrontendEnvironmentProduction["production environment"]
+ end
+ subgraph Backend project
+ BackendEnvironmentProduction["production environment"]
+ end
+ end
+ end
+
+ FrontendEnvironmentProduction --- ResourceGroupProduction
+ BackendEnvironmentProduction --- ResourceGroupStaging
+ ResourceGroupProductionService -.- ResourceGroupProductionDeployment
+ ResourceGroupProductionDeployment -.- ResourceGroupProductionPod1
+ ResourceGroupProductionDeployment -.- ResourceGroupProductionPod2
+ ResourceGroupStagingService -.- ResourceGroupStagingDeployment
+ ResourceGroupStagingDeployment -.- ResourceGroupStagingPod1
+ ResourceGroupStagingDeployment -.- ResourceGroupStagingPod2
+ OrganizationProduction --- FrontendEnvironmentProduction
+ OrganizationProduction --- BackendEnvironmentProduction
+```
+
+A few notes:
+
+- In the future, we could have more granular filters for resource search.
+  For example, if each project has two environments, `production/us-region` and `production/eu-region`,
+  we could show only the resources in the US region at the group level.
+  This could be achievable by query filtering in PostgreSQL or label/namespace filtering in Kubernetes.
+- See [Add dynamically populated organization-level environments page](https://gitlab.com/gitlab-org/gitlab/-/issues/241506) for more information.
+
+## Design and implementation details
+
+NOTE:
+The following solution might only be applicable to CI access mode. GitOps mode will see significant architectural changes _outside of_ this blueprint,
+such as [Flux switching](https://gitlab.com/gitlab-org/gitlab/-/issues/357947) and [Manifest projects outside of the Agent configuration project](https://gitlab.com/groups/gitlab-org/-/epics/7704). To derisk potential rework, we'll revisit GitOps mode after these upstream changes have settled.
+
+### Associate Environment with Agent
+
+As a preliminary step, we allow users to explicitly define "which deployment job" uses "which agent" and deploys to "which namespace". The following keywords are supported in `.gitlab-ci.yml`.
+
+- `environment:kubernetes:agent` ... Define which agent the deployment job uses. It can select the appropriate context from the `KUBE_CONFIG`.
+- `environment:kubernetes:namespace` ... Define which namespace the deployment job deploys to. It injects `KUBE_NAMESPACE` predefined variable into the job. This keyword already [exists](../../../ci/yaml/index.md#environmentkubernetes).
+
+Here is an example of `.gitlab-ci.yml`.
+
+```yaml
+deploy-production:
+ environment:
+ name: production
+ kubernetes:
+ agent: path/to/agent/repository:agent-name
+ namespace: default
+ script:
+    - helm --kube-context="$KUBE_CONTEXT" --namespace="$KUBE_NAMESPACE" upgrade --install
+```
+
+When a deployment job is created, GitLab persists the relationship between the specified agent, namespace, and deployment job. If the CI job is NOT authorized to access the agent (refer to [`Clusters::Agents::FilterAuthorizationsService`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/services/clusters/agents/filter_authorizations_service.rb) for more details), this relationship isn't recorded. This process happens in [`Deployments::CreateForBuildService`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/services/deployments/create_for_build_service.rb). The database table schema is:
+
+```plaintext
+agent_deployments:
+ - deployment_id (bigint/FK/NOT NULL/Unique)
+ - agent_id (bigint/FK/NOT NULL)
+ - kubernetes_namespace (character varying(255)/NOT NULL)
+```
+
+To identify the associated agent for a specific environment, use `environment.last_deployment.agent` in Rails.
+
+### Fetch resources through `user_access`
+
+When a user visits an environment page, the GitLab frontend fetches the environment via GraphQL. The frontend additionally fetches the associated agent ID and namespace through the deployment relationship, which is tracked by the `agent_deployments` table.
+
+Here is an example GraphQL query:
+
+```graphql
+{
+ project(fullPath: "group/project") {
+ id
+ environment(name: "<environment-name>") {
+ slug
+ lastDeployment(status: SUCCESS) {
+ agent {
+ id
+ kubernetesNamespace
+ }
+ }
+ }
+ }
+}
+```
+
+The GitLab frontend authenticates and authorizes the user's access with a [browser cookie](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/kubernetes_user_access.md#browser-cookie-on-gitlab-frontend). If the access is forbidden, the frontend shows an error message such as `You don't have access to an agent that deployed to this environment. Please contact the agent administrator to be allowed by "user_access" in the agent configuration file. See <troubleshooting-doc-link>`.
+
+After the user gains access to the agent, the GitLab frontend fetches the list of available API resources in Kubernetes and then fetches the resources with the following parameters:
+
+- `namespace` ... `#{environment.lastDeployment.agent.kubernetesNamespace}`
+- `labels`
+ - `app.gitlab.com/project_id=#{project.id}` _AND_
+  - `app.gitlab.com/environment_slug=#{environment.slug}`
+
+If no resources are found, it is likely that the users have not embedded these labels in their resources. In this case, the frontend shows a warning message `There are no resources found for the environment. Do the resources have the GitLab-preserved labels? See <troubleshooting-doc-link>`.
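+
+The label-based lookup above can be sketched in plain Python. This is a minimal,
+hypothetical illustration (the function name and sample resource shapes are not
+part of the actual frontend implementation; the resource dictionaries mirror
+Kubernetes object metadata):
+
+```python
+# Hypothetical sketch: filter fetched Kubernetes resources by namespace and
+# the GitLab-preserved labels, as the frontend would do.
+def match_environment(resources, namespace, project_id, environment_slug):
+    required = {
+        "app.gitlab.com/project_id": str(project_id),
+        "app.gitlab.com/environment_slug": environment_slug,
+    }
+    return [
+        r for r in resources
+        if r["metadata"].get("namespace") == namespace
+        and all(r["metadata"].get("labels", {}).get(k) == v
+                for k, v in required.items())
+    ]
+```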
+
+### Dependency graph
+
+- The GitLab frontend uses [Owner References](https://kubernetes.io/docs/concepts/overview/working-with-objects/owners-dependents/) to identify the dependencies between resources. These are embedded in resources as the `metadata.ownerReferences` field.
+- For the resources that don't have owner references, we can use [Well-Known Labels, Annotations and Taints](https://kubernetes.io/docs/reference/labels-annotations-taints/) as a complement. For example, `EndpointSlice` doesn't have `metadata.ownerReferences`, but has `kubernetes.io/service-name` as a reference to the parent `Service` resource.
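+
+The two rules above can be sketched as a small Python helper that derives
+parent-to-child edges (a hypothetical sketch, not actual frontend code):
+
+```python
+# Hypothetical sketch: derive (parent, child) edges from ownerReferences,
+# falling back to the well-known "kubernetes.io/service-name" label for
+# resources (such as EndpointSlice) that carry no ownerReferences.
+def dependency_edges(resources):
+    edges = []
+    for r in resources:
+        meta = r["metadata"]
+        owners = [o["name"] for o in meta.get("ownerReferences", [])]
+        if not owners:
+            svc = meta.get("labels", {}).get("kubernetes.io/service-name")
+            owners = [svc] if svc else []
+        edges.extend((owner, meta["name"]) for owner in owners)
+    return edges
+```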
+
+### Health status of resources
+
+- The GitLab frontend computes a status summary from the fetched resources, similar to Argo CD's [Resource Health](https://argo-cd.readthedocs.io/en/stable/operator-manual/health/), for example `Healthy`, `Progressing`, `Degraded`, and `Suspended`. The formula is TBD.
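+
+As an illustration only, a health summary for a `Deployment` could be computed
+from its replica counts. This is a hypothetical formula, since the real one is TBD:
+
+```python
+# Hypothetical sketch of a Deployment health summary, loosely modeled on
+# Argo CD's resource health statuses. `status` mirrors Deployment.status.
+def deployment_health(desired_replicas, status):
+    ready = status.get("readyReplicas", 0)
+    updated = status.get("updatedReplicas", 0)
+    if ready == desired_replicas and updated == desired_replicas:
+        return "Healthy"
+    if status.get("unavailableReplicas", 0) > 0 and updated == 0:
+        return "Degraded"
+    return "Progressing"
+```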
diff --git a/doc/architecture/blueprints/object_storage/index.md b/doc/architecture/blueprints/object_storage/index.md
index 950a5f13c38..4a8eeaf86a9 100644
--- a/doc/architecture/blueprints/object_storage/index.md
+++ b/doc/architecture/blueprints/object_storage/index.md
@@ -1,5 +1,5 @@
---
-status: ready
+status: accepted
creation-date: "2021-11-18"
authors: [ "@nolith" ]
coach: "@glopezfernandez"
diff --git a/doc/architecture/blueprints/rate_limiting/index.md b/doc/architecture/blueprints/rate_limiting/index.md
index 26114fce7f2..b466a54e922 100644
--- a/doc/architecture/blueprints/rate_limiting/index.md
+++ b/doc/architecture/blueprints/rate_limiting/index.md
@@ -1,5 +1,5 @@
---
-status: ready
+status: accepted
creation-date: "2022-09-08"
authors: [ "@grzesiek", "@marshall007", "@fabiopitino", "@hswimelar" ]
coach: "@andrewn"
diff --git a/doc/architecture/blueprints/remote_development/index.md b/doc/architecture/blueprints/remote_development/index.md
index 38c3782f5e3..162ae04f6b6 100644
--- a/doc/architecture/blueprints/remote_development/index.md
+++ b/doc/architecture/blueprints/remote_development/index.md
@@ -76,7 +76,7 @@ The current production [Web IDE](../../../user/project/web_ide/index.md).
An advanced editor with commit staging that currently supports:
-- [Live Preview](../../../user/project/web_ide/index.md#live-preview)
+- [Live Preview](../../../user/project/web_ide/index.md#live-preview-removed)
- [Interactive Web Terminals](../../../user/project/web_ide/index.md#interactive-web-terminals-for-the-web-ide)
### Web IDE
diff --git a/doc/architecture/blueprints/runner_tokens/index.md b/doc/architecture/blueprints/runner_tokens/index.md
index 7d21dd594ad..69a10674d7d 100644
--- a/doc/architecture/blueprints/runner_tokens/index.md
+++ b/doc/architecture/blueprints/runner_tokens/index.md
@@ -1,5 +1,5 @@
---
-status: ready
+status: ongoing
creation-date: "2022-10-27"
authors: [ "@pedropombeiro", "@tmaczukin" ]
coach: "@ayufan"
@@ -46,11 +46,38 @@ runner in supported environments using the existing `gitlab-runner register` com
The remaining concerns become non-issues due to the elimination of the registration token.
+### Comparison of current and new runner registration flow
+
+```mermaid
+graph TD
+ subgraph new[<b>New registration flow</b>]
+ A[<b>GitLab</b>: User creates a runner in GitLab UI and adds the runner configuration] -->|<b>GitLab</b>: creates ci_runners record and returns<br/>new 'glrt-' prefixed authentication token| B
+ B(<b>Runner</b>: User runs 'gitlab-runner register' command with</br>authentication token to register new runner machine with<br/>the GitLab instance) --> C{<b>Runner</b>: Does a .runner_system_id file exist in<br/>the gitlab-runner configuration directory?}
+ C -->|Yes| D[<b>Runner</b>: Reads existing system ID] --> F
+ C -->|No| E[<b>Runner</b>: Generates and persists unique system ID] --> F
+ F[<b>Runner</b>: Issues 'POST /runner/verify' request<br/>to verify authentication token validity] --> G{<b>GitLab</b>: Is the authentication token valid?}
+ G -->|Yes| H[<b>GitLab</b>: Creates ci_runner_machine database record if missing] --> J[<b>Runner</b>: Store authentication token in .config.toml]
+ G -->|No| I(<b>GitLab</b>: Returns '403 Forbidden' error) --> K(gitlab-runner register command fails)
+ J --> Z(Runner and runner machine are ready for use)
+ end
+
+ subgraph current[<b>Current registration flow</b>]
+ A'[<b>GitLab</b>: User retrieves runner registration token in GitLab UI] --> B'
+ B'[<b>Runner</b>: User runs 'gitlab-runner register' command<br/>with registration token to register new runner] -->|<b>Runner</b>: Issues 'POST /runner request' to create<br/>new runner and obtain authentication token| C'{<b>GitLab</b>: Is the registration token valid?}
+ C' -->|Yes| D'[<b>GitLab</b>: Create ci_runners database record] --> F'
+ C' -->|No| E'(<b>GitLab</b>: Return '403 Forbidden' error) --> K'(gitlab-runner register command fails)
+ F'[<b>Runner</b>: Store authentication token<br/>from response in .config.toml] --> Z'(Runner is ready for use)
+ end
+
+ style new fill:#f2ffe6
+```
+
### Using the authentication token in place of the registration token
<!-- vale gitlab.Spelling = NO -->
-In this proposal, runners created in the GitLab UI are assigned authentication tokens prefixed with
-`glrt-` (**G**it**L**ab **R**unner **T**oken).
+In this proposal, runners created in the GitLab UI are assigned
+[authentication tokens](../../../security/token_overview.md#runner-authentication-tokens-also-called-runner-tokens)
+prefixed with `glrt-` (**G**it**L**ab **R**unner **T**oken).
<!-- vale gitlab.Spelling = YES -->
The prefix allows the existing `register` command to use the authentication token _in lieu_
of the current registration token (`--registration-token`), requiring minimal adjustments in
@@ -68,8 +95,8 @@ token in the `--registration-token` argument:
| Token type | Behavior |
| ---------- | -------- |
-| Registration token | Leverages the `POST /api/v4/runners` REST endpoint to create a new runner, creating a new entry in `config.toml`. |
-| Authentication token | Leverages the `POST /api/v4/runners/verify` REST endpoint to ensure the validity of the authentication token. Creates an entry in `config.toml` file and a `system_id` value in a sidecar file if missing (`.runner_system_id`). |
+| [Registration token](../../../security/token_overview.md#runner-authentication-tokens-also-called-runner-tokens) | Leverages the `POST /api/v4/runners` REST endpoint to create a new runner, creating a new entry in `config.toml`. |
+| [Authentication token](../../../security/token_overview.md#runner-authentication-tokens-also-called-runner-tokens) | Leverages the `POST /api/v4/runners/verify` REST endpoint to ensure the validity of the authentication token. Creates an entry in `config.toml` file and a `system_id` value in a sidecar file if missing (`.runner_system_id`). |
### Transition period
@@ -82,17 +109,18 @@ This approach reduces disruption to users responsible for deploying runners.
### Reusing the runner authentication token across many machines
-In the existing model, a new runner is created whenever a new worker is required. This
-has led to many situations where runners are left behind and become stale.
+In the existing autoscaling model, a new runner is created whenever a new job needs to be executed.
+This has led to many situations where runners are left behind and become stale.
In the proposed model, a `ci_runners` table entry describes a configuration that the user can reuse
-across multiple machines.
+across multiple machines, and runner state from each machine (for example, IP address, platform,
+or architecture) is moved to a separate table (`ci_runner_machines`).
A unique system identifier is [generated automatically](#generating-a-system_id-value) whenever the
runner application starts up or the configuration is saved.
-This allows differentiating the context in which the runner is being used.
+This allows differentiating the machine in which the runner is being used.
-The `system_id` value complements the short runner token that is currently used to identify a
-runner in command line output, CI job logs, and GitLab UI.
+The `system_id` value complements the short runner token that is used to identify a runner in
+command line output, CI job logs, and GitLab UI.
Given that the creation of runners involves user interaction, it should be possible
to eventually lower the per-plan limit of CI runners that can be registered per scope.
@@ -166,7 +194,7 @@ CREATE TABLE ci_builds_metadata (
CREATE TABLE ci_runner_machines (
id bigint NOT NULL,
- machine_xid character varying UNIQUE NOT NULL,
+ system_xid character varying UNIQUE NOT NULL,
contacted_at timestamp without time zone,
version character varying,
revision character varying,
@@ -241,7 +269,7 @@ future after the legacy registration system is removed, and runners have been up
versions.
Job pings from such legacy runners results in a `ci_runner_machines` record containing a
-`<legacy>` `machine_xid` field value.
+`<legacy>` `system_xid` field value.
Not using the unique system ID means that all connected runners with the same token are
notified, instead of just the runner matching the exact system identifier. While not ideal, this is
@@ -295,8 +323,13 @@ enum column created in the `ci_runners` table.
### Runner creation through API
-Automated runner creation may be allowed, although always through authenticated API calls -
-using PAT tokens for example - such that every runner is associated with an owner.
+Automated runner creation is possible through a new GraphQL mutation and the existing
+[`POST /runners` REST API endpoint](../../../api/runners.md#register-a-new-runner).
+The difference in the REST API endpoint is that it is modified to accept a request from an
+authorized user with a scope (instance, a group, or a project) instead of the registration token.
+These endpoints are only available to users that are
+[allowed](../../../user/permissions.md#gitlab-cicd-permissions) to create runners at the specified
+scope.
## Implementation plan
@@ -315,10 +348,17 @@ using PAT tokens for example - such that every runner is associated with an owne
| Component | Milestone | Changes |
|------------------|----------:|---------|
| GitLab Runner | `15.7` | Ensure a sidecar TOML file exists with a `system_id` value.<br/>Log new system ID values with `INFO` level as they get assigned. |
-| GitLab Runner | `15.7` | Log unique system ID in the build logs. |
+| GitLab Runner | `15.9` | Log unique system ID in the build logs. |
| GitLab Runner | `15.9` | Label Prometheus metrics with unique system ID. |
| GitLab Runner | `15.8` | Prepare `register` command to fail if runner server-side configuration options are passed together with a new `glrt-` token. |
+### Stage 2a - Prepare GitLab Runner Helm Chart and GitLab Runner Operator
+
+| Component | Milestone | Changes |
+|------------------|----------:|---------|
+| GitLab Runner Helm Chart | `%15.10` | Update the Runner Helm Chart to support registration with the authentication token. |
+| GitLab Runner Operator | `%15.10` | Update the Runner Operator to support registration with the authentication token. |
+
### Stage 3 - Database changes
| Component | Milestone | Changes |
@@ -327,33 +367,51 @@ using PAT tokens for example - such that every runner is associated with an owne
| GitLab Rails app | `%15.8` | Create database migration to add `ci_runner_machines` table. |
| GitLab Rails app | `%15.9` | Create database migration to add `ci_runner_machines.id` foreign key to `ci_builds_metadata` table. |
| GitLab Rails app | `%15.8` | Create database migrations to add `allow_runner_registration_token` setting to `application_settings` and `namespace_settings` tables (default: `true`). |
-| GitLab Rails app | `%15.8` | Create database migration to add config column to `ci_runner_machines` table. |
-| GitLab Runner | | Start sending `system_id` value in `POST /jobs/request` request and other follow-up requests that require identifying the unique system. |
-| GitLab Rails app | | Create service similar to `StaleGroupRunnersPruneCronWorker` service to clean up `ci_runner_machines` records instead of `ci_runners` records.<br/>Existing service continues to exist but focuses only on legacy runners. |
-| GitLab Rails app | | Create `ci_runner_machines` record in `POST /runners/verify` request if the runner token is prefixed with `glrt-`. |
-| GitLab Rails app | | Use runner token + `system_id` JSON parameters in `POST /jobs/request` request in the [heartbeat request](https://gitlab.com/gitlab-org/gitlab/blob/c73c96a8ffd515295842d72a3635a8ae873d688c/lib/api/ci/helpers/runner.rb#L14-20) to update the `ci_runner_machines` cache/table. |
-
-### Stage 4 - New UI
+| GitLab Rails app | `%15.8` | Create database migration to add `config` column to `ci_runner_machines` table. |
+| GitLab Runner | `%15.9` | Start sending `system_id` value in `POST /jobs/request` request and other follow-up requests that require identifying the unique system. |
+| GitLab Rails app | `%15.9` | Create service similar to `StaleGroupRunnersPruneCronWorker` service to clean up `ci_runner_machines` records instead of `ci_runners` records.<br/>Existing service continues to exist but focuses only on legacy runners. |
+| GitLab Rails app | `%15.9` | [Feature flag] Rollout of `create_runner_machine`. |
+| GitLab Rails app | `%15.9` | Create `ci_runner_machines` record in `POST /runners/verify` request if the runner token is prefixed with `glrt-`. |
+| GitLab Rails app | `%15.9` | Use runner token + `system_id` JSON parameters in `POST /jobs/request` request in the [heartbeat request](https://gitlab.com/gitlab-org/gitlab/blob/c73c96a8ffd515295842d72a3635a8ae873d688c/lib/api/ci/helpers/runner.rb#L14-20) to update the `ci_runner_machines` cache/table. |
+| GitLab Rails app | `%15.9` | [Feature flag] Enable runner creation workflow (`create_runner_workflow`). |
+| GitLab Rails app | `%15.9` | Implement `create_{instance|group|project}_runner` permissions. |
+| GitLab Rails app | `%15.9` | Rename `ci_runner_machines.machine_xid` column to `system_xid` to be consistent with `system_id` passed in APIs. |
+| GitLab Rails app | `%15.10` | Drop `ci_runner_machines.machine_xid` column. |
+| GitLab Rails app | `%15.11` | Remove the ignore rule for `ci_runner_machines.machine_xid` column. |
+
+### Stage 4 - Create runners from the UI
| Component | Milestone | Changes |
|------------------|----------:|---------|
-| GitLab Rails app | `%15.10` | Implement new GraphQL user-authenticated API to create a new runner. |
-| GitLab Rails app | `%15.10` | [Add prefix to newly generated runner authentication tokens](https://gitlab.com/gitlab-org/gitlab/-/issues/383198). |
+| GitLab Rails app | `%15.9` | Implement new GraphQL user-authenticated API to create a new runner. |
+| GitLab Rails app | `%15.9` | [Add prefix to newly generated runner authentication tokens](https://gitlab.com/gitlab-org/gitlab/-/issues/383198). |
+| GitLab Rails app | `%15.10` | Return token and runner ID information from `/runners/verify` REST endpoint. |
+| GitLab Runner | `%15.10` | [Modify register command to allow new flow with glrt- prefixed authentication tokens](https://gitlab.com/gitlab-org/gitlab-runner/-/issues/29613). |
| GitLab Rails app | `%15.10` | Implement UI to create new runner. |
| GitLab Rails app | `%15.10` | GraphQL changes to `CiRunner` type. |
| GitLab Rails app | `%15.10` | UI changes to runner details view (listing of platform, architecture, IP address, etc.) (?) |
+| GitLab Rails app | `%15.11` | Adapt `POST /api/v4/runners` REST endpoint to accept a request from an authorized user with a scope instead of a registration token. |
### Stage 5 - Optional disabling of registration token
| Component | Milestone | Changes |
|------------------|----------:|---------|
-| GitLab Rails app | | Add UI to allow disabling use of registration tokens at project or group level. |
-| GitLab Rails app | `16.0` | Introduce `:disable_runner_registration_tokens` feature flag (enabled by default) to control whether use of registration tokens is allowed. |
-| GitLab Rails app | | Make [`POST /api/v4/runners` endpoint](../../../api/runners.md#register-a-new-runner) permanently return `HTTP 410 Gone` if either `allow_runner_registration_token` setting or `:disable_runner_registration_tokens` feature flag disables registration tokens.<br/>A future v5 version of the API should return `HTTP 404 Not Found`. |
-| GitLab Rails app | | Start refusing job requests that don't include a unique ID, if either `allow_runner_registration_token` setting or `:disable_runner_registration_tokens` feature flag disables registration tokens. |
-| GitLab Rails app | | Hide legacy UI showing registration with a registration token, if `:disable_runner_registration_tokens` feature flag disables registration tokens. |
+| GitLab Rails app | `%15.11` | Adapt `register_{group|project}_runner` permissions to take [application setting](https://gitlab.com/gitlab-org/gitlab/-/issues/386712) in consideration. |
+| GitLab Rails app | `%15.11` | Add UI to allow disabling use of registration tokens at project or group level. |
+| GitLab Rails app | `%15.11` | Introduce `:enforce_create_runner_workflow` feature flag (disabled by default) to control whether use of registration tokens is allowed. |
+| GitLab Rails app | `%15.11` | Make [`POST /api/v4/runners` endpoint](../../../api/runners.md#register-a-new-runner) permanently return `HTTP 410 Gone` if either `allow_runner_registration_token` setting or `:enforce_create_runner_workflow` feature flag disables registration tokens.<br/>A future v5 version of the API should return `HTTP 404 Not Found`. |
+| GitLab Rails app | `%15.11` | Start refusing job requests that don't include a unique ID, if either `allow_runner_registration_token` setting or `:enforce_create_runner_workflow` feature flag disables registration tokens. |
+| GitLab Rails app | `%15.11` | Hide legacy UI showing registration with a registration token, if `:enforce_create_runner_workflow` feature flag disables registration tokens. |
+
+### Stage 6 - Enforcement
+
+| Component | Milestone | Changes |
+|------------------|----------:|---------|
+| GitLab Runner | `%16.0` | Do not allow the runner to start if the `.runner_system_id` file cannot be written. |
+| GitLab Rails app | `%16.6` | Enable `:enforce_create_runner_workflow` feature flag by default. |
+| GitLab Rails app | `%16.6` | Start rejecting job requests that don't include a `system_id` value. |
-### Stage 6 - Removals
+### Stage 7 - Removals
| Component | Milestone | Changes |
|------------------|----------:|---------|
@@ -361,7 +419,93 @@ using PAT tokens for example - such that every runner is associated with an owne
| GitLab Runner | `17.0` | Remove runner model arguments from `register` command (for example `--run-untagged`, `--tag-list`, etc.) |
| GitLab Rails app | `17.0` | Create database migrations to drop `allow_runner_registration_token` setting columns from `application_settings` and `namespace_settings` tables. |
| GitLab Rails app | `17.0` | Create database migrations to drop:<br/>- `runners_registration_token`/`runners_registration_token_encrypted` columns from `application_settings`;<br/>- `runners_token`/`runners_token_encrypted` from `namespaces` table;<br/>- `runners_token`/`runners_token_encrypted` from `projects` table. |
-| GitLab Rails app | `17.0` | Remove `:disable_runner_registration_tokens` feature flag. |
+| GitLab Rails app | `17.0` | Remove `:enforce_create_runner_workflow` feature flag. |
+
+## FAQ
+
+### Will my runner registration workflow break?
+
+If no action is taken before your GitLab instance is upgraded to 16.6, then your runner registration
+workflow will break.
+For self-managed instances, to continue using the previous runner registration process,
+you can disable the `enforce_create_runner_workflow` feature flag until GitLab 17.0.
+
+To avoid a broken workflow, you need to first create a runner in the GitLab runners admin page.
+After that, you'll need to replace the registration token you're using in your runner registration
+workflow with the obtained runner authentication token.
+
+### What is the new runner registration process?
+
+When the new runner registration process is introduced, you will:
+
+1. Create a runner directly in the GitLab UI.
+1. Receive an authentication token in return.
+1. Use the authentication token instead of the registration token.
+
+This has added benefits, such as preserved ownership records for runners, and minimizes the
+impact on users.
+The addition of a unique system ID ensures that you can reuse the same authentication token across
+multiple runners, for example, in an auto-scaling scenario where a runner manager spawns runner
+processes with a fixed authentication token.
+This ID is generated once at the runner's startup, persisted in a sidecar file, and sent to the
+GitLab instance when requesting jobs.
+This allows the GitLab instance to display which system executed a given job.
+
+### What is the estimated timeframe for the planned changes?
+
+- In GitLab 15.10, we plan to implement runner creation directly in the runners administration page,
+ and prepare the runner to follow the new workflow.
+- In GitLab 16.6, we plan to disable registration tokens.
+ For self-managed instances, to continue using
+ registration tokens, you can disable the `enforce_create_runner_workflow` feature flag until
+ GitLab 17.0.
+
+  Previous `gitlab-runner` versions (that don't include the new `system_id` value) will start to be
+  rejected by the GitLab instance.
+- In GitLab 17.0, we plan to completely remove support for runner registration tokens.
+
+### How will the `gitlab-runner register` command syntax change?
+
+The `gitlab-runner register` command will stop accepting registration tokens and instead accept new
+authentication tokens generated in the GitLab runners administration page.
+These authentication tokens are recognizable by their `glrt-` prefix.
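As a rough sketch, a script can distinguish the two token formats by prefix. The `glrt-` prefix is documented above; treating all other values as legacy registration tokens is an assumption for illustration:

```shell
# Hedged sketch: classify a token by its prefix. Only the glrt- prefix is
# documented; the legacy fallback branch is an illustrative assumption.
classify_token() {
  case "$1" in
    glrt-*) echo "authentication token (new workflow)" ;;
    *)      echo "legacy registration token" ;;
  esac
}

classify_token "glrt-2CR8_eVxiioB1QmzPZwa"     # new-style authentication token
classify_token "GR1348941C6YcZVddc8kjtdU-yWYD" # legacy registration token
```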
+
+Example command for GitLab 15.9:
+
+```shell
+gitlab-runner register \
+ --executor "shell" \
+ --url "https://gitlab.com/" \
+ --tag-list "shell,mac,gdk,test" \
+ --run-untagged="false" \
+ --locked="false" \
+ --access-level="not_protected" \
+ --non-interactive \
+ --registration-token="GR1348941C6YcZVddc8kjtdU-yWYD"
+```
+
+In GitLab 16.0, the runner will be created in the UI, where some of its attributes can be
+pre-configured by the creator, such as the tag list, locked status, and access level. These are
+no longer accepted as arguments to `register`. The following example shows the new command:
+
+```shell
+gitlab-runner register \
+ --executor "shell" \
+ --url "https://gitlab.com/" \
+ --non-interactive \
+  --registration-token="glrt-2CR8_eVxiioB1QmzPZwa"
+```
+
+### How does this change impact auto-scaling scenarios?
+
+In auto-scaling scenarios such as GitLab Runner Operator or GitLab Runner Helm Chart, the
+registration token is replaced with the authentication token generated from the UI.
+This means that the same runner configuration is reused across jobs, instead of creating a runner
+for each job.
+The specific runner can be identified by the unique system ID that is generated when the runner
+process is started.
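For example, a Helm-based installation might pass the UI-generated authentication token instead of a registration token. This is a hedged sketch: the `runnerToken` value name should be checked against the values reference of your chart version.

```shell
# Hedged sketch: installing GitLab Runner via Helm with a UI-created runner's
# authentication token. The runnerToken value name is an assumption here;
# verify it against your chart version before use.
token="glrt-2CR8_eVxiioB1QmzPZwa"   # authentication token from the runners UI
args="--set gitlabUrl=https://gitlab.com/ --set runnerToken=${token}"
echo "helm upgrade --install gitlab-runner gitlab/gitlab-runner ${args}"
# Run the echoed command once Helm and the chart repository are configured.
```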
## Status
diff --git a/doc/architecture/blueprints/search/code_search_with_zoekt.md b/doc/architecture/blueprints/search/code_search_with_zoekt.md
new file mode 100644
index 00000000000..d0d347f1ff4
--- /dev/null
+++ b/doc/architecture/blueprints/search/code_search_with_zoekt.md
@@ -0,0 +1,305 @@
+---
+status: ongoing
+creation-date: "2022-12-28"
+authors: [ "@dgruzd", "@DylanGriffith" ]
+coach: "@DylanGriffith"
+approvers: [ "@joshlambert", "@changzhengliu" ]
+owning-stage: "~devops::enablement"
+participating-stages: []
+---
+
+# Use Zoekt for code search
+
+## Summary
+
+We will be implementing an additional code search functionality in GitLab that
+is backed by [Zoekt](https://github.com/sourcegraph/zoekt), an open source
+search engine that is specifically designed for code search. Zoekt will be used as
+an API by GitLab and remain an implementation detail while the user interface
+in GitLab will not change much except for some new features made available by
+Zoekt.
+
+This will be rolled out in phases to ensure that the system will actually meet
+our scaling and cost expectations and will run alongside code search backed by
+Elasticsearch until we can be sure it is a viable replacement. The first step
+will be making it available to `gitlab-org` for internal use, then expanding it
+customer by customer based on customer interest.
+
+## Motivation
+
+GitLab code search functionality today is backed by Elasticsearch.
+Elasticsearch has proven useful for other types of search (issues, merge
+requests, comments and so-on) but is by design not a good choice for code
+search, where users expect matches to be precise (that is, no false positives) and
+flexible (for example, support
+[substring matching](https://gitlab.com/gitlab-org/gitlab/-/issues/325234)
+and
+[regexes](https://gitlab.com/gitlab-org/gitlab/-/issues/4175)). We have
+[investigated our options](https://gitlab.com/groups/gitlab-org/-/epics/7404)
+and [Zoekt](https://github.com/sourcegraph/zoekt) is pretty much the only well
+maintained open source technology that is suited to code search. Based on our
+research we believe it will be better to adopt a well maintained open source
+database than attempt to build our own. This is mostly due to the fact that our
+research indicates that the fundamental architecture of Zoekt is what we would
+implement again if we tried to implement something ourselves.
+
+Our
+[early benchmarking](https://gitlab.com/gitlab-org/gitlab/-/issues/370832#note_1183611955)
+suggests that Zoekt will be viable at our scale, but we feel strongly
+that investing in building a beta integration with Zoekt and rolling it out
+group by group on GitLab.com will provide better insights into scalability and
+cost than more accurate benchmarking efforts. It will also be relatively low
+risk as it will be rolled out internally first and later rolled out to
+customers that wish to participate in the trial.
+
+### Goals
+
+The main goals of this integration will be to implement the following highly
+requested improvements to code search:
+
+1. [Exact match (substring match) code searches in Advanced Search](https://gitlab.com/gitlab-org/gitlab/-/issues/325234)
+1. [Support regular expressions with Advanced Global Search](https://gitlab.com/gitlab-org/gitlab/-/issues/4175)
+1. [Support multiple line matches in the same file](https://gitlab.com/gitlab-org/gitlab/-/issues/668)
+
+The initial phases of the rollout will be designed to catch and resolve scaling
+or infrastructure cost issues as early as possible so that we can pivot early
+before investing too much in this technology if it is not suitable.
+
+### Non-Goals
+
+The following are not goals initially but could theoretically be built upon
+this solution:
+
+1. Improving security scanning features by having access to quickly perform
+ regex scans across many repositories
+1. Saving money on our search infrastructure - this may be possible with
+ further optimizations, but initial estimates suggest the cost is similar
+1. AI/ML features of search used to predict what users might be interested in
+ finding
+1. Code Intelligence and Navigation - likely code intelligence and navigation
+ features should be built on structured data rather than a trigram index but
+ regex based searches (using Zoekt) may be a suitable fallback for code which
+ does not have structured metadata enabled or dynamic languages where static
+ analysis is not very accurate. Zoekt in particular may not be well suited
+ initially, despite existing symbol extraction using ctags, because ctags
+ symbols may not contain enough data for accurate navigation and Zoekt
+   doesn't understand dependencies, which would be necessary for cross-project
+ navigation.
+
+## Proposal
+
+An
+[initial implementation of a Zoekt integration](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/105049)
+was created to demonstrate the feasibility of using Zoekt as a drop-in
+replacement for Elasticsearch code searches. This blueprint will expand on all
+the details needed to provide a minimum viable change, as well as the steps needed to
+scale this to a larger customer rollout on GitLab.com.
+
+## Design and implementation details
+
+### User Experience
+
+When a user performs an advanced search on a group or project that is part
+of the Zoekt rollout we will present a toggle somewhere in the UI to change
+to "precise search" (or some other UX TBD) which switches them from
+Elasticsearch to Zoekt. Early user feedback will help us assess the best way
+to present these choices to users and ultimately we will want to remove the
+Elasticsearch option if we find Zoekt is a suitable long term option.
+
+### Indexing
+
+Similar to our Elasticsearch integration, GitLab will notify Zoekt every time
+there are updates to a repository. Zoekt, unlike Elasticsearch, is designed to
+clone and index Git repositories so we will simply notify Zoekt of the URL of
+the repository that has changed and it will update its local copy of the Git
+repo and then update its local index files. The Zoekt side of this logic will
+be implemented in a new server-side indexing endpoint we add to Zoekt which is
+currently in
+[an open Pull request](https://github.com/sourcegraph/zoekt/pull/496).
+While the details of
+this pull request are still being debated, we may choose to deploy a fork with
+the functionality we need, but our strongest intention is not to maintain a
+fork of Zoekt and the maintainers have already expressed they are open to this
+new functionality.
+
+The rails side of the integration will be a Sidekiq worker that is scheduled
+every time there is an update to a repository and it will simply call this
+`/index` endpoint in Zoekt. This will also need to generate a one-time token
+that can allow Zoekt to clone a private repository.
+
+```mermaid
+sequenceDiagram
+ participant user as User
+ participant gitlab_git as GitLab Git
+ participant gitlab_sidekiq as GitLab Sidekiq
+ participant zoekt as Zoekt
+ user->>gitlab_git: git push git@gitlab.com:gitlab-org/gitlab.git
+ gitlab_git->>gitlab_sidekiq: ZoektIndexerWorker.perform_async(278964)
+ gitlab_sidekiq->>zoekt: POST /index {"RepoUrl":"https://zoekt:SECRET_TOKEN@gitlab.com/gitlab-org/gitlab.git","RepoId":278964}'
+ zoekt->>gitlab_git: git clone https://zoekt:SECRET_TOKEN@gitlab.com/gitlab-org/gitlab.git
+```
+
+The Sidekiq worker can leverage de-duplication based on the `project_id`.
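The `POST /index` call from the sequence diagram above could be reproduced manually as follows. The payload shape mirrors the diagram; the indexer host, port, and token are placeholders:

```shell
# Hedged sketch of the indexing request shown in the sequence diagram.
# The endpoint comes from the proposed Zoekt /index pull request; the host,
# port, and SECRET_TOKEN are placeholders, not real values.
project_id=278964
repo_url="https://zoekt:SECRET_TOKEN@gitlab.com/gitlab-org/gitlab.git"
payload=$(printf '{"RepoUrl":"%s","RepoId":%d}' "$repo_url" "$project_id")
echo "$payload"
# curl -XPOST -H 'Content-Type: application/json' \
#   -d "$payload" http://zoekt-indexserver:6060/index
```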
+
+Zoekt supports indexing multiple projects. We'll likely need to eventually
+allow a way for users to configure additional branches (beyond the default
+branch), and this will need to be sent to Zoekt. We will need to decide if these
+branch lists are sent every time we index the project or only when the
+configuration changes.
+
+There may be race conditions with multiple Zoekt processes indexing the same
+repo at the same time. For this reason we should implement a locking mechanism
+somewhere to ensure we are only indexing 1 project in 1 place at a time. We
+could make use of the same Redis locking we use for indexing projects in
+Elasticsearch.
+
+### Searching
+
+Searching will be implemented using the `/api/search` functionality in
+Zoekt. There is also
+[an open PR to fix this endpoint in Zoekt](https://github.com/sourcegraph/zoekt/pull/506),
+and again we may consider working from a fork until this is fixed. GitLab will
+prepend all searches with the appropriate filter for repositories based on the
+user's search context (group or project) in the same way we do for
+Elasticsearch. For Zoekt this will be implemented as a query string regex that
+matches all the searched repositories.
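A search request might then look like the following sketch. The `/api/search` path is named above, but the JSON field name (`Q`), the `repo:` filter syntax, and the host and port are assumptions for illustration:

```shell
# Hedged sketch: prepend a repository filter based on the user's search
# context (here, a group), then query Zoekt's search endpoint. Field names
# and filter syntax are illustrative assumptions.
group="gitlab-org"
query="repo:^${group}/ def perform"   # group filter + the user's search terms
payload=$(printf '{"Q":"%s"}' "$query")
echo "$payload"
# curl -XPOST -H 'Content-Type: application/json' \
#   -d "$payload" http://zoekt-webserver:6070/api/search
```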
+
+### Zoekt infrastructure
+
+Each Zoekt node will need to run a
+[zoekt-dynamic-indexserver](https://github.com/sourcegraph/zoekt/pull/496) and
+a
+[zoekt-webserver](https://github.com/sourcegraph/zoekt/blob/main/cmd/zoekt-webserver/main.go).
+These are both webservers with different responsibilities. Considering that the
+Zoekt indexing process needs to keep a full clone of the bare repo
+([unless we come up with a better option](https://gitlab.com/gitlab-org/gitlab/-/issues/384722))
+these bare repos will be stored on spinning disks to reduce storage costs. These are only
+used as an intermediate step to generate the actual `.zoekt` index files which
+will be stored on an SSD for fast searches. These web servers need to run on
+the same node as they access the same files. The `zoekt-dynamic-indexserver` is
+responsible for writing the `.zoekt` index files. The `zoekt-webserver` is
+responsible for responding to searches that it performs by reading these
+`.zoekt` index files.
+
+### Rollout strategy
+
+Initially Zoekt code search will only be available to `gitlab-org`. After that
+we'll start rolling it out to specific customers that have requested a better
+code search experience. As we learn about scaling and make improvements, we will
+gradually roll it out to all licensed groups on GitLab.com. We will use a
+similar approach to Elasticsearch for keeping track of which groups are indexed
+and which are not. This will be based on a new table `zoekt_indexed_namespaces`
+with a `namespace_id` reference. We will only allow rolling out to top level
+namespaces to simplify the logic of checking for all layers of group
+inheritance. Once we've rolled out to all licensed groups we'll enable logic to
+automatically enroll newly licensed groups. This table also may be a place to
+store per-namespace sharding and replication data as described below.
+
+### Sharding and replication strategy
+
+Zoekt does not have any inbuilt sharding, and we expect that we'll need
+multiple Zoekt servers to reach the scale needed to provide search functionality to
+all of GitLab's licensed customers.
+
+There are 2 clear ways to implement sharding:
+
+1. Build it on top of, or in front of Zoekt, as an independent component. Building
+ all the complexities of a distributed database into Zoekt is not likely to
+ be a good direction for the project so most likely this would be an
+ independent piece of infrastructure that proxied requests to the correct
+ shard.
+1. Manage the shards inside GitLab. This would be an application layer in
+ GitLab which chooses the correct shard to send indexing and search requests
+ to.
+
+Likewise, there are a few ways to implement replication:
+
+1. Server-side where Zoekt replicas are aware of other Zoekt replicas and they
+ stream updates from some primary to remain in sync
+1. Client-side replication where clients send indexing requests to all replicas
+ and search requests to any replica
+
+We plan to implement sharding inside the GitLab application, but replication may be
+best served at the level of the filesystem of Zoekt servers rather than sending
+duplicated updates from GitLab to all replicas. This could be some process on
+Zoekt servers that monitors for changes to the `.zoekt` files in a specific
+directory and syncs those updates to the replicas. This will need to be
+slightly more sophisticated than `rsync` because the files are constantly
+changing and may be deleted while the sync is happening, so we would want to
+sync the updates in batches somehow without slowing down indexing.
+
+Implementing sharding in GitLab simplifies the additional infrastructure
+components that need to be deployed and allows more flexibility to control our
+rollout to many customers alongside our rollout of multiple shards.
+
+Implementing syncing from primary -> replica on Zoekt nodes at the filesystem
+level optimizes overall resource usage. We only need to sync the index
+files to replicas as the bare repo is just a cache. This saves on:
+
+1. Disk space on replicas
+1. CPU usage on replicas as it does not need to rebuild the index
+1. Load on Gitaly to clone the repos
+
+We plan to defer the implementation of these high availability aspects until
+later, but a preliminary plan would be:
+
+1. GitLab is configured with a pool of Zoekt servers
+1. GitLab randomly assigns each group a Zoekt primary server
+1. There will also be Zoekt replica servers
+1. Periodically Zoekt primary servers will sync their `.zoekt` index files to
+ their respective replicas
+1. There will need to be some process by which to promote a replica to a
+ primary if the primary is having issues. We will be using Consul for
+ keeping track of which is the primary and which are the replicas.
+1. When indexing a project GitLab will queue a Sidekiq job to update the index
+ on the primary
+1. When searching we will randomly select one of the Zoekt primaries or replica
+ servers for the group being searched. We don't care which is "more up to
+ date" as code search will be "eventually consistent" and all reads may read
+ slightly out of date indexes. We will have a target of maximum latency of
+ index updates and may consider removing nodes from rotation if they are too
+ far out of date.
+1. We will shard everything by top level group as this ensures group search can
+ always search a single Zoekt server. Aggregation may be possible for global
+ searches at some point in future if this turns out to be important. Smaller
+ self-managed instances may use a single Zoekt server allowing global
+ searches to work without any aggregation being implemented. Depending on our
+ largest group sizes and scaling limitations of a single node Zoekt server we
+ may consider implementing an approach where a group can be assigned multiple
+ shards.
+
+The downside of the chosen path will be added complexity of managing all these
+Zoekt servers from GitLab when compared with a "proxy" layer outside of GitLab
+that is managing all of these shards. We will consider this decision a work in
+progress and reassess if it turns out to add too much complexity to GitLab.
+
+#### Sharding proposal using GitLab `::Zoekt::Shard` model
+
+This is already implemented: the `::Zoekt::IndexedNamespace` model
+implements a many-to-many relationship between namespaces and shards.
+
+#### Replication and service discovery using Consul
+
+If we plan to replicate at the Zoekt node level as described above we need to
+change our data model to use a one-to-many relationship from `zoekt_shards ->
+namespaces`. This means making the `namespace_id` column unique in
+`zoekt_indexed_namespaces`. Then we need to implement a service discovery
+approach where the `index_url` always points at a primary Zoekt node and the
+`search_url` is a DNS record with N replicas and the primary. We then choose
+randomly from `search_url` records when searching.
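The random replica selection could be sketched as follows. The Consul service name is hypothetical, and in a real setup the IP list would come from a DNS query (for example, `dig +short zoekt-search.service.consul`) rather than a hardcoded list:

```shell
# Hedged sketch: pick a random Zoekt search node from the search_url DNS
# records. The service name and IPs here are placeholders; in production the
# list would be resolved via Consul DNS at request time.
ips="10.0.0.1
10.0.0.2
10.0.0.3"
search_node=$(echo "$ips" | shuf -n 1)   # any replica works: reads are eventually consistent
echo "picked: $search_node"
```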
+
+### Iterations
+
+1. Make available for `gitlab-org`
+1. Improve monitoring
+1. Improve performance
+1. Make available for select customers
+1. Implement sharding
+1. Implement replication
+1. Make available to many more licensed groups
+1. Implement automatic (re)balancing of shards
+1. Estimate costs for rolling out to all licensed groups and decide if it's worth it or if we need to optimize further or adjust our plan
+1. Rollout to all licensed groups
+1. Improve performance
+1. Assess costs and decide whether we should roll out to all free customers
diff --git a/doc/architecture/blueprints/work_items/index.md b/doc/architecture/blueprints/work_items/index.md
index 058282ec2b7..2c854ecea59 100644
--- a/doc/architecture/blueprints/work_items/index.md
+++ b/doc/architecture/blueprints/work_items/index.md
@@ -70,6 +70,24 @@ All Work Item types share the same pool of predefined widgets and are customized
\* status is not currently a widget, but a part of the root work item, similar to title
+### Work item relationships
+
+Work items can be related to other work items in a number of different ways:
+
+- Parent: A direct ancestor to the current work item, whose completion relies on completing this work item.
+- Child: A direct descendant of the current work item, which contributes to this work item's completion.
+- Blocked by: A work item preventing the completion of the current work item.
+- Blocks: A work item whose completion is blocked by the current work item.
+- Related: A work item that is relevant to the subject of the current work item, but does not directly contribute to or block the completion of this work item.
+
+#### Hierarchy
+
+Parent-child relationships form the basis of **hierarchy** in work items. Each work item type has a defined set of types that can be parents or children of that type.
+
+As types expand, and parent items have their own parent items, the hierarchy capability can grow exponentially.
+
+[Pajamas](https://design.gitlab.com/objects/work-item#hierarchy) documents how to display hierarchies depending on context.
+
### Work Item view
The new frontend view that renders Work Items of any type using global Work Item `id` as an identifier.
@@ -119,6 +137,7 @@ Work Item architecture is designed with making all the features for all the type
### Links
+- [Work items in Pajamas Design System](https://design.gitlab.com/objects/work-item)
- [Work items initiative epic](https://gitlab.com/groups/gitlab-org/-/epics/6033)
- [Tasks roadmap](https://gitlab.com/groups/gitlab-org/-/epics/7103?_gl=1*zqatx*_ga*NzUyOTc3NTc1LjE2NjEzNDcwMDQ.*_ga_ENFH3X7M5Y*MTY2MjU0MDQ0MC43LjEuMTY2MjU0MDc2MC4wLjAuMA..)
- [Work Item "Vision" Prototype](https://gitlab.com/gitlab-org/gitlab/-/issues/368607)
diff --git a/doc/architecture/index.md b/doc/architecture/index.md
index 689ff2afaa0..2467ba33fae 100644
--- a/doc/architecture/index.md
+++ b/doc/architecture/index.md
@@ -8,3 +8,14 @@ toc: false
- [Architecture at GitLab](https://about.gitlab.com/handbook/engineering/architecture/)
- [Architecture Workflow](https://about.gitlab.com/handbook/engineering/architecture/workflow/)
+
+## Contributing
+
+At GitLab, everyone can contribute, including to our architecture blueprints.
+
+If you would like to contribute to any of these blueprints, feel free to:
+
+1. Go to the [source files in the repository](https://gitlab.com/gitlab-org/gitlab/tree/master/doc/architecture/blueprints)
+ and select the blueprint you wish to contribute to.
+1. [Create a merge request](../development/contributing/merge_request_workflow.md).
+1. `@` message both an author and a coach assigned to the blueprint, as listed below.