Diffstat (limited to 'doc/architecture/blueprints/cells/index.md')
-rw-r--r-- | doc/architecture/blueprints/cells/index.md | 152 |
1 files changed, 104 insertions, 48 deletions
diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md
index 0e93b9d5d3b..28414f9b68c 100644
--- a/doc/architecture/blueprints/cells/index.md
+++ b/doc/architecture/blueprints/cells/index.md
@@ -1,5 +1,5 @@
 ---
-status: accepted
+status: ongoing
 creation-date: "2022-09-07"
 authors: [ "@ayufan", "@fzimmer", "@DylanGriffith", "@lohrc", "@tkuah" ]
 coach: "@ayufan"
@@ -18,9 +18,7 @@ Cells is a new architecture for our software as a service platform. This archite
 
 For more information about Cells, see also:
 
-- [Glossary](glossary.md)
-- [Goals](goals.md)
-- [Cross-section impact](impact.md)
+- [Goals, Glossary and Requirements](goals.md)
 
 ## Work streams
 
@@ -101,6 +99,10 @@ The first 2-3 quarters are required to define a general split of data and build
 
    The purpose is to perform a targeted decomposition of `users` and `projects`, because `projects` will be stored locally in the Cell.
 
+1. **User can create Organization on Cell 2.**
+
+   The purpose is to create Organizations that are isolated from each other.
+
 1. **User can change profile avatar that is shared in cluster.**
 
    The purpose is to fix global uploads that are shared in cluster.
@@ -166,6 +168,11 @@ For example:
 
    The routing service needs to be able to discover and monitor the health of all Cells.
 
+1. **User can use single domain to interact with many Cells.**
+
+   The routing service will intelligently route all requests to Cells based on the resource being
+   accessed versus the Cell containing the data.
+
 1. **Router endpoints classification.**
 
    The stateless routing service will fetch and cache information about endpoints from one of the Cells.
@@ -258,28 +265,30 @@ One iteration describes one quarter's worth of work.
    - Data access layer: Initial Admin Area settings are shared across cluster.
    - Essential workflows: Allow to share cluster-wide data with database-level data access layer
 
-1. [Iteration 2](https://gitlab.com/groups/gitlab-org/-/epics/9813) - FY24Q2 - In progress
+1. [Iteration 2](https://gitlab.com/groups/gitlab-org/-/epics/9813) - Expected delivery: 16.2 FY24Q2 | Actual delivery: 16.4 FY24Q3 - In progress
 
    - Essential workflows: User accounts are shared across cluster.
    - Essential workflows: User can create Group.
 
-1. [Iteration 3](https://gitlab.com/groups/gitlab-org/-/epics/10997) - FY24Q3 - Planned
+1. [Iteration 3](https://gitlab.com/groups/gitlab-org/-/epics/10997) - Expected delivery: 16.7 FY24Q4 - Planned
 
    - Essential workflows: User can create Project.
    - Routing: Technology.
-   - Data access layer: Evaluate the efficiency of database-level access vs. API-oriented access layer
+   - Routing: Cell discovery.
+   - Data access layer: Evaluate the efficiency of database-level access vs. API-oriented access layer.
+   - Data access layer: Data access layer.
 
-1. [Iteration 4](https://gitlab.com/groups/gitlab-org/-/epics/10998) - FY24Q4
+1. [Iteration 4](https://gitlab.com/groups/gitlab-org/-/epics/10998) - Expected delivery: 16.10 FY25Q1 - Planned
 
-   - Essential workflows: User can push to Git repository.
-   - Essential workflows: User can create issue, merge request, and merge it after it is green.
+   - Essential workflows: User can create organization on Cell 2.
    - Data access layer: Cluster-unique identifiers.
-   - Routing: Cell discovery.
-   - Routing: Router endpoints classification.
-   - Cell deployment: Extend GitLab Dedicated to support GCP
+   - Routing: User can use single domain to interact with many Cells.
+   - Cell deployment: Extend GitLab Dedicated to support GCP.
 
-1. Iteration 5 - FY25Q1
+1. Iteration 5..N - starting FY25Q1
 
+   - Essential workflows: User can push to Git repository.
+   - Essential workflows: User can create issue, merge request, and merge it after it is green.
    - Essential workflows: User can run CI pipeline.
    - Essential workflows: Instance-wide settings are shared across cluster.
    - Essential workflows: User can change profile avatar that is shared in cluster.
@@ -287,22 +296,13 @@ One iteration describes one quarter's worth of work.
    - Essential workflows: User can manage Group and Project members.
    - Essential workflows: User can manage instance-wide runners.
    - Essential workflows: User is part of Organization and can only see information from the Organization.
+   - Routing: Router endpoints classification.
    - Routing: GraphQL and other ambiguous endpoints.
    - Data access layer: Allow to share cluster-wide data with database-level data access layer.
    - Data access layer: Cluster-wide deletions.
-   - Data access layer: Data access layer.
    - Data access layer: Database migrations.
 
-1. Iteration 6 - FY25Q2
-   - TBD
-
-1. Iteration 7 - FY25Q3
-   - TBD
-
-1. Iteration 8 - FY25Q4
-   - TBD
-
-## Technical Proposals
+## Technical proposals
 
 The Cells architecture has long lasting implications to data processing, location, scalability and the GitLab architecture.
 This section links all different technical proposals that are being evaluated.
@@ -315,34 +315,36 @@ This section links all different technical proposals that are being evaluated.
 The Cells architecture will impact many features requiring some of them to be rewritten, or changed significantly.
 Below is a list of known affected features with preliminary proposed solutions.
 
-- [Cells: Admin Area](cells-feature-admin-area.md)
-- [Cells: Backups](cells-feature-backups.md)
-- [Cells: CI Runners](cells-feature-ci-runners.md)
-- [Cells: Container Registry](cells-feature-container-registry.md)
-- [Cells: Contributions: Forks](cells-feature-contributions-forks.md)
-- [Cells: Database Sequences](cells-feature-database-sequences.md)
-- [Cells: Data Migration](cells-feature-data-migration.md)
-- [Cells: Explore](cells-feature-explore.md)
-- [Cells: Git Access](cells-feature-git-access.md)
-- [Cells: Global Search](cells-feature-global-search.md)
-- [Cells: GraphQL](cells-feature-graphql.md)
-- [Cells: Organizations](cells-feature-organizations.md)
-- [Cells: Secrets](cells-feature-secrets.md)
-- [Cells: Snippets](cells-feature-snippets.md)
-- [Cells: User Profile](cells-feature-user-profile.md)
-- [Cells: Your Work](cells-feature-your-work.md)
+- [Cells: Admin Area](impacted_features/admin-area.md)
+- [Cells: Backups](impacted_features/backups.md)
+- [Cells: CI Runners](impacted_features/ci-runners.md)
+- [Cells: Container Registry](impacted_features/container-registry.md)
+- [Cells: Contributions: Forks](impacted_features/contributions-forks.md)
+- [Cells: Database Sequences](impacted_features/database-sequences.md)
+- [Cells: Data Migration](impacted_features/data-migration.md)
+- [Cells: Explore](impacted_features/explore.md)
+- [Cells: Git Access](impacted_features/git-access.md)
+- [Cells: Global Search](impacted_features/global-search.md)
+- [Cells: GraphQL](impacted_features/graphql.md)
+- [Cells: Organizations](impacted_features/organizations.md)
+- [Cells: Personal Namespaces](impacted_features/personal-namespaces.md)
+- [Cells: Secrets](impacted_features/secrets.md)
+- [Cells: Snippets](impacted_features/snippets.md)
+- [Cells: User Profile](impacted_features/user-profile.md)
+- [Cells: Your Work](impacted_features/your-work.md)
 
 ### Impacted features: Placeholders
 
 The following list of impacted features only represents placeholders that still require work to estimate the impact of Cells and develop solution proposals.
 
-- [Cells: Agent for Kubernetes](cells-feature-agent-for-kubernetes.md)
-- [Cells: GitLab Pages](cells-feature-gitlab-pages.md)
-- [Cells: Personal Access Tokens](cells-feature-personal-access-tokens.md)
-- [Cells: Personal Namespaces](cells-feature-personal-namespaces.md)
-- [Cells: Router Endpoints Classification](cells-feature-router-endpoints-classification.md)
-- [Cells: Schema changes (Postgres and Elasticsearch migrations)](cells-feature-schema-changes.md)
-- [Cells: Uploads](cells-feature-uploads.md)
+- [Cells: Agent for Kubernetes](impacted_features/agent-for-kubernetes.md)
+- [Cells: CI/CD Catalog](impacted_features/ci-cd-catalog.md)
+- [Cells: Data pipeline ingestion](impacted_features/data-pipeline-ingestion.md)
+- [Cells: GitLab Pages](impacted_features/gitlab-pages.md)
+- [Cells: Personal Access Tokens](impacted_features/personal-access-tokens.md)
+- [Cells: Router Endpoints Classification](impacted_features/router-endpoints-classification.md)
+- [Cells: Schema changes (Postgres and Elasticsearch migrations)](impacted_features/schema-changes.md)
+- [Cells: Uploads](impacted_features/uploads.md)
 - ...
 
 ## Frequently Asked Questions
@@ -367,6 +369,59 @@ For example, users on GitLab Dedicated don't have to have a different and unique
 Up until iteration 3, Cells communicate with each other only via a shared database that contains common data.
 In iteration 4 we are going to evaluate the option of Cells calling each other via API to provide more isolation and reliability.
 
+### How are Cells provisioned?
+
+The GitLab.com cluster of Cells will use GitLab Dedicated instances.
+Once a GitLab Dedicated instance gets provisioned it could join the GitLab.com cluster and become a Cell.
+One requirement will be that the GitLab Dedicated instance does not contain any prior data.
+
+To reach shared resources, Cells will use [Private Service Connect](https://cloud.google.com/vpc/docs/private-service-connect).
+
+See also the [design discussion](https://gitlab.com/gitlab-org/gitlab/-/issues/396641).
+
+### What is a Cells topology?
+
+See the [design discussion](https://gitlab.com/gitlab-org/gitlab/-/issues/396641).
+
+### How are users of an Organization routed to the correct Cell?
+
+TBD
+
+### How do users authenticate with Cells and Organizations?
+
+See the [design discussion](https://gitlab.com/gitlab-org/gitlab/-/issues/395736).
+
+### How are Cells rebalanced?
+
+TBD
+
+### How can Cells implement disaster recovery capabilities?
+
+TBD
+
+### How do I decide whether to move my feature to the cluster, Cell or Organization level?
+
+By default, features are required to be scoped to the Organization level. Any deviation from that rule should be validated and approved by Tenant Scale.
+
+The design goals of the Cells architecture describe that [all Cells are under a single domain](goals.md#all-cells-are-under-a-single-gitlabcom-domain) and as such, Cells are invisible to the user:
+
+- Cell-local features should be limited to those related to managing the Cell, but never be a feature where the Cell semantic is exposed to the customer.
+- The Cells architecture wants to freely control the distribution of Organization and customer data across Cells without impacting users when data is migrated.
+
+Cluster-wide features are strongly discouraged because:
+
+- They might require storing a substantial amount of data cluster-wide which decreases [scalability headroom](goals.md#provides-100x-headroom).
+- They might require implementation of non-trivial [data aggregation](goals.md#aggregation-of-cluster-wide-data) that reduces resilience to [single node failure](goals.md#high-resilience-to-a-single-cell-failure).
+- They are harder to build due to the need of being able to run [mixed deployments](goals.md#cells-running-in-mixed-deployments). Cluster-wide features need to take this into account.
+- They might affect our ability to provide an [on-premise like experience on GitLab.com](goals.md#on-premise-like-experience).
+- Some features that are expected to be cluster-wide might in fact be better implemented using federation techniques that use trusted intra-cluster communication using the same user identity. User Profile is shared across the cluster.
+- The Cells architecture limits what services can be considered cluster-wide. Services that might initially be cluster-wide are still expected to be split in the future to achieve full service isolation. No feature should be built to depend on such a service (like Elasticsearch).
+
+### Will Cells use the [reference architecture for 50,000 users](../../../administration/reference_architectures/50k_users.md)?
+
+The infrastructure team will properly size Cells depending on the load.
+The Tenant Scale team sees an opportunity to use GitLab Dedicated as a base for Cells deployment.
+
 ## Decision log
 
 - 2022-03-15: Google Cloud as the cloud service. For details, see [issue 396641](https://gitlab.com/gitlab-org/gitlab/-/issues/396641#note_1314932272).
@@ -378,3 +433,4 @@ In iteration 4 we are going to evaluate the option of Cells calling each other v
 - [Database group investigation](https://about.gitlab.com/handbook/engineering/development/enablement/data_stores/database/doc/root-namespace-sharding.html)
 - [Shopify Pods architecture](https://shopify.engineering/a-pods-architecture-to-allow-shopify-to-scale)
 - [Opstrace architecture](https://gitlab.com/gitlab-org/opstrace/opstrace/-/blob/main/docs/architecture/overview.md)
+- [Adding Diagrams to this blueprint](diagrams/index.md)