author    GitLab Bot <gitlab-bot@gitlab.com> 2021-11-19 03:09:40 +0300
committer GitLab Bot <gitlab-bot@gitlab.com> 2021-11-19 03:09:40 +0300
commit    0aa559415ca12dd0f0efd226873dd2a8e5313e3d (patch)
tree      41313e784a73a2f1f173cbaeee4eb106435ed35e /doc/architecture
parent    5d3b84924481aa7d7c67f712c547edcd8f04fe52 (diff)

Add latest changes from gitlab-org/gitlab@master

Diffstat (limited to 'doc/architecture'):
 doc/architecture/blueprints/object_storage/index.md | 220 +++
 1 file changed, 220 insertions(+), 0 deletions(-)
diff --git a/doc/architecture/blueprints/object_storage/index.md b/doc/architecture/blueprints/object_storage/index.md
new file mode 100644
index 00000000000..a79374d60bd
--- /dev/null
+++ b/doc/architecture/blueprints/object_storage/index.md
@@ -0,0 +1,220 @@
+---
+stage: none
+group: unassigned
+comments: false
+description: 'Object storage: direct_upload consolidation - architecture blueprint.'
+---
+
+# Object storage: `direct_upload` consolidation
+
+## Abstract
+
+GitLab stores three classes of user data: database records, Git
+repositories, and user-uploaded files (which are referred to as
+file storage throughout the blueprint).
+
+The user and contributor experience for our file
+storage has room for significant improvement:
+
+- The initial GitLab setup requires creating and configuring 13
+ buckets instead of just one.
+- Features using file storage require contributors to think about both local
+ storage and object storage, which leads to friction and
+ complexity. This often results in broken features and security issues.
+- Contributors who work on file storage often also have to write code
+ for Workhorse, Omnibus, and cloud native GitLab (CNG).
+
+## Problem definition
+
+Object storage is a fundamental component of GitLab, providing the
+underlying implementation for shared, distributed, highly-available
+(HA) file storage.
+
+Over time, we have built support for object storage across the
+application, solving specific problems in a [multitude of
+iterations](https://about.gitlab.com/company/team/structure/working-groups/object-storage/#company-efforts-on-uploads). This
+has led to increased complexity across the board, from development
+(new features and bug fixes) to installation:
+
+- New GitLab installations require the creation and configuration of
+ several object storage buckets instead of just one, as each group of
+ features requires its own. This has an impact on the installation
+ experience and new feature adoption, and takes us further away from
+ boring solutions.
+- The release of cloud native GitLab required the removal of NFS
+ shared storage and the development of direct upload, a feature that
+ was expanded, milestone after milestone, to several types of uploads,
+ but never enabled globally.
+- Today, GitLab supports both local storage and object storage. Local
+ storage only works on single-box installations or with an NFS, which
+ [we no longer recommend](../../../administration/nfs.md) to our
+ users and which is no longer in use on GitLab.com.
+- Understanding all the moving parts and the flow is extremely
+ complicated: CarrierWave, Fog, and the Go S3/Azure SDKs are all in
+ use, which complicates testing as well.
+- Fog and CarrierWave are not maintained to the level of the native
+ SDKs (for example, AWS S3 SDK), so we have to maintain or monkey
+ patch those tools to support requested customer features
+ (for example, [issue #242245](https://gitlab.com/gitlab-org/gitlab/-/issues/242245))
+ that would normally be "free".
+- In many cases, we copy around object storage files needlessly
+ (for example, [issue #285597](https://gitlab.com/gitlab-org/gitlab/-/issues/285597)).
+ Large files (LFS, packages, and so on) are slow to finalize or don't work
+ at all as a result.
+
+## Improvements over the current situation
+
+The following is a brief description of the main directions we can take to
+remove the pain points affecting our object storage implementation.
+
+This is also available as [a YouTube
+video](https://youtu.be/X9V_w8hsM8E) recorded for the [Object Storage
+Working
+Group](https://about.gitlab.com/company/team/structure/working-groups/object-storage/).
+
+### Simplify GitLab architecture by shipping MinIO
+
+In the beginning, object storage support was a Premium feature, not
+part of our CE distribution. Because of that, we had to support both
+local storage and object storage.
+
+With local storage, there is the assumption of a shared storage
+between components. This can be achieved by having a single box
+installation, without HA, or with an NFS, which [we no longer
+recommend](../../../administration/nfs.md).
+
+We have a testing gap on object storage: exercising it requires
+Workhorse and MinIO, which are not present in our pipelines, so too
+much is replaced by mock implementations. Furthermore, the presence of
+a shared disk, both in CI and in local development, often hides broken
+implementations until we deploy to an HA environment.
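+
+As an illustration of how small that gap could become, here is a
+minimal Go sketch of pointing an S3 client at a local MinIO server
+instead of a mock. It assumes the `aws-sdk-go` v1 client, MinIO on its
+default `127.0.0.1:9000` endpoint, and hypothetical static credentials:
+
+```go
+package main
+
+import (
+    "fmt"
+    "log"
+
+    "github.com/aws/aws-sdk-go/aws"
+    "github.com/aws/aws-sdk-go/aws/credentials"
+    "github.com/aws/aws-sdk-go/aws/session"
+    "github.com/aws/aws-sdk-go/service/s3"
+)
+
+func main() {
+    // MinIO speaks the S3 API: only the endpoint, the credentials, and
+    // path-style addressing differ from a production AWS configuration.
+    sess, err := session.NewSession(&aws.Config{
+        Region:           aws.String("us-east-1"),
+        Endpoint:         aws.String("http://127.0.0.1:9000"),
+        Credentials:      credentials.NewStaticCredentials("minio", "minio123", ""),
+        S3ForcePathStyle: aws.Bool(true), // MinIO defaults to path-style URLs
+    })
+    if err != nil {
+        log.Fatal(err)
+    }
+
+    // The same client code then runs unchanged against AWS S3.
+    svc := s3.New(sess)
+    out, err := svc.ListBuckets(&s3.ListBucketsInput{})
+    if err != nil {
+        log.Fatal(err)
+    }
+    for _, b := range out.Buckets {
+        fmt.Println(aws.StringValue(b.Name))
+    }
+}
+```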
+
+Shipping MinIO as part of the product will reduce the differences
+between a cloud and a local installation, standardizing our file
+storage on a single technology.
+
+The removal of local disk operations will reduce the complexity of
+development and mitigate several security attack vectors, as we will
+no longer write user-provided data to local storage.
+
+It will also reduce human error: development environments will always
+run a local object storage, so any local disk access should raise a
+red flag during merge request review.
+
+This effort is described in [this epic](https://gitlab.com/groups/gitlab-org/-/epics/6099).
+
+### Enable direct upload by default on every upload
+
+Because every group of features requires its own bucket, we don't have
+direct upload enabled everywhere. Contributing a new upload requires
+coding it in both Ruby on Rails and Go.
+
+Implementing a new feature that does not yet have a dedicated bucket
+requires the developer to also create a merge request in Omnibus
+and CNG, as well as coordinate with SREs to configure the new bucket
+for our own environments.
+
+This also slows down feature adoption, because our users need to
+reconfigure GitLab and provision a new bucket in their
+infrastructure. It also makes the initial installation more complex
+with each new feature.
+
+Implementing direct upload by default, with a
+[consolidated object storage configuration](../../../administration/object_storage.md#consolidated-object-storage-configuration),
+will reduce the number of merge requests needed to ship a new feature
+from four to only one. It will also remove the need for SRE
+intervention, as the bucket will always be the same.
+
+This will simplify our development and review processes, as well as
+the GitLab configuration file, and every user will immediately have
+access to new features without infrastructure chores.
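+
+To make the mechanics concrete, here is a hedged Go sketch of the core
+of direct upload: Rails pre-authorizes the upload and hands out a
+short-lived pre-signed URL, and the file body is then sent straight to
+the bucket without passing through Ruby or the local disk. The bucket
+and key names are hypothetical (the real flow is negotiated through the
+Rails internal authorize endpoints), and the sketch uses the
+`aws-sdk-go` v1 client:
+
+```go
+package main
+
+import (
+    "fmt"
+    "log"
+    "time"
+
+    "github.com/aws/aws-sdk-go/aws"
+    "github.com/aws/aws-sdk-go/aws/session"
+    "github.com/aws/aws-sdk-go/service/s3"
+)
+
+func main() {
+    sess := session.Must(session.NewSession(&aws.Config{
+        Region: aws.String("us-east-1"),
+    }))
+    svc := s3.New(sess)
+
+    // Build a PUT request against a single catchall bucket (hypothetical
+    // name) and pre-sign it: whoever holds this URL can upload the
+    // object directly for the next 15 minutes.
+    req, _ := svc.PutObjectRequest(&s3.PutObjectInput{
+        Bucket: aws.String("gitlab-catchall"),
+        Key:    aws.String("tmp/uploads/example-object"),
+    })
+    url, err := req.Presign(15 * time.Minute)
+    if err != nil {
+        log.Fatal(err)
+    }
+
+    // Workhorse can now stream the request body to this URL with a
+    // plain HTTP PUT; Rails only ever sees the object's final location.
+    fmt.Println(url)
+}
+```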
+
+### Simplify object storage code
+
+Our implementation is built on top of a third-party framework where
+every object storage client is a third-party library. Unfortunately,
+some of them are unmaintained. [We have customers who cannot push 5 GB
+Git LFS objects](https://gitlab.com/gitlab-org/gitlab/-/issues/216442),
+but with such a vital feature implemented in third-party libraries we
+are slow to fix it, and we also rely on external maintainers to merge
+and release fixes.
+
+Before the introduction of direct upload, using the
+[CarrierWave](https://github.com/carrierwaveuploader/carrierwave)
+library, _"a gem that provides a simple and extremely flexible way to
+upload files from Ruby applications"_, was the boring solution.
+However, this is no longer our use case: we upload files from
+Workhorse, and we had to [patch CarrierWave's
+internals](https://gitlab.com/gitlab-org/gitlab/-/issues/285597#note_452696638)
+to support direct upload.
+
+A brief proposal covering CarrierWave removal and a new streamlined
+internal upload API is described
+[in this issue comment](https://gitlab.com/gitlab-org/gitlab/-/issues/213288#note_325358026).
+
+Ideally, we wouldn't need to duplicate object storage clients in Go
+and Ruby. By removing CarrierWave, we can make use of the officially
+supported native clients whenever a provider's S3 compatibility level
+is not sufficient.
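+
+As an example of what the native clients give us "for free", the
+official AWS SDK for Go ships a concurrent multipart uploader, which is
+exactly what large Git LFS objects need. A sketch with a hypothetical
+bucket and key, again assuming the `aws-sdk-go` v1 client:
+
+```go
+package main
+
+import (
+    "log"
+    "os"
+
+    "github.com/aws/aws-sdk-go/aws"
+    "github.com/aws/aws-sdk-go/aws/session"
+    "github.com/aws/aws-sdk-go/service/s3/s3manager"
+)
+
+func main() {
+    sess := session.Must(session.NewSession(&aws.Config{
+        Region: aws.String("us-east-1"),
+    }))
+
+    f, err := os.Open("lfs-object.bin") // for example, a 5 GB LFS object
+    if err != nil {
+        log.Fatal(err)
+    }
+    defer f.Close()
+
+    // The SDK splits the file into parts and uploads them concurrently,
+    // retrying failed parts; with Fog or CarrierWave we have to maintain
+    // or monkey patch this kind of behavior ourselves.
+    uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
+        u.PartSize = 64 * 1024 * 1024 // 64 MiB parts
+        u.Concurrency = 4
+    })
+    _, err = uploader.Upload(&s3manager.UploadInput{
+        Bucket: aws.String("gitlab-lfs"),
+        Key:    aws.String("objects/ab/cd/abcdef"),
+        Body:   f,
+    })
+    if err != nil {
+        log.Fatal(err)
+    }
+}
+```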
+
+## Iterations
+
+In this section we list some possible iterations. This is not
+intended to be the final roadmap, but a conversation starter for the
+Object Storage Working Group.
+
+1. Create a new catchall bucket and a unified internal API for
+ authorization without CarrierWave.
+1. Ship MinIO with Omnibus (CNG images already include it).
+1. Expand GitLab-QA to cover all the supported configurations.
+1. Deprecate local disk access.
+1. Deprecate configurations with multiple buckets.
+1. Implement a bucket-to-bucket migration (see the sketch after this list).
+1. Migrate the current CarrierWave uploads to the new implementation.
+1. On the next major release: Remove support for local disk access and
+ configurations with multiple buckets.
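+
+The bucket-to-bucket migration mentioned above could be built on
+server-side copies, so that object data moves within the provider
+rather than through GitLab. A minimal sketch of the idea, with
+hypothetical bucket names and the `aws-sdk-go` v1 client:
+
+```go
+package main
+
+import (
+    "log"
+
+    "github.com/aws/aws-sdk-go/aws"
+    "github.com/aws/aws-sdk-go/aws/session"
+    "github.com/aws/aws-sdk-go/service/s3"
+)
+
+func main() {
+    sess := session.Must(session.NewSession(&aws.Config{
+        Region: aws.String("us-east-1"),
+    }))
+    svc := s3.New(sess)
+
+    src, dst := "gitlab-lfs", "gitlab-catchall" // hypothetical buckets
+
+    // Page through the source bucket and issue a server-side copy for
+    // each object. Note that a single CopyObject call is limited to
+    // 5 GB on AWS; larger objects need a multipart copy instead.
+    err := svc.ListObjectsV2Pages(
+        &s3.ListObjectsV2Input{Bucket: aws.String(src)},
+        func(page *s3.ListObjectsV2Output, lastPage bool) bool {
+            for _, obj := range page.Contents {
+                _, err := svc.CopyObject(&s3.CopyObjectInput{
+                    Bucket:     aws.String(dst),
+                    Key:        obj.Key, // keep the same key layout
+                    CopySource: aws.String(src + "/" + aws.StringValue(obj.Key)),
+                })
+                if err != nil {
+                    log.Printf("copy %s: %v", aws.StringValue(obj.Key), err)
+                }
+            }
+            return true // continue to the next page
+        })
+    if err != nil {
+        log.Fatal(err)
+    }
+}
+```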
+
+### Benefits of the current iteration plan
+
+The current plan is designed to provide tangible benefits from the
+first step.
+
+With the introduction of the catchall bucket, every upload that does
+not yet use direct upload will gain its benefits, and new features
+can be shipped with a single merge request.
+
+Shipping MinIO with Omnibus will allow us to default new installations
+to object storage, and Omnibus could take care of creating
+buckets. This will simplify HA installation outside of Kubernetes.
+
+Then we can migrate each CarrierWave uploader to the new
+implementation, to the point where a GitLab installation requires
+only one bucket.
+
+## Additional reading materials
+
+- [Uploads development documentation: The problem description](../../../development/uploads.md#the-problem-description).
+- [Speed up the monolith, building a smart reverse proxy in Go](https://archive.fosdem.org/2020/schedule/event/speedupmonolith/): a presentation explaining a bit of Workhorse history and the challenge we faced in releasing the first cloud-native installation.
+- [Object Storage improvements epic](https://gitlab.com/groups/gitlab-org/-/epics/483).
+- We are moving to the GraphQL API, but [we do not support direct upload](https://gitlab.com/gitlab-org/gitlab/-/issues/280819).
+
+## Who
+
+Proposal:
+
+<!-- vale gitlab.Spelling = NO -->
+
+| Role | Who |
+|--------------------------------|-------------------------|
+| Author | Alessio Caiazza |
+| Architecture Evolution Coach | Gerardo Lopez-Fernandez |
+| Engineering Leader | Marin Jankovski |
+| Domain Expert / Object storage | Stan Hu |
+| Domain Expert / Security | Joern Schneeweisz |
+
+DRIs:
+
+The DRI for this blueprint is the [Object Storage Working
+Group](https://about.gitlab.com/company/team/structure/working-groups/object-storage/).
+
+<!-- vale gitlab.Spelling = YES -->