field | value | date |
---|---|---|
author | Gabriel Mazetto <brodock@gmail.com> | 2019-03-11 19:38:19 +0300 |
committer | Gabriel Mazetto <brodock@gmail.com> | 2019-03-15 06:34:33 +0300 |
commit | 823695ee37113e84dabffee7d8f310bca294b234 | |
tree | 8d40fe91bdeb76422f482b272ad3daa05e0e2122 /doc/administration/repository_storage_types.md | |
parent | a59b7cee0afc152efba6aa4c430c14c0b038107a | |
Document Storage Rollback mechanism
Updated Rake-specific documentation to include storage rollback,
and improved migration and rollback instructions.
Diffstat (limited to 'doc/administration/repository_storage_types.md')
-rw-r--r-- | doc/administration/repository_storage_types.md | 111 |
1 files changed, 67 insertions, 44 deletions
diff --git a/doc/administration/repository_storage_types.md b/doc/administration/repository_storage_types.md
index 4934aaf39f7..40f7c5566ac 100644
--- a/doc/administration/repository_storage_types.md
+++ b/doc/administration/repository_storage_types.md
@@ -2,6 +2,24 @@
 
 > [Introduced][ce-28283] in GitLab 10.0.
 
+Two different storage layouts can be used
+to store the repositories on disk and their characteristics.
+
+GitLab can be configured to use one or multiple repository shard locations
+that can be:
+
+- Mounted to the local disk
+- Exposed as an NFS shared volume
+- Acessed via [gitaly] on its own machine.
+
+In GitLab, this is configured in `/etc/gitlab/gitlab.rb` by the `git_data_dirs({})`
+configuration hash. The storage layouts discussed here will apply to any shard
+defined in it.
+
+The `default` repository shard that is available in any installations
+that haven't customized it, points to the local folder: `/var/opt/gitlab/git-data`.
+Anything discussed below is expected to be part of that folder.
+
 ## Legacy Storage
 
 Legacy Storage is the storage behavior prior to version 10.0. For historical
@@ -66,34 +84,7 @@ by another folder with the next 2 characters. They are both stored in a special
 "@hashed/#{hash[0..1]}/#{hash[2..3]}/#{hash}.wiki.git"
 ```
 
-### How to migrate to Hashed Storage
-
-In GitLab, go to **Admin > Settings**, find the **Repository Storage** section
-and select "_Use hashed storage paths for newly created and renamed projects_".
-
-To migrate your existing projects to the new storage type, check the specific
-[rake tasks].
-
-[ce-28283]: https://gitlab.com/gitlab-org/gitlab-ce/issues/28283
-[rake tasks]: raketasks/storage.md#migrate-existing-projects-to-hashed-storage
-[storage-paths]: repository_storage_types.md
-
-#### Rollback
-
-There is no automated rollback implemented. Below are the steps required to rollback
-from each storage migration.
-
-The rollback has to be performed in the reverse order. To get into "Legacy" state,
-you need to rollback Attachments first, then Project.
-
-Also note that if Geo is enabled, after the migration was triggered, an event is generated
-to replicate the operation on any Secondary node. That means the on disk changes will also
-need to be performed on these nodes as well. Database changes will propagate without issues.
-
-You must make sure the migration event was already processed or otherwise it may migrate
-the files back to Hashed state again.
-
-#### Hashed object pools
+### Hashed object pools
 
 For deduplication of public forks and their parent repository, objects are pooled
 in an object pool. These object pools are a third repository where shared objects
@@ -110,36 +101,60 @@ enabled for individual projects by executing
 be on hashed storage, should not be a fork itself, and hashed storage should be
 enabled for all new projects.
 
-##### Attachments
+### How to migrate to Hashed Storage
 
-To rollback single Attachment migration, rename `aa/bb/abcdef1234567890...` folder back to `namespace/project`.
+To start a migration, enable Hashed Storage for new projects:
+
+1. Go to **Admin > Settings** and expand the **Repository Storage** section.
+2. Select the **Use hashed storage paths for newly created and renamed projects** checkbox.
 
-Both folder names can be generated by the `FileUploader.absolute_base_dir(project)`, you
-just need to switch the version from the `project` back to the previous one.
+Check if the change breaks any existing integration you may have that
+either runs on the same machine as your repositories are located, or may login to that machine
+to access data (for example, a remote backup solution).
 
-```ruby
-project.storage_version
-# => 2
+To schedule a complete rollout, see the
+[rake task documentation for storage migration][rake/migrate-to-hashed] for instructions.
 
-FileUploader.absolute_base_dir(project)
-# => "/opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35"
+If you do have any existing integration, you may want to do a small rollout first,
+to validate. You can do so by specifying a range with the operation.
 
-project.storage_version = 1
+This is an example of how to limit the rollout to Project IDs 50 to 100, running in
+an Omnibus Gitlab installation:
 
-FileUploader.absolute_base_dir(project)
-# => "/opt/gitlab/embedded/service/gitlab-rails/public/uploads/gitlab/gitlab-shell-renamed"
+```bash
+sudo gitlab-rake gitlab:storage:migrate_to_hashed ID_FROM=50 ID_TO=100
 ```
 
-##### Project
+Check the [documentation][rake/migrate-to-hashed] for additional information and instructions for
+source-based installation.
+
+#### Rollback
+
+Similar to the migration, to disable Hashed Storage for new
+projects:
 
-To rollback single Project migration, move `@hashed/aa/bb/aabbcdef1234567890abcdef.git` and `@hashed/aa/bb/aabbcdef1234567890abcdef.wiki.git`
-back to `namespace/project.git` and `namespace/project.wiki.git` respectively and switch the version from the `project` back to `null`.
+1. Go to **Admin > Settings** and expand the **Repository Storage** section.
+2. Uncheck the **Use hashed storage paths for newly created and renamed projects** checkbox.
+
+To schedule a complete rollback, see the
+[rake task documentation for storage rollback][rake/rollback-to-legacy] for instructions.
+
+The rollback task also supports specifying a range of Project IDs. Here is an example
+of limiting the rollout to Project IDs 50 to 100, in an Omnibus Gitlab installation:
+
+```bash
+sudo gitlab-rake gitlab:storage:rollback_to_legacy ID_FROM=50 ID_TO=100
+```
+
+If you have a Geo setup, please note that the rollback will not be reflected automatically
+on the **secondary** node. You may need to wait for a backfill operation to kick-in and remove
+the remaining repositories from the special `@hashed/` folder manually.
 
 ### Hashed Storage coverage
 
 We are incrementally moving every storable object in GitLab to the Hashed
 Storage pattern. You can check the current coverage status below (and also see
-the [issue](https://gitlab.com/gitlab-com/infrastructure/issues/2821)).
+the [issue][ce-2821]).
 
 Note that things stored in an S3 compatible endpoint will not have the downsides
 mentioned earlier, if they are not prefixed with `#{namespace}/#{project_name}`,
@@ -156,6 +171,7 @@ which is true for CI Cache and LFS Objects.
 | CI Artifacts | No | No | Yes | 9.4 / 10.6 |
 | CI Cache | No | No | Yes | - |
 | LFS Objects | Yes | Similar | Yes | 10.0 / 10.7 |
+| Repository pools| No | Yes | - | 11.6 |
 
 #### Implementation Details
 
@@ -180,3 +196,10 @@ LFS Objects implements a similar storage pattern using 2 chars, 2 level folders,
 ```
 
 They are also S3 compatible since **10.0** (GitLab Premium), and available in GitLab Core since **10.7**.
+
+[ce-2821]: https://gitlab.com/gitlab-com/infrastructure/issues/2821
+[ce-28283]: https://gitlab.com/gitlab-org/gitlab-ce/issues/28283
+[rake/migrate-to-hashed]: raketasks/storage.md#migrate-existing-projects-to-hashed-storage
+[rake/rollback-to-legacy]: raketasks/storage.md#rollback
+[storage-paths]: repository_storage_types.md
+[gitaly]: gitaly/index.md
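
Reviewer note: the diff context quotes the path template `"@hashed/#{hash[0..1]}/#{hash[2..3]}/#{hash}.wiki.git"` without showing where `hash` comes from. The sketch below illustrates one way such a path could be derived. It assumes the hash is the SHA-256 hex digest of the project ID, which this diff does not state, and `hashed_disk_path` is a hypothetical helper written for illustration, not a GitLab API.

```ruby
require "digest"

# Hypothetical helper (not part of GitLab): builds a hashed-storage style
# relative path from a project ID, following the template quoted in the diff.
# Assumption: the hash is SHA-256 of the project ID rendered as a string.
def hashed_disk_path(project_id, suffix = ".git")
  hash = Digest::SHA256.hexdigest(project_id.to_s)
  # First two characters form the top-level folder, the next two the
  # second-level folder, and the full hash names the repository itself.
  "@hashed/#{hash[0..1]}/#{hash[2..3]}/#{hash}#{suffix}"
end

puts hashed_disk_path(2)              # repository path
puts hashed_disk_path(2, ".wiki.git") # wiki repository path
```

Under that assumption, project ID 2 produces the same `@hashed/d4/73/d4735e3a…` prefix that appears in the removed `FileUploader.absolute_base_dir(project)` example, which is a useful sanity check when comparing on-disk folders before and after a migration or rollback.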