---
stage: Systems
group: Geo
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
---

# Back up and restore GitLab **(FREE SELF)**

Your software or organization depends on the data in your GitLab instance. You need to ensure this data is protected from adverse events such as:

- Corrupted data
- Accidental deletion of data
- Ransomware attacks
- Unexpected cloud provider downtime

You can mitigate all of these risks with a disaster recovery plan that includes backups.

## Back up GitLab

For detailed information on backing up GitLab, see [Back up GitLab](backup_gitlab.md).

## Restore GitLab

For detailed information on restoring GitLab, see [Restore GitLab](restore_gitlab.md).

## Migrate to a new server

For detailed information on using backup and restore to migrate to a new server, see
[Migrate to a new server](migrate_to_new_server.md).

## Additional notes

This documentation is for GitLab Community and Enterprise Edition. We back up
GitLab.com and ensure your data is secure. You can't, however, use these
methods to export or back up your data yourself from GitLab.com.

Issues are stored in the database, and can't be stored in Git itself.

## GitLab backup archive creation process

When working with GitLab backups, you might need to know how GitLab creates backup archives. To create backup archives, GitLab:

1. If creating an incremental backup, extracts the previous backup archive and reads its `backup_information.yml` file.
1. Updates or generates the `backup_information.yml` file.
1. Runs all backup sub-tasks:
   - `db` to back up the GitLab PostgreSQL database (not Gitaly Cluster).
   - `repositories` to back up Git repositories.
   - `uploads` to back up attachments.
   - `builds` to back up CI job output logs.
   - `artifacts` to back up CI job artifacts.
   - `pages` to back up page content.
   - `lfs` to back up LFS objects.
   - `terraform_state` to back up Terraform states.
   - `registry` to back up container registry images.
   - `packages` to back up packages.
   - `ci_secure_files` to back up project-level secure files.
1. Archives the backup staging area into a tar file.
1. Optional. Uploads the new backup archive to object storage.
1. Cleans up backup staging directory files that are now archived.

## Backup ID

Backup IDs identify individual backup archives. You need the backup ID of a backup archive if you need to restore GitLab and multiple backup archives are available.

Backup archives are saved in a directory set in `backup_path`, which is specified in the `config/gitlab.yml` file.

- By default, backup archives are stored in `/var/opt/gitlab/backups`.
- By default, backup archive file names are `<backup-id>_gitlab_backup.tar` where `<backup-id>` identifies the time when the
  backup archive was created, the GitLab version, and the GitLab edition.

For example, if the archive file name is `1493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar`,
the backup ID is `1493107454_2018_04_25_10.6.4-ce`.
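
The naming rule above can be sketched in Python. This is a minimal illustration, not part of GitLab: the function names are hypothetical, and the field layout of the backup ID (a leading numeric timestamp, a `YYYY_MM_DD` date, then version and edition) is an assumption inferred from the example file name.

```python
from pathlib import Path

# Default archive suffix described above.
ARCHIVE_SUFFIX = "_gitlab_backup.tar"


def backup_id_from_archive(file_name: str) -> str:
    """Return the backup ID for a default-named backup archive."""
    name = Path(file_name).name
    if not name.endswith(ARCHIVE_SUFFIX):
        raise ValueError(f"not a default-named backup archive: {name}")
    return name[: -len(ARCHIVE_SUFFIX)]


def split_backup_id(backup_id: str) -> dict:
    """Split a backup ID into its parts (layout assumed from the example)."""
    timestamp, year, month, day, version = backup_id.split("_", 4)
    return {
        "timestamp": int(timestamp),
        "date": f"{year}-{month}-{day}",
        "version": version,  # GitLab version and edition, for example 10.6.4-ce
    }


archive = "1493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar"
print(backup_id_from_archive(archive))
print(split_backup_id(backup_id_from_archive(archive)))
```

Stripping the fixed suffix, rather than parsing the whole name, keeps the sketch valid for pre-release versions such as `16.7.0-pre`.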

## Backup staging directory

The backup staging directory is a place to temporarily:

- Store backup artifacts on disk before the GitLab backup archive is created.
- Extract backup archives on disk before restoring a backup or creating an incremental backup.

This directory is the same directory where completed GitLab backup archives are created. When creating an untarred backup, the backup artifacts are left in this directory and no
archive is created.

Example backup staging directory with untarred backup:

```plaintext
backups/
├── 1701728344_2023_12_04_16.7.0-pre_gitlab_backup.tar
├── 1701728447_2023_12_04_16.7.0-pre_gitlab_backup.tar
├── artifacts.tar.gz
├── backup_information.yml
├── builds.tar.gz
├── ci_secure_files.tar.gz
├── db
│   ├── ci_database.sql.gz
│   └── database.sql.gz
├── lfs.tar.gz
├── packages.tar.gz
├── pages.tar.gz
├── repositories
│   ├── manifests/
│   ├── @hashed/
│   └── @snippets/
├── terraform_state.tar.gz
└── uploads.tar.gz
```

## `backup_information.yml` file

The `backup_information.yml` file saves all backup inputs that are not included in the backup itself. It includes information such as:

- The time the backup was created.
- The version of GitLab that generated the backup.
- Any options that were specified, such as skipped sub-tasks.

This information is used by some sub-tasks to determine how:

- To restore.
- To link data in the backup with external services (such as server-side repository backups).

This file is saved into the backup staging directory.

## Database backups

Database backups are created and restored by a GitLab backup sub-task called `db`. The database sub-task uses `pg_dump` to create [a SQL dump](https://www.postgresql.org/docs/14/backup-dump.html). The output of `pg_dump` is piped through `gzip` to create a compressed SQL file. This file is saved to the backup staging directory.
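
The pipe-and-compress pattern can be illustrated with a short Python sketch. It is not the code that the `db` sub-task runs: the `dump_to_gzip` helper is hypothetical, and it uses Python's `gzip` module in place of the `gzip` command. The point it shows is that the dump is compressed as it streams, so no uncompressed SQL file is written to disk.

```python
import gzip
import shutil
import subprocess
from pathlib import Path


def dump_to_gzip(command: list[str], destination: Path) -> None:
    """Run `command` and stream its standard output into a gzip file."""
    with subprocess.Popen(command, stdout=subprocess.PIPE) as process:
        with gzip.open(destination, "wb") as compressed:
            # Copy in chunks so the full dump is never held in memory.
            shutil.copyfileobj(process.stdout, compressed)
    # The Popen context manager waits for the process before exiting.
    if process.returncode != 0:
        raise RuntimeError(f"{command[0]} exited with status {process.returncode}")
```

A call such as `dump_to_gzip(["pg_dump", "--dbname", "gitlabhq_production"], Path("db/database.sql.gz"))` would mirror the description above. The database name and output path in that call are assumptions for illustration only.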

## Repository backups

Repository backups are created and restored by a GitLab backup sub-task called `repositories`. The repositories sub-task uses a Gitaly command
[`gitaly-backup`](https://gitlab.com/gitlab-org/gitaly/-/blob/master/doc/gitaly-backup.md) to create Git repository backups:

- GitLab uses its database to tell `gitaly-backup` which repositories to back up.
- `gitaly-backup` then calls a series of RPCs on Gitaly to collect the repository backup data for each repository. This data is streamed into a directory structure in the GitLab backup staging directory.

```mermaid
sequenceDiagram
    box Backup host
        participant Repositories sub-task
        participant gitaly-backup
    end

    Repositories sub-task->>+gitaly-backup: List of repositories

    loop Each repository
        gitaly-backup->>+Gitaly: ListRefs request
        Gitaly->>-gitaly-backup: List of Git references

        gitaly-backup->>+Gitaly: CreateBundleFromRefList request
        Gitaly->>-gitaly-backup: Git bundle file

        gitaly-backup->>+Gitaly: GetCustomHooks request
        Gitaly->>-gitaly-backup: Custom hooks archive
    end

    gitaly-backup->>-Repositories sub-task: Success/failure
```

Storages configured to use Gitaly Cluster are backed up the same way as standalone Gitaly. When Gitaly Cluster receives the RPC calls from `gitaly-backup`, it is responsible for
rebuilding its own database. This means that there is no need to back up the Gitaly Cluster database separately. Because backups operate through RPCs, each repository is only backed
up once, no matter the replication factor.

### Server-side repository backups

You can configure repository backups as server-side repository backups. When specified, `gitaly-backup` makes a single RPC call for each repository to create the backup. This RPC
does not transmit any repository data. Instead, the RPC triggers the Gitaly node that stores that physical repository to upload the backup data directly to object storage. Because
the data is no longer transmitted through RPCs from Gitaly, server-side backups require much less network transfer and require no disk storage on the machine that is running the
backup Rake task. The backups stored on object storage are linked to the created backup archive by [the backup ID](#backup-id).

```mermaid
sequenceDiagram
    box Backup host
        participant Repositories sub-task
        participant gitaly-backup
    end

    Repositories sub-task->>+gitaly-backup: List of repositories

    loop Each repository
        gitaly-backup->>+Gitaly: BackupRepository request

        Gitaly->>+Object-storage: Git references file
        Object-storage->>-Gitaly: Success/failure

        Gitaly->>+Object-storage: Git bundle file
        Object-storage->>-Gitaly: Success/failure

        Gitaly->>+Object-storage: Custom hooks archive
        Object-storage->>-Gitaly: Success/failure

        Gitaly->>+Object-storage: Backup manifest file
        Object-storage->>-Gitaly: Success/failure

        Gitaly->>-gitaly-backup: Success/failure
    end

    gitaly-backup->>-Repositories sub-task: Success/failure
```

## File backups

The following GitLab backup sub-tasks back up files:

- `uploads`
- `builds`
- `artifacts`
- `pages`
- `lfs`
- `terraform_state`
- `registry`
- `packages`
- `ci_secure_files`

These file sub-tasks determine a set of files within a directory specific to the task. This set of files is then passed to `tar`
to create an archive. This archive is piped (not saved to disk) through `gzip` to save a compressed tar file to the backup staging directory.

Because backups are created from live instances, the files that `tar` is trying to archive can sometimes be modified while the backup is being created. In this case, an alternate "copy"
strategy can be used. When this strategy is used, `rsync` is first used to create a copy of the files to back up. Then, these copies are passed to `tar` as usual. In this case,
the machine running the backup Rake task must have enough storage for the copied files and the compressed archive.
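
The two strategies can be compared in a short Python sketch. It is an illustration under stated assumptions rather than the GitLab implementation: `archive_directory` and `_write_archive` are hypothetical names, the `tarfile` module stands in for the `tar` and `gzip` commands, and `shutil.copytree` stands in for `rsync`.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def _write_archive(directory: Path, destination: Path) -> None:
    # "w:gz" compresses while writing, so no uncompressed tar file
    # is saved to disk.
    with tarfile.open(destination, "w:gz") as archive:
        archive.add(directory, arcname=".")


def archive_directory(source: Path, destination: Path, copy_first: bool = False) -> None:
    """Write the contents of `source` to a gzip-compressed tar file."""
    if not copy_first:
        # Default strategy: archive the live files directly.
        _write_archive(source, destination)
        return
    # "Copy" strategy: take a copy first, then archive the copy.
    # This needs scratch space for the copied files and the archive.
    with tempfile.TemporaryDirectory() as scratch:
        snapshot = Path(scratch) / source.name
        shutil.copytree(source, snapshot)
        _write_archive(snapshot, destination)
```

With `copy_first=True`, the archive is built from a copy that no longer changes, at the cost of the extra disk space noted above.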

## Related features

- [Geo](../geo/index.md)
- [Disaster Recovery (Geo)](../geo/disaster_recovery/index.md)
- [Migrating GitLab groups](../../user/group/import/index.md)
- [Import and migrate projects](../../user/project/import/index.md)
- [GitLab Linux package (Omnibus) - Backup and Restore](https://docs.gitlab.com/omnibus/settings/backups.html)
- [GitLab Helm chart - Backup and Restore](https://docs.gitlab.com/charts/backup-restore/)
- [GitLab Operator - Backup and Restore](https://docs.gitlab.com/operator/backup_and_restore.html)