---
stage: Systems
group: Gitaly
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
---

# Repository storage **(FREE SELF)**

GitLab stores [repositories](../user/project/repository/index.md) on repository storage. Repository
storage is either:

- Physical storage configured with a `gitaly_address` that points to a [Gitaly node](gitaly/index.md).
- [Virtual storage](gitaly/index.md#virtual-storage) that stores repositories on a Gitaly Cluster.

WARNING:
Repository storage could be configured as a `path` that points directly to the directory where the repositories are
stored. GitLab directly accessing a directory containing repositories is deprecated. You should configure GitLab to
access repositories through a physical or virtual storage.

For more information on:

- Configuring Gitaly, see [Configure Gitaly](gitaly/configure_gitaly.md).
- Configuring Gitaly Cluster, see [Configure Gitaly Cluster](gitaly/praefect.md).

## Hashed storage

> **Storage name** field [renamed](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/128416) from **Gitaly storage name** and **Relative path** field [renamed](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/128416) from **Gitaly relative path** in GitLab 16.3.

Hashed storage stores projects on disk in a location based on a hash of the project's ID. This makes the folder
structure immutable and eliminates the need to synchronize state from URLs to disk structure. This means that renaming a
group, user, or project:

- Costs only the database transaction.
- Takes effect immediately.

The hash also helps spread the repositories more evenly on the disk. The top-level directory
contains fewer folders than the total number of top-level namespaces.

The hash format is based on the hexadecimal representation of a SHA256, calculated with
`SHA256(project.id)`. The top-level folder uses the first two characters, followed by another folder
with the next two characters. They are both stored in a special `@hashed` folder so they can
co-exist with existing legacy storage projects. For example:

```ruby
# Project's repository:
"@hashed/#{hash[0..1]}/#{hash[2..3]}/#{hash}.git"

# Wiki's repository:
"@hashed/#{hash[0..1]}/#{hash[2..3]}/#{hash}.wiki.git"
```
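As a minimal sketch, the path format can be reproduced outside GitLab with Ruby's standard `Digest` library. It assumes the hash is the SHA256 of the project ID rendered as a decimal string. The helper name `hashed_repository_path` is hypothetical and not part of GitLab:

```ruby
require 'digest'

# Hypothetical helper (not a GitLab API): builds the hashed storage path for a
# project ID, assuming the hash is SHA256 of the ID as a decimal string.
def hashed_repository_path(project_id, wiki: false)
  hash = Digest::SHA256.hexdigest(project_id.to_s)
  suffix = wiki ? '.wiki.git' : '.git'
  "@hashed/#{hash[0..1]}/#{hash[2..3]}/#{hash}#{suffix}"
end

puts hashed_repository_path(1)
# => @hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git
puts hashed_repository_path(1, wiki: true)
```

Because the path depends only on the ID, renaming the project leaves the result unchanged.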

### Translate hashed storage paths

Troubleshooting problems with Git repositories, adding hooks, and other tasks require you to
translate between the human-readable project name and the hashed storage path. You can translate:

- From a [project's name to its hashed path](#from-project-name-to-hashed-path).
- From a [hashed path to a project's name](#from-hashed-path-to-project-name).

#### From project name to hashed path

Administrators can look up a project's hashed path from its name or ID using:

- The [Admin Area](../administration/admin_area.md#administering-projects).
- A Rails console.

To look up a project's hashed path in the Admin Area:

1. On the left sidebar, select **Search or go to**.
1. Select **Admin Area**.
1. On the left sidebar, select **Overview > Projects** and select the project.
1. Locate the **Relative path** field. The value is similar to:

   ```plaintext
   "@hashed/b1/7e/b17ef6d19c7a5b1ee83b907c595526dcb1eb06db8227d650d5dda0a9f4ce8cd9.git"
   ```

To look up a project's hashed path using a Rails console:

1. Start a [Rails console](operations/rails_console.md#starting-a-rails-console-session).
1. Run a command similar to this example (use either the project's ID or its name):

   ```ruby
   Project.find(16).disk_path
   Project.find_by_full_path('group/project').disk_path
   ```

#### From hashed path to project name

Administrators can look up a project's name from its hashed relative path using:

- A Rails console.
- The `config` file in the `*.git` directory.

To look up a project's name using the Rails console:

1. Start a [Rails console](operations/rails_console.md#starting-a-rails-console-session).
1. Run a command similar to this example:

   ```ruby
   ProjectRepository.find_by(disk_path: '@hashed/b1/7e/b17ef6d19c7a5b1ee83b907c595526dcb1eb06db8227d650d5dda0a9f4ce8cd9').project
   ```

The quoted string in that command is the repository's directory on your GitLab server, relative to the
repository storage root and with `.git` removed from the end of the directory name. For example, on a
default Linux package installation the full directory is
`/var/opt/gitlab/git-data/repositories/@hashed/b1/7e/b17ef6d19c7a5b1ee83b907c595526dcb1eb06db8227d650d5dda0a9f4ce8cd9.git`.

The output includes the project ID and the project name. For example:

```plaintext
=> #<Project id:16 it/supportteam/ticketsystem>
```
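The conversion from an on-disk directory to the value passed to `find_by` can be sketched in plain Ruby. The helper name `disk_path_from_directory` is hypothetical and not part of GitLab. It assumes the directory contains an `@hashed/` component:

```ruby
# Hypothetical helper (not a GitLab API): keeps everything from `@hashed/`
# onward and drops the trailing `.git`, yielding the relative disk path.
def disk_path_from_directory(directory)
  index = directory.index('@hashed/')
  raise ArgumentError, "no @hashed/ component in #{directory}" unless index

  directory[index..].delete_suffix('.git')
end

puts disk_path_from_directory(
  '/var/opt/gitlab/git-data/repositories/@hashed/b1/7e/' \
  'b17ef6d19c7a5b1ee83b907c595526dcb1eb06db8227d650d5dda0a9f4ce8cd9.git'
)
# => @hashed/b1/7e/b17ef6d19c7a5b1ee83b907c595526dcb1eb06db8227d650d5dda0a9f4ce8cd9
```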

To look up a project's name using the `config` file in the `*.git` directory:

1. Locate the `*.git` directory. This directory is located under `/var/opt/gitlab/git-data/repositories/@hashed/`, where the first four
   characters of the hash name the first two directories in the path under `@hashed/`. For example, on a default Linux package installation the
   `*.git` directory of the hash `b17ef6d19c7a5b1ee83b907c595526dcb1eb06db8227d650d5dda0a9f4ce8cd9` would be
   `/var/opt/gitlab/git-data/repositories/@hashed/b1/7e/b17ef6d19c7a5b1ee83b907c595526dcb1eb06db8227d650d5dda0a9f4ce8cd9.git`.
1. Open the `config` file and locate the `fullpath=` key under `[gitlab]`.
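
Reading that key can also be scripted. The following is a minimal sketch of a parser for the file's contents, not a full Git configuration parser. The helper name `gitlab_fullpath` and the sample file contents are assumptions made for illustration:

```ruby
# Hypothetical helper (not a GitLab API): returns the value of `fullpath` in
# the `[gitlab]` section of a Git config file's contents, or nil if absent.
def gitlab_fullpath(config_text)
  section = nil
  config_text.each_line do |line|
    line = line.strip
    if (match = line.match(/\A\[\s*([\w.-]+)/))
      # A new section header such as `[core]` or `[remote "origin"]`.
      section = match[1].downcase
    elsif section == 'gitlab' && (match = line.match(/\Afullpath\s*=\s*(.+)\z/i))
      return match[1].strip
    end
  end
  nil
end

# Assumed sample contents; a real `config` file may hold other sections.
sample = <<~CONFIG
  [core]
    bare = true
  [gitlab]
    fullpath = it/supportteam/ticketsystem
CONFIG

puts gitlab_fullpath(sample)
# => it/supportteam/ticketsystem
```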

### Hashed object pools

Object pools are repositories used to deduplicate forks of public and internal projects and
contain the objects from the source project. Using `objects/info/alternates`, the source project and
forks use the object pool for shared objects. For more information, see
[How Git object deduplication works in GitLab](../development/git_object_deduplication.md).

Objects are moved from the source project to the object pool when housekeeping is run on the source
project. Object pool repositories are stored similarly to regular repositories, in a directory called `@pools` instead of `@hashed`:

```ruby
# object pool paths
"@pools/#{hash[0..1]}/#{hash[2..3]}/#{hash}.git"
```

WARNING:
Do not run `git prune` or `git gc` in object pool repositories, which are stored in the `@pools` directory.
This can cause data loss in the regular repositories that depend on the object pool.

### Group wiki storage

Unlike project wikis, which are stored in the `@hashed` directory, group wikis are stored in a directory called `@groups`.
Like project wikis, group wikis follow the hashed storage folder convention, but use a hash of the group ID rather than the project ID.

For example:

```ruby
# group wiki paths
"@groups/#{hash[0..1]}/#{hash[2..3]}/#{hash}.wiki.git"
```
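
A minimal sketch of the same convention for group wikis follows. It assumes the hash is the SHA256 of the group ID rendered as a decimal string. The helper name `group_wiki_path` is hypothetical and not part of GitLab:

```ruby
require 'digest'

# Hypothetical helper (not a GitLab API): builds the group wiki path,
# assuming the hash is SHA256 of the group ID as a decimal string.
def group_wiki_path(group_id)
  hash = Digest::SHA256.hexdigest(group_id.to_s)
  "@groups/#{hash[0..1]}/#{hash[2..3]}/#{hash}.wiki.git"
end

puts group_wiki_path(1)
# => @groups/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.wiki.git
```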

### Gitaly Cluster storage

If Gitaly Cluster is used, Praefect manages storage locations. The internal path used by Praefect for the repository
differs from the hashed path. For more information, see
[Praefect-generated replica paths](gitaly/index.md#praefect-generated-replica-paths-gitlab-150-and-later).

### Object storage support

This table shows which objects can be stored in each storage type:

| Storable object  | Hashed storage | S3 compatible |
|:-----------------|:---------------|:--------------|
| Repository       | Yes            | -             |
| Attachments      | Yes            | -             |
| Avatars          | No             | -             |
| Pages            | No             | -             |
| Docker Registry  | No             | -             |
| CI/CD job logs   | No             | -             |
| CI/CD artifacts  | No             | Yes           |
| CI/CD cache      | No             | Yes           |
| LFS objects      | Similar        | Yes           |
| Repository pools | Yes            | -             |

Files stored in an S3-compatible endpoint can have the same advantages as
[hashed storage](#hashed-storage), as long as they are not prefixed with
`#{namespace}/#{project_name}`. This is true for CI/CD cache and LFS objects.

#### Avatars

Each file is stored in a directory that matches the `id` assigned to it in the database. The
filename is always `avatar.png` for user avatars. When an avatar is replaced, the `Upload` model is
destroyed and a new one takes its place with a different `id`.

#### CI/CD artifacts

CI/CD artifacts are S3-compatible.

#### LFS objects

[LFS Objects in GitLab](../topics/git/lfs/index.md) implement a similar
storage pattern using two characters and two-level folders, following the Git implementation:

```ruby
"shared/lfs-objects/#{oid[0..1]}/#{oid[2..3]}/#{oid[4..-1]}"

# Based on object `oid`: `8909029eb962194cfb326259411b22ae3f4a814b5be4f80651735aeef9f3229c`, path will be:
"shared/lfs-objects/89/09/029eb962194cfb326259411b22ae3f4a814b5be4f80651735aeef9f3229c"
```
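
The pattern above can be wrapped in a small function for experimentation. This is a sketch. The helper name `lfs_object_path` is hypothetical, and the check that the `oid` is 64 hexadecimal characters is an added assumption rather than documented GitLab behavior:

```ruby
# Hypothetical helper (not a GitLab API): splits an LFS object `oid` into the
# two-level directory layout shown above.
def lfs_object_path(oid)
  # Assumption: the oid is a SHA256 digest, so 64 hexadecimal characters.
  raise ArgumentError, 'oid must be 64 hexadecimal characters' unless oid.match?(/\A\h{64}\z/)

  "shared/lfs-objects/#{oid[0..1]}/#{oid[2..3]}/#{oid[4..-1]}"
end

puts lfs_object_path('8909029eb962194cfb326259411b22ae3f4a814b5be4f80651735aeef9f3229c')
# => shared/lfs-objects/89/09/029eb962194cfb326259411b22ae3f4a814b5be4f80651735aeef9f3229c
```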

LFS objects are also [S3-compatible](lfs/index.md#storing-lfs-objects-in-remote-object-storage).

## Configure where new repositories are stored

After you configure multiple repository storages, you can choose where new repositories are stored:

1. On the left sidebar, select **Search or go to**.
1. Select **Admin Area**.
1. On the left sidebar, select **Settings > Repository** and expand the **Repository storage**
   section.
1. Enter values in the **Storage nodes for new repositories** fields.
1. Select **Save changes**.

Each repository storage path can be assigned a weight from 0-100. When a new project is created,
these weights are used to determine the storage location the repository is created on.

The higher the weight of a given repository storage path relative to other repository storage
paths, the more often it is chosen (`(storage weight) / (sum of all weights) * 100 = chance %`).
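
As a worked illustration of that formula, the following sketch converts a set of weights into percentages. The helper name `storage_chances` and the storage names are hypothetical. It does not reproduce how GitLab itself selects a storage:

```ruby
# Hypothetical helper (not a GitLab API): maps each storage name to its
# chance (in percent) of being chosen for a new repository.
def storage_chances(weights)
  total = weights.values.sum
  # With no positive weights the formula is undefined; report 0.0 for all.
  return weights.transform_values { 0.0 } if total.zero?

  weights.transform_values { |weight| weight * 100.0 / total }
end

chances = storage_chances('default' => 50, 'storage2' => 30, 'storage3' => 20)
chances.each { |name, chance| puts "#{name}: #{chance}%" }
# default: 50.0%
# storage2: 30.0%
# storage3: 20.0%
```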

By default, if repository weights have not been configured earlier:

- `default` is weighted `100`.
- All other storages are weighted `0`.

NOTE:
If all storage weights are `0` (for example, when `default` does not exist), GitLab attempts to
create new repositories on `default`, regardless of the configuration or whether `default` exists.
See [the tracking issue](https://gitlab.com/gitlab-org/gitlab/-/issues/36175) for more information.

## Move repositories

To move a repository to a different repository storage (for example, from `default` to `storage2`), use the
same process as [migrating to Gitaly Cluster](gitaly/index.md#migrate-to-gitaly-cluster).