Age | Commit message (Collapse) | Author |
|
|
|
git: Sync internal namespaces with distros' settings
Closes #4237
See merge request gitlab-org/gitaly!4562
|
|
Update "Support Request" template in line with Gitaly's intake workflow.
See merge request gitlab-org/gitaly!4552
|
|
Handle repository creations, deletions and renames atomically
Closes #4003 and #3485
See merge request gitlab-org/gitaly!4101
|
|
ruby: Fix diverging Git configuration in Go and Ruby
Closes #4236 and #4230
See merge request gitlab-org/gitaly!4557
|
|
[ci skip]
|
|
The Ruby sidecar is still setting up `core.autocrlf` in some cases. Now
that we inject Git configuration via environment variables into the Ruby
sidecar though it's not required anymore to set this configuration:
- While libgit2 doesn't read the environment variables we're
injecting, it is not in fact using `core.autocrlf` in any of the
code paths we hit in the Ruby sidecar.
- The Git commands we spawn, of which only git-symbolic-ref(1) and
git-fetch(1) are left, don't care for the value of this config
either. But even if they did, the Git command factory does know to
set this value in the global Git configuration and thus we'd also
get it injected into the Ruby sidecar via environment variables.
So let's stop setting this key manually given that it's not needed
anymore.
|
|
Right now, we have multiple sources of truth for how Git is configured.
In Go, most of the configuration is handled centrally by Gitaly itself,
includes mandatory configuration that must always be set. This config
supersedes and overrides any values in the global gitconfig that is
installed by both Omnibus and CNG. The goal here is that Gitaly becomes
the single source of truth for the Git configuration eventually and that
it starts ignoring any system- or global-level configuration that exists
in the filesystem. With this schema, we have a bunch of benefits:
- We exactly know about the configuration that Git commands are in.
- We don't need to rely on the repository's gitconfig anymore like
we did in the past.
- We have much tighter control over the execution environment and
can also easily set up per-command configuration.
- We can conditionally inject Git configuration based on the Git
version that's currently running.
While we already live in this world for quite a while in our Go code
base, the Ruby sidecar is a different story. Here we don't use any of
the central configuration yet, but instead we still rely on the files
installed by both CNG and Omnibus. Which effectively means that we
cannot currently stop installing them, or otherwise some important
config entries that we do alread inject in Go might get lost.
Unfortunately, this has come to bite us just now where we're close to
removing the sidecar altogether: starting with Git v2.36.0, it prints a
warning when `core.fsyncObjectFiles` is set. This means that we're
forced to conditionally either inject this configuration when running
with an older Git version, or alternatively we must inject the new Git
configuration `core.fsync`. And while this is easy enough to do in Go
land, the sidecar can't do that at all because it is forced to rely on
the gitconfig files installed by distributions.
Let's fix this issue by injecting the Git configuration as set up by the
Git command factory in the Ruby sidecar via environment variables. This
ensures that Git commands spawned by it will share their configuration
with what Go uses, including `core.fsyncObjectFiles` or `core.fsync`
depending on the current Git version. Given that we now know that those
options are always set in the context of Ruby we can then strip away the
old setting that is causing warnings in both CNG and Omnibus.
Changelog: fixed
|
|
The configuration values returned by `fsckConfiguration()` are not
ordered deterministically because we're iterating over a map. While this
doesn't matter in production, it makes things harder to test.
Refactor the code to instead loop over an array to fix this.
|
|
We'll need to convert config pairs into Git configuration environment
variables so that we can inject them into the Ruby sidecar. Extract the
code that does this from `WithConfigEnv()` into a separate helper to
prepare for this change.
|
|
Extract the construction of global Git configuration that applies to
every spawned Git command into its own function. This is a preparatory
step for implementing a function that assembles the Git configuration as
required for the Ruby sidecar.
|
|
Extended ListCommits to support filter by commit message
See merge request gitlab-org/gitaly!4421
|
|
testcfg: Fix building Go binaries with Go 1.18
Closes #4178
See merge request gitlab-org/gitaly!4561
|
|
cgroups: Repository isolated cgroups
See merge request gitlab-org/gitaly!4520
|
|
repository: Exclude alternate object directories in repository size
See merge request gitlab-org/gitaly!4558
|
|
TestRemoveRepository has some custom assertions to check whether DB
records exist or not. This commit replaces those assertions by a call
to RepositoryExists. RepositoryExists also checks in the database
whether the repository exists or not, so this just replaces custom
assertions by more use of production code.
|
|
The RemoveRepository subcommand is working around the lack of atomicity
in Praefect's repository removals. It manually cleaned up database state
and issued deletes Gitalys to ensure deletes go through even if there
is no metadata for the repository. This ensured there would be no conflicts
when recreating a repository after running the command.
These workarounds are no longer necessary as Praefect is now handling
deletions. As soon as the repository's metadata is deleted, it stops
existing from Praefects point of view. New repository created in the same
virtual storage and relative path lands in a different directory on the
Gitalys. As such, we can simplify the command and simply call Praefect's
RemoveRepository to delete the repository's metadata and to delete the
known replicas.
The command was changed to error if the repository does not exist as it
can't do anything if so. Likewise, the output was changed to not mention
the nodes the repository exists on as it's mostly irrelevant. The deletion
may succeed as far as the user cares but some replicas may still be left
on the disk if Praefect was unable to remove them.
Changelog: changed
|
|
Renaming, creating and deleting repositories is racy in Praefect.
They can also partially fail in awkward ways due to Praefect first
applying the operations on the disks of the Gitalys and later updating
its metadata store. In order to make these operations atomic, there's
been ongoing work in the background to make it possible to perform these
in the database in a manner that races are not possible and partial
failures do not end up conflicting with future operations.
Renames can be fixed by doing the rename atomically in the database
without moving any repositories on the disks.
Deletes can be modeled as simply deleting the repository's database record.
Creates are atomic by creating the repository's database record as the
last step in the process.
The last piece of the puzzle is to ensure repositories always land in different
directories on the disk. This ensures that a partial failure doesn't block
a future operation from succeeding. This commit implements that piece by deriving
a unique path for each repository to store their replicas. Existing repositories
stay where they are but newly created repositories will land in unique directories.
Create might succeed on a disk but fail to be persisted in the database. Prior to
this change, attempting to recreate a repository might fail due to the stale state
on the disk. With this in place, the next attempt at creating the repository would
still succeed as the new attempt would land the repository in a different directory
on the disk.
Deletes have the same problem prior to this commit. The repository's metadata may
be successfully deleted but if we fail to delete the repository from the disk, future
creations and renames may fail if they conflict with the stale state. As creates now
always land in a different directory, the stale state no longer causes problems.
Renames will work purely in the database, so any stale state on the disk will not affect
them.
Changelog: fixed
|
|
Praefect currently handles RenameRepository like every other mutator.
This is problematic as Praefect ends up renaming repositories first on
the disks of the Gitalys and then in the database. If only the first
operation succeeds, Praefect has effectively lost track of the repositories.
With Praefect now generating unique relative paths for each repository,
there's no need to actually move the repositories on the disks anymore.
It is sufficient to rename the repository only in the database and leave the
replicas in their existing locations on disk. This commit intercepts
RenameRepository in Praefect and renames repositories only in the database.
The atomic rename handling will be rolled out progressively with Praefect
generated replica paths as the change does not make sense on its own.
To do so, the RenameRepositoryFeatureFlagger checks the replica path of
the repository and only applies the new logic if the repository has a
Praefect generated replica path.
Changelog: fixed
|
|
This commit adds RenameRepositoryInPlace to RepositoryStore. The new method
will replace RenameRepository which can be removed in a later release.
Unlike RenameRepository, renames doesn't change the replica path stored in
the database. As Praefect is now routing using the replica_path, this enables
Praefect to rename repositories atomically in the database without necessiating
any disk changes. The old RenameRepository is left in place so any in-flight rename
jobs can still be processed by it.
|
|
repository: Log repo size calculations
See merge request gitlab-org/gitaly!4556
|
|
Gitaly is relying on the @pools prefix in OptimizeRepository to
avoid pruning object pools. Pruning object pools could lead to data
loss if some pool members still need the pruned objects. To ensure
Gitaly can identify object pools from the other repositories after
their relative paths have been rewritten, this commit adds the
DerivePoolPath function that will be used in the router to derive
the replica paths for object pools. Doing so, the replica paths have
been changed to include a @cluster prefix, which allows for grouping
the cluster's repositories and pools under a common directory. Using
the existing @Pools directory would also be possible as Rails hashes
the repository ids where as Praefect doesn't. However, it's clearer
to separate them in different directories and leave ownership to each
service minting paths so they can ensure independently there are no
conflicts.
|
|
To test scenarios where the replica pointed to by a metadata record
does not exist, the verifier deletes replicas from Gitalys as part of
its test setup. It does so by pointing the calls to the relative path
and does not resolve the replica path. This causes the tests to fail
when Praefect generates unique replica paths. This commit resolves the
replica path in the tests so the deletions always target the correct
repositories. As the helpers are mostly designed with Gitaly in mind,
the GetReplicaPath helper currently expects a Gitaly config which it
will use to dial the service. In Praefect's context, we don't have such
a config available nor do we even have the address of the server. This
commit expands the helper to take a ClientConn in to use optionally as
the other helpers do.
|
|
The cgroups key did not have a delimeter. This can potentially cause key
collisions between two different sets of storage/repository path. Add a
"/" as a delimeter in the key that gets hashed.
|
|
In be0ee06 (Introduce new Cgroups config 2022-04-17) we introduced a new
Cgroups configuration that supports repository-isolation for command
execution.
In this change, we implement the repository-based isolation cgroups.
Commands are placed into cgroups based on the repository path, so that
all commands for a given repository will end up in the same cgroup. This
helps to prevent one user from potentially starving the whole system of
memory or CPU.
Changelog: changed
|
|
When calculating repository size with rev-list, it will count the
objects that reside in a pool repository. We don't want to count pool
repository storage as part of the repository storage, so include an
option to exclude counting any objects that live in an alternate object
directory.
Changelog: changed
|
|
To prepare for tests that need to call the object pool service, include
the object pool service in the server setup. Also, add a method to
create an object pool client.
|
|
Now that we have an ExcludeAlternates, Excludes feels too general of a
variable. Rename this to ExcludeRefs to make it clear that it's a slice
of ref globs that are going to be excluded from the repo size
calculation.
|
|
Our testcfg package needs to build some of our Go binaries ad-hoc to
make them available to our testing infrastructure. With Go 1.18 this
functionality is broken though: Go tries to automatically detect VCS
information to embed it into the resulting binaries. But because we're
intercepting all Git invocations that are resolved via the system's
`PATH` variable in order to verify that we don't ever do that, Go will
also use that intercepting script and consequentially get an error.
Fix this regression by setting up a Git command factory and tweaking the
environment to make Go use its binaries.
Changelog: fixed
|
|
There are times when we need to exclude the alternates directory from
the repository size calculation. Provide an option to do so.
|
|
Now that we are trying to extend ListCommits rpc of CommitService to
include various cases. It's updated to include functionality provided
by CommitsByMessage rpc and also support regex ignore case flag.
Changelog: changed
|
|
operations: Return detailed errors for hook failures in UserMergeBranch
See merge request gitlab-org/gitaly!4549
|
|
Repositories hosted by Gitaly have several internal reference namespaces
that shouldn't be modified by clients e.g. by pushing to them. And while
we have the infrastructure in place already to ensure that writes are
restricted, the list of internal namespace is not in sync with the list
of namespaces configured in both Omnibus and CNG. Most notably, we're
missing both the `refs/tmp/` and `refs/remotes/` namespaces.
Add the missing namespaces to ensure that we're not regressing our
safety guards in this area when we remove the gitconfig in our distros.
Changelog: changed
|
|
Once the new configuration rolls out, we need to support old
configurations for the time being until 16.0 when we can deprecate the
old cgroups config. Until then, allow for a translation layer that
translates the old version to the new version.
|
|
Because we are switching to a new way of calculating repo sizes, it
would be helpful to know what the discrepancy is so that if it is large
and unexpedted, there's a way to know.
Changelog: changed
|
|
|
|
doc: Update Protobuf verification docs
See merge request gitlab-org/gitaly!4551
|
|
The 'Contributing' section of the Protobuf doc still references the old
`gitaly-proto` project and incorrectly suggests that the source files
generated by `protoc` may vary with Go version.
This removes the outdated and incorrect information and adds a new
section with details on the `make` targets for linting and validating
protos.
|
|
Revert "Merge branch 'pks-makefile-workaround-build-id-rebuilding' into 'master'"
See merge request gitlab-org/gitaly!4550
|
|
'master'"
This reverts commit 94a955f7bac56cb8f524f43a7773038e6e341585, reversing
changes made to 708408a8ad99f942c9cfd40f43ec11b961d31846.
|
|
downgrade json gem to match gitlab version
See merge request gitlab-org/gitaly!4547
|
|
Makefile: Fix rebuilding Go binaries in place to add GNU build ID
See merge request gitlab-org/gitaly!4544
|
|
Fix Default templates
See merge request gitlab-org/gitaly!4545
|
|
Custom server-side hooks provide a feature where if they return an
error, anything that was printed to stdout or stderr and is prefixed
with "GL-HOOK-ERR":" will be displayed in the web interface. The
interface to make this work between Rails and Gitaly is really fragile
though: Rails relies on Gitaly passing up the hook error without any
further decorations so that it can directly check whether the error has
the specified prefix.
This calling convention is not documented in any public interface, but
it has been like this since Gitaly has been born out of Rails. Of
course, people repeatedly didn't know about this small detail and would
improve error messages in a quest to make logging more useful in this
context. And they cannot be blamed: the calling convention is simply
bad.
Now that we have bought into gRPC's detailed error model though we can
handle this problem in a much better way. Instead of relying on the
exact error string returned by Gitaly, we use the new `CustomHookError`
Protobuf message and directly tell Rails what the standard output and
standard error streams contained. This will eventually free up the use
of our errors so that we can make them meaningful, all while finally
having an explicit and documented calling convention to bubble up hook
errors to the caller.
Note that this change is not done behind a feature flag: the actual
error message returned to Rails remains the same for now. We only use a
proper error code now and embed the detailed error. And the way Rails
inspects errors right now, nothing changes. Furthermore, it is currently
broken anyway, so there doesn't seem to be much of a point to be careful
and try not to break something that is already broken.
Changelog: changed
|
|
The `CustomHookError` doesn't carry enough information right now to tell
callers all about the context it's been invoked in. Most importantly,
the error doesn't contain any information about the hook type that was
returning an error, and this may very well be information that callers
want to know given that there are different guaranteets with regards to
data persistence depending on which hook ran.
Improve the error to also record the `hookType` that has failed.
|
|
We have a `HookError` type that gets returned whenever the execution of
a Git hook fails. In general though the informaton that this error holds
is not of any importance, but we only really care about special-handling
errors raised by custom hooks.
Refactor the error to a more specialized `CustomHookError` to make it
more useful.
|
|
We're returning specific `HookError`s from our hook updater because
Rails cares about the standard output and error streams of custom hooks
so that it can decide whether to display any messages contained in
either of them in the web interface. So while it does make sense to
pass up the hook outputs for our normal hooks, we're also creating a
`HookError` for the reference-transaction hook. This doesn't make a
whole lot of sense though: we do not support custom hooks for this hook
type, and thus the standard streams shouldn't contain any errors that
might want to be displayed in the UI.
Refactor the code to return normal errors when the execution of the
reference-transaction hook fails. This prepares for a refactoring where
we convert the `HookError` type to a `CustomHookError` type.
|
|
Our error handling for custom hooks is extremely fragile: when a custom
hook returns a non-zero value, then we must make sure to pass either its
stderr or stdout to Rails as error message without any changes at all.
This is because Rails knows to check for a specific string prefix, and
if the error message has that prefix then the error is slated to be
displayed in the web interface. This means that we as Gitaly _must not_
change the error string at all.
Of course this has repeatedly been broken in the past, and is broken
right now once again. And that's not really surprising: no component of
the stack should rely on parsing exact error messages. Furthermore, any
error messages logged by Gitaly are ultimately not pointing to the real
location in our code, but may be an arbitrary user-controlled string and
thus ultimately not helpful.
Introduce a new detailed error type `CustomHookError` that allows us to
instead pass both standard output and standard error to the caller
without having to make any compromises on the error messages we
generate. Having this explicit calling interface makes it easy to see
for the caller what they may expect, it's hard for us to break, and also
extremely helps with regards to discoverability.
Changelog: added
|
|
We have roughly three sections in our Makefile:
1. The section declaring all variables and build options.
2. The section declaring phony targets like `build` and `install`.
3. The section declaring actual recipes to generate the artifacts.
While we mostly abide by these sections, the recipes to build our Go
binaries are intermixed into the phony-targets-section.
Move them down so that they're together with all the other recipes.
|
|
Back when we added support for GNU build IDs to our binaries we started
building our Go binaries twice: the first time we do it so that we can
derive a deterministic GNU build ID from the Go build ID, and the second
time to embed that derived GNU build ID into the final binary. This has
two problems:
1. We build the binary twice, and even though Go caches most of the
build process this still significantly slows down incremental
builds of our binaries.
2. We're rebuilding the binary in-place by overwriting the binary
with no GNU build ID with the one that contains the GNU build ID.
While I'm not a 100% sure, this seems to leads to issues from
time to time where the resulting Go binary may be invalid when
the build got cancelled at the wrong point in time. This then
broke subsequent rebuilds of the binary.
Ideally, we wouldn't have to care about generating a deterministic GNU
build ID at all. But unfortunately, the only part of Go's build infra
that supports them is `go build`, so we have no easy way to avoid the
rebuild.
Instead, we can use a very ugly workaround though: when building the
binary, we embed a fixed GNU build ID with a known string and put this
binary into an intermediate location. We now derive the GNU build ID
from that intermediate binary, but instead of rebuilding it we simply
replace the known GNU build ID with the derived GNU build ID. Like this
we don't have to rebuild the binary but still get the same end result as
before.
This is implemented via a new naive Go tool that does this replacement
for us. Note that we cannot use e.g. sed(1) for this, and we don't want
to start depending on new tools like xxd(1). The tool is simple enough
though and allows us to have some additional safeguards to verify that
we are unlikely to wreak havoc on the binary.
|