Age | Commit message (Collapse) | Author |
|
There is a bug that prevents the NUL-byte being outputted for some
variations of `--format`. So instead, since ref names cannot have spaces
anyway, switch to using spaces.
|
|
Previously ref parsing was done in place. This replaces the parsing with
a shared implementation.
To enable the git output to be decoded with the same code the format was
changed to a consistent NUL byte separator. `PEELED` had to be
lower-cased as git was interpreting %00PEELED as an escape code.
|
|
Previously ref parsing was done in place. This replaces the parsing with
a shared implementation.
|
|
|
|
In anticipation of interpreting more reference listings from git, we
rename the constructor to be specific to the type of output that is
being decoded.
Most other refs output use a null-byte as a separator. So we make the
initial changes to make this separator configurable.
|
|
RefsDecoder decodes the output from git-show-ref. There is nothing
specific to backups here and moving it to the git package should allow
us to reuse the logic elsewhere.
|
|
[ci skip]
|
|
objectpool: Convert to use in-memory fetches
See merge request gitlab-org/gitaly!4038
|
|
Adds a description of how backups work
See merge request gitlab-org/gitaly!3924
|
|
|
|
|
|
testcfg: Try to fix ETXTBSY errors with gitaly-git2go executable
Closes #3869
See merge request gitlab-org/gitaly!4041
|
|
git: Do not install templates when creating repos
Closes #3285
See merge request gitlab-org/gitaly!4027
|
|
Update relative path in SetGeneration
Closes #3876
See merge request gitlab-org/gitaly!4019
|
|
catfile: Stop creating decorrelated spans
See merge request gitlab-org/gitaly!4042
|
|
supervisor: Fix bugs related to timeouts
See merge request gitlab-org/gitaly!4029
|
|
Reconciler is scheduling update jobs to replicas that are pending
renames. This began happening when we updated reconciler to join
the records by repository id in 2b19d8c5. As the renames previously
disjointed the records in `repositories` and `storage_repositories`
due to the `relative_path` changing, the reconciler didn't target
replicas that were pending rename. This commit changes the query to
only use replicas which are not pending a rename to act as a source
node for replication. Combined with a previous commit to update the
relative path in SetGeneration if necessary, this will restore the
same behavior the reconciler had earlier with renames; never using
replicas pending rename as a source but possibly creating a new
replica.
|
|
GetConsistentStorages has a regression where a replica pending rename
is considered up to date. Requests routed to the replica would fail as
the replica is still stored at the old path. Previously the records were
joined by the (virtual_storage, relative_path) which naturally considered
the rename pending replicas outdated. Until the relative_path is completely
removed from storage_repositories, this commit updates the query to ensure
the replica's relative path is the same as the repository's. This indicates
the rename operation has been applied.
|
|
Praefect previously allowed multiple replica records of a single
repository on a storage. This was possible if the repository was renamed,
as the primary key was the repository's virtual storage and relative path.
When the repository was renamed, the replica records that were not yet
renamed were considered to be a different repository. Usually this disparity
would be fixed by the rename replication job Praefect would schedule on renaming.
If the job failed, the repository would be left in it's old path. The reconciler
would eventually replicate the repository to the new path from another Gitaly
node.
Praefect is now using repository IDs and is enforcing that each Gitaly node has
a single copy of a repository due to the uniqueness of the repository ID. This
causes problems with the above flow as the replica left at the old path would have
the same repository ID as the new replica reconciler is replicating to. This is
causing problems on staging where some repositories had only some of the replicas
renamed and the reconciler is now facing unique constraint issues when attempting
to reconcile the repository to the new location. After the migration to unique
relative paths is completed, this won't really be an issue anymore as we won't be
moving the repositories on disks anymore and the renames are done atomically in the
database. For now though, Praefect needs to workaround this by updating the
replicas record to point to the relative path the replication job contained.
|
|
praefect: Fix inconsistencies in praefect sub-commands
See merge request gitlab-org/gitaly!4026
|
|
The only implementation of the Remote interface, which is part of the
localrepo interface, isn't used by anything anymore, and in fact we
don't want to gain any new users either: on-disk remotes are not used in
Gitaly anymore.
Remove both the implementation and its interface.
|
|
We already fetch tags into object pool via the refspec we pass to
git-fetch(1), which include tags. In addition though we also fetch tags
into the standard tag namespace "refs/tags/", which is simply wrong.
Fix this bug by passing `--no-tags` to git-fetch(1) such that we only
use our refspecs to fetch tags.
Changelog: fixed
|
|
There is only a single callsite left in our codebase which creates
on-disk remotes, which is the `FetchIntoObjectPool` RPC. We do not need
this remote for anything though, and we have in the past made the move
to only ever use in-memory remotes.
Convert the implementation to fetch from the repository path directly
without creating a remote first. Note that this implicitly fixes a bug:
when creating a remote, Git sets up a default fetchspec which we use in
addition to the explicit fetchspec we pass to git-fetch(1), and as a
result we create two refs for each branch we fetch. Now that we don't
create a remote anymore this bug is fixed for good.
Changelog: fixed
|
|
When fetching into an object pool, then we accidentally end up with
duplicated branches and tags because of how refspecs are set up.
Demonstrate these bugs by adding a new testcase.
|
|
Our build helpers which build commands on the fly as tests require them
first build the executable in a shared place and then copy them into the
per-test binary directory. In our CI, we have occasionally seen errors
when trying to execute the resulting binaries though, where execution
always fails with ETXTBSY. This error happens when trying to exec into
an executable that's been concurrently modified. This is quite a curious
failure mode: we only ever build commands once due to using `sync.Once`,
and the copied file is also closed before returning. So in theory, there
shouldn't be anything that's still modifying the file.
While digging a bit I stumbled over [1] by chance: as it turns out,
there used to be an issue with Docker where executing a file that's just
been chmod'ed will cause ETXTBSY. And that's exactly what we're doing:
we copy binaries into place and chmod them afterwards such that they're
executable. The root cause of this seems to be an issue with the overlay
filesystem in use, which handled this inconsistently from how it's
handled with a "normal" filesystem.
Let's try to fix the bug by hardlinking the file into place instead.
[1]: https://github.com/moby/moby/issues/9547
|
|
[ci skip]
|
|
[ci skip]
|
|
Processes spawned by the catfile cache are potentially decorrelated from
the main context such that these processes can be kept around after the
RPC context has been cancelled for later reuse. In order to make these
cached processes observable, we create all tracing spans for requests to
the catfile process both in the current per-RPC context and in the
decorrelated context hosted by the cache.
We have since observed that creating traces is comparatively expensive
for us given that it creates many allocations and also has quite some
runtime overhead. Doing this work twice per request only makes this
problem worse for us. While one way to fix this is the ongoing work to
allow batching requests, another angle to tackle this problem is to only
start creating one context.
Naturally, we want to keep the per-RPC spans such that one can easily
see what's going on in a given RPC. The information about how the
catfile process is used globally across different RPC calls isn't really
all that useful in the first place: while it gives a global picture, we
wouldn't ever see any activity outside of the context of an RPC anyway.
Drop support for creating decorrelated spans in the context of the
cache. This not only simplifies the code, but also optimizes away some
allocations and gives us a slight speedup. Before this change:
BenchmarkListAllBlobs/ListAllBlobs-16 8 133269398 ns/op 155884456 B/op 27954 allocs/op
BenchmarkListAllCommits/ListAllCommits-16 40 43940502 ns/op 9231545 B/op 32434 allocs/op
BenchmarkFindAllTags/FindAllTags-16 6 185349615 ns/op 31240846 B/op 134990 allocs/op
After this change:
BenchmarkListAllBlobs/ListAllBlobs-16 8 130457695 ns/op 155643219 B/op 25795 allocs/op
BenchmarkListAllCommits/ListAllCommits-16 39 40942865 ns/op 9171181 B/op 30244 allocs/op
BenchmarkFindAllTags/FindAllTags-16 6 172061555 ns/op 30800354 B/op 122930 allocs/op
We have roughly 10% less memory allocations and a slight speedup.
Changelog: performance
|
|
When we supervise a running process, then regularly poll consumed memory
by the process such that the user of the supervisor can kill the process
if it blows up too much. The intent is to do this once every 15 seconds,
but because we create the timer ticker before we have actually received
information about the process to be monitored it can cause us to poll
the process twice in a row when it takes more than 15 seconds to get
that information.
Fix the issue by resetting the timer before doing the select such that
we always wait for 15 seconds. Note that this also converts us to use a
`helper.TimerTicker`: the benefit is that its reset function knows to
drain already-pending events.
Changelog: fixed
|
|
When a supervised process is currently running, we have two different
timeouts on which we are listening: one to reset the crash counter when
the command has been running long enough, and one to re-send the "up"
notification to any potential listeners which is required because
sending the notification is done with a timeout. Given that we set up
both timeouts in the same `select` call it means that they are naturally
clashing with each other: if the crash reset timer is configured to be
shorter than the notification timer, then notifications won't ever be
resent. Otherwise, if the crash reset timer is configured to be longer
than the notification timer, then we won't ever reset the crash timer.
And if they're the same, then both timers race with each other.
Fix the issue by creating timer tickers instead which persist across
loops.
Changelog: fixed
|
|
Resetting `time.Timer`s is quite tricky in cases where the tick hasn't
been consumed by the caller given that one first has to stop the ticker,
then consume the event in case there is any and finally the ticker can
be reenabled by calling reset. Given that this is not quite obvious
without taking a closer look at `time.Timer`'s documentation this is
easy enough to miss.
Adjust our own `Reset()` wrapper in the helper package to do this dance
such that `Reset()` will reset the timer in all cases.
|
|
objectpool: Drop UnlinkRepostioryFromObjectPool RPC
See merge request gitlab-org/gitaly!4037
|
|
catfile: Refactor tracing to allow for batching
See merge request gitlab-org/gitaly!4031
|
|
git2go: Fix failure on MacOS X systems
See merge request gitlab-org/gitaly!4039
|
|
Because of the issue in the Go v1.17
https://github.com/golang/go/issues/46763
the git2go dependent tests was failing the
execution with fatal error. Update of the
golang.org/x/sys to newer version fixes this
problem.
|
|
The UnlikRepostioryFromObjectPool RPC is not used by anything anymore,
and the implementation from it is dangerous given that it doesn't
actually unlink a repository from its object pool: it only tries to
remove a remote named after the pool's project path, which we wouldn't
ever create in the first place. The RPC has thus been deprecated in
release v14.3.
Remove the RPC and its backing code.
Changelog: removed
|
|
Given that we now don't install templates when creating repos anymore in
our production code, we shouldn't do so either within our tests. Change
the gittest helper to also set "init.templateDir=" to adapt and fix
tests as required.
|
|
When executing git-init(1) or git-clone(1), then Git will first create
the target repository and then copy all files found in a specific
template directory into the target repository. The default templates,
which are typically installed in "/usr/share/git-core/templates",
contain example hooks, a default description and an exclude file which
explains the syntax of excludes. None of these files are required at
all, and they take up precious disk space without serving any immediate
purpose.
Furthermore, an upcoming change will give us the ability to distribute
Git versions as two executables, only. As a consequence, in such a setup
we wouldn't have any template directory available, which would lead Git
to complain about it not having been found.
Stop installing the template directory by injecting "init.templateDir="
into both git-init(1) and git-clone(1) to prepare for this change and
stop wasting disk space.
Changelog: changed
|
|
Back in the days, we used to set up Git hooks by creating a symlink to
our directory containing the actual hook scripts. This has long be
replaced by a much more robust mechanism which injects "core.hooksPath"
into Git processes to point to that directory instead. Since then, we
don't really need to setup hooks on-disk anymore, except if an admin
wants to create per-project server-side hooks.
Back when we used to create the symlink though we had to make sure that
all repos which got newly initialized did in fact have the symlink. To
do so, we used to call `CreateRepository()` after having set up repos,
which created the symlink for us. Given that the call is idempotent, it
was totally fine if the repository was already initialized: git-init(1)
simply re-copies the templates into the repository and creates anything
that's missing.
Nowadays this call is superfluous though: we have just created the repo,
we don't want to set up a symlink and thus it ultimately doesn't serve
any purpose at all anymore. So let's get rid of this useless step and
drop calls to `CreateRepository()` in `CreateRepositoryFromBundle()`,
`CreateRepositoryFromURL()` and `CreateFork()` and start to assert that
the hook directory is a directory, like it would be created by Git.
Note that this also removes the transactional voting step as performed
by CreateRepository. The voting in there is broken anyway given that it
doesn't guarantee any atomicity anyway. This is being remodelled in
another MR in the future which updates creation of repos to become
atomic, and dropping the CreateRepository call now goes into the same
direction.
|
|
In the catfile cache, we need to discern processes which are cached and
those which aren't cached, e.g. because they are not created in a
cacheable context. So while a catfile process always has two different
contexts with the per-RPC context and the decorrelated context stored in
the cache, these contexts may indeed be the same. But in fact, our
tracing infrastructure in that package will always create tracing
spans for both contexts, even if they're the same, which thus leads to
double-accounting of each request.
Fix the bug by only creating the secondary span in case it's different
from the primary per-RPC context.
|
|
When using the catfile interface, then each request to it will both
create an opentracing span and increment a counter. In total, this
results in quite a bunch of small short-lived allocations, where we have
observed in production that across all RPCs, 12% of all of our
allocations where done by opentracing. While useful, this cost is simply
too big to warrant us keeping it.
We're about to refactor the catfile interface to allow for more
efficient I/O patterns by using batched requests and buffering requests
to the process by using a request queue. Like this we optimize how we
write to the process so we have less syscalls and thus less context
switches. In this world, we can also refactor tracing: instead of
creating one span per requested object, we can instead create a single
span across the whole lifetime of this batched request.
Naturally, we cannot log as much details as we do right now, where each
span contains the revision we did request. Instead, we can tag the spans
with how many objects of each type we have requested in total. While
less detailed, this should be a good-enough tradeoff such that the spans
don't loose all of their benefits.
Refactor the catfile tracing architecture to prepare for this change and
to allow for batching of traces by storing per-request-type counters in
a new `trace` structure. The following benchmarks without and with this
change show a small reduction in allocations of about 2-3%, which
probably stems from the fact that we're not sending the revision
anymore:
BenchmarkListAllBlobs/ListAllBlobs-16 8 128331259 ns/op 155930660 B/op 28539 allocs/op
BenchmarkListAllCommits/ListAllCommits-16 39 37256301 ns/op 9203049 B/op 33162 allocs/op
BenchmarkFindAllTags/FindAllTags-16 6 172584078 ns/op 30899380 B/op 138956 allocs/op
BenchmarkListAllBlobs/ListAllBlobs-16 8 127551661 ns/op 155772826 B/op 27936 allocs/op
BenchmarkListAllCommits/ListAllCommits-16 38 38921473 ns/op 9223457 B/op 32429 allocs/op
BenchmarkFindAllTags/FindAllTags-16 6 195437102 ns/op 30978034 B/op 135009 allocs/op
So while this change is not really worth it on its own, it does prepare
us to more readily refactor the catfile cache to batch requests while
retaining useful tracing info.
Changelog: performance
|
|
When creating a new tracing span in the catfile package, then we need to
create those spans for two different contexts: the usual per-RPC context
and the decorrelated context that's stored as part of the cache. In
order to associate spans created in the decorrelated context with their
originating RPC, we tag the span with the correlation ID of the per-RPC
context. But because we pass arguments to `startSpan()` in the wrong
order, we accidentally did it the other way round: the per-RPC context
ends up with the correlation ID of the cached context.
Fix this bug by switching the order in which we pass the contexts.
|
|
git: Prepare tests using direct Git invocations for bundled Git
See merge request gitlab-org/gitaly!4028
|
|
The SSH upload-pack tests execute Git directly. This will stop working
as soon as we migrate to use a bundled version of Git given that we
wouldn't be able to locate git-upload-pack(1) anymore.
Convert the tests to instead use a `git.ExecCommandFactory()` to prepare
the clone commands and execute them to be prepared for this change.
|
|
The test HTTP server set up by the gittest package is using `exec.Cmd`
to spawn a Git executable directly. This will stop working when we
migrate to a bundled Git version because we'd be unable to locate the
git-http-backend helper.
The HTTP server requires the ability to set up stdout of the command
such that it can serve as a CGI handler. Extend the `gittest.ExecOpts`
helper to support setup of stdout and convert the HTTP server setup to
use it.
|
|
One of our gitaly-hooks tests which exercises the pack-objects hook is
using `os.Exec()` directly to spawn a Git command. This will stop
working when we migrate to using a bundled Git version given that we
wouldn't be able to locate Git's exec prefix anymore.
While this testcase is a good candidate to use `gittest.ExecOpts()`,
what is still missing is the ability to pass additional environment
variables to the spawned command. Extend the config struct to allow for
this and convert the test.
|
|
The `gittest.ExecStream()` helper takes an additional reader which gets
used as the spawned command's standard input. In order to make more use
of this helper though we'll need to add additional arguments, which
would eventually become quite unwieldy to use.
Convert the function to instead take the reader via a config struct such
that it's more readily extensible.
|
|
The testcase for cloning from a redirecting server uses `exec.Cmd` to
spawn the Git executable directly. While this works now, it will stop
working as soon as we bundle Git as a binary given that we wouldn't be
able to find the git-remote-http helper anymore.
Prepare for this change by using a `git.ExecCommandFactory` instead to
spawn the command.
|
|
Some of our tests in the server authentication tests are running a
Gitaly server with an empty config, which is not a valid configuration.
Fix this by using the testcfg package to build a valid config for us.
|
|
A test for the pack-objects hook is writing logs into Gitaly's source
tree. This is something we don't want to do nowadays, where all tests
should instead use the temporary test directory to create their data.
Convert the test to store the log in a `testhelper.TempDir()` instead.
|