|
To bring `gitaly-bench` closer to feature parity with the existing
Ansible benchmarking script, start capturing Gitaly's logs and write
them to `<OUT_DIR>/gitaly.log`.
We use `journalctl` with the poorly documented `_PID` option to do this.
This requires us to capture the logs _before_ stopping Gitaly.
|
|
Add the new `gitaly-bench` binary for coordinating starting and stopping
Gitaly for benchmarking.
The tests at this level do not attempt to verify anything other than
the basic CLI output, as `gitaly-bench` requires a benchmarking
environment to run successfully. In particular, we need to have a
`gitaly` service available in `systemd` for invocations to succeed.
|
|
Add the `client` subcommand to `gitaly-bench`, which is run on the
host that will run benchmarking requests against Gitaly. This sends
JSON-encoded commands to the `Coordinator` via TCP, ordering it to start
and stop Gitaly for benchmarking.
Details of the RPCs to benchmark and the specific queries to run are
stored on-disk and located at runtime.
|
|
Currently we use Ansible to run benchmarks against Gitaly, but this
requires us to run benchmarks serially. Ansible cannot divide a group of
tasks between hosts, so we're limited to a single Gitaly host. The goal
of this tool is to enable us to create `n` Gitaly hosts and run
benchmarks in parallel against them.
This initial commit creates the `Coordinator`, which is run on the
Gitaly hosts being benchmarked and is responsible for starting and
stopping Gitaly when commanded by the client machine. The `Coordinator`
listens on a separate TCP port from Gitaly, running until an error
occurs or it is told to exit by the client.
In the future the ability to capture Gitaly's logs and run profiling
will be added.
|
|
[ci skip]
|
|
statushandler: Don't lose original error when converting error codes
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5532
Merged-by: Justin Tobler <jtobler@gitlab.com>
Approved-by: Justin Tobler <jtobler@gitlab.com>
Approved-by: Will Chandler <wchandler@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
Easier major version upgrades
Closes #4894
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5505
Merged-by: James Fargher <proglottis@gmail.com>
Approved-by: Sami Hiltunen <shiltunen@gitlab.com>
Reviewed-by: Sami Hiltunen <shiltunen@gitlab.com>
Co-authored-by: James Fargher <jfargher@gitlab.com>
|
|
gittest: Use the actual gitaly default branch when creating repositories
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5480
Merged-by: Will Chandler <wchandler@gitlab.com>
Approved-by: Sami Hiltunen <shiltunen@gitlab.com>
Approved-by: Will Chandler <wchandler@gitlab.com>
Reviewed-by: Sami Hiltunen <shiltunen@gitlab.com>
Co-authored-by: James Fargher <jfargher@gitlab.com>
|
|
Add gRPC load balancing documentation
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5498
Merged-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Evan Read <eread@gitlab.com>
Approved-by: Justin Tobler <jtobler@gitlab.com>
Reviewed-by: Evan Read <eread@gitlab.com>
Reviewed-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Reviewed-by: Justin Tobler <jtobler@gitlab.com>
|
|
gitaly/config: Support generating configuration via external command
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5525
Merged-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Sami Hiltunen <shiltunen@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Sami Hiltunen <shiltunen@gitlab.com>
Reviewed-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
In the context of FIPS we need a mechanism that allows us to not store
any credentials in plaintext format on-disk. Ideally, we would hook into
a solution like AWS Secrets or Kubernetes Secrets that lets us retrieve
the secrets at startup time so that they don't have to exist on disk in
any form at all. But there exist a bunch of different solutions for
secrets management, and many make sense in one context but not in
others.
So implementing support for e.g. AWS Secrets directly would put us into
a position where we need to argue which solutions we want to support
and which we don't. This shows that directly supporting any of them is
not the right way to go; instead, we should support a general scheme
that allows administrators to easily plug in whatever they are using
without too much hassle.
A simple and boring solution to this is to implement support for running
an external command on startup that generates the secrets for us. This
way, an administrator can simply provide a script that runs e.g. `aws
get-secret-value` and Gitaly would know to put the secrets into the
correct spot in our configuration. But that again has another issue:
there are several different fields that may contain secrets, and making
sure they can all be adjusted might be tedious.
To fix that, we simply generalize the idea: why only care about secrets,
when we can instead provide a mechanism that allows the external command
to generate the complete configuration which we then merge back into the
initial configuration read from disk?
The implementation of this is straightforward: we read the initial
configuration from disk, and when a specific key is set we use its value
as the executable that is to be spawned. We expect the executable to
generate JSON, which we then deserialize and merge into the config
structure we have already parsed. This way, the command is free to only
generate the configuration it actually wants to change at runtime, and
all the other configuration stays as-is.
A simple external executable generating secrets via the AWS command line
client could thus look like this:
    #!/usr/bin/env ruby
    require 'json'
    JSON.generate({"gitlab": {"http_settings": {"password": `aws get-secret-value --secret-id ...`}}})
But theoretically speaking, you could generate the complete
configuration of Gitaly this way.
Changelog: added
|
|
When serializing fields in the TOML format, we do not omit fields that
are empty. This causes us to e.g. convert a `nil` array of values into
an empty array of values, which may or may not have the same semantics.
We need to really only print what matters in two different
contexts:
- We will introduce a new command that writes the effective Gitaly
configuration to its standard output. This can be done to verify
whether the actual configuration matches your expectations,
including generated default values. By omitting empty fields here
we will only print what's relevant.
- We will introduce tests that need to serialize the data structure
to a reader in the TOML format and then round-trip it via
`config.Load()`. This does not currently work as expected due to
the above conversion.
Add the `omitempty` keyword to the fields' TOML tags. As there is no
code path right now that would ever serialize the configuration, this
change cannot have any impact on existing code paths. Add a test that
verifies that an empty configuration now round-trips as expected.
|
|
We're about to introduce a way to generate configuration via an external
process. As generating JSON is more widely supported than generating
TOML in general, we'll use JSON as the serialization format.
Add `json` tags to the configuration fields as a preparatory step.
|
|
Add MinOccurrences cache middleware
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5501
Merged-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Reviewed-by: Jacob Vosmaer <jacob@gitlab.com>
Reviewed-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Reviewed-by: Justin Tobler <jtobler@gitlab.com>
Co-authored-by: Jacob Vosmaer <jacob@gitlab.com>
|
|
Inject the MinOccurrences middleware into the streamcache call stack
and make it configurable via the
GITALY_PACK_OBJECTS_CACHE_MIN_OCCURRENCES environment variable.
This is a temporary configuration mechanism that will allow us to
validate the MinOccurrences middleware on GitLab.com quicker. If it
works as expected we will replace the env var with normal config.toml
configuration logic.
|
|
This adds a new streamcache middleware called MinOccurrences. Its
purpose is to only forward requests to the inner cache for keys that
have been seen a minimum number of times.
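A minimal, non-concurrent sketch of that counting behaviour; the real middleware wraps the streamcache interface and is goroutine-safe, and all names here are illustrative:

```go
package main

import "fmt"

// minOccurrences counts how often each key was requested and only reports
// a key as cacheable once it has been seen at least min times. Not
// goroutine-safe; the real middleware is.
type minOccurrences struct {
	min  int
	seen map[string]int
}

func (m *minOccurrences) forward(key string) bool {
	m.seen[key]++
	return m.seen[key] >= m.min
}

func main() {
	m := &minOccurrences{min: 2, seen: map[string]int{}}
	fmt.Println(m.forward("deadbeef")) // false: first request bypasses the cache
	fmt.Println(m.forward("deadbeef")) // true: second request goes to the inner cache
}
```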
|
|
repository: Test refactorings and SHA256 object format support
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5508
Merged-by: Pavlo Strokov <pstrokov@gitlab.com>
Approved-by: Pavlo Strokov <pstrokov@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Pavlo Strokov <pstrokov@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
When observing context errors, the statushandler middleware converts
the error code to either `context.DeadlineExceeded` or
`context.Canceled`. This is done using the "%v" verb, which causes us
to lose the original error. And if the original error was a
`structerr`, this can cause us to lose any error metadata attached
to it.
Fix this by using "%w" instead and explicitly overriding the gRPC error
code via `WithGRPCCode()`.
Changelog: fixed
|
|
The statushandler middleware is responsible for converting errors into
either a `DeadlineExceeded` or `Canceled` error when the RPC has been
canceled via the context. This functionality recently broke when we
changed behaviour of the `structerr` package around how gRPC errors are
propagated in 076d3b6bb (Merge branch 'pks-structerr-unknown-by-default'
into 'master', 2023-03-17).
The fact that things could break indicates that we are lacking test
coverage for cases where the RPC implementation returns an error with a
specific gRPC error code which should then be overridden via the context
error. Add tests that would've caught the regression.
|
|
The testcases for the Unary and Stream statushandler middleware are
duplicated. Merge them and use subtests.
|
|
repository: Drop size calculations via git-cat-file(1)
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5511
Merged-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Approved-by: John Cai <jcai@gitlab.com>
|
|
We have three different ways to calculate the repository size:
- The plain old way of using du(1) on the repository. This is fast,
  but also counts repository metadata, unreachable objects and
  duplicate objects.
- The implementation based on git-rev-list(1). It allows us to count
  only objects that are reachable via specific references and is
  overall the most flexible approach in how we calculate the size.
  The downside is that it is really slow compared to du(1).
- The implementation based on git-cat-file(1). This was added as a
  compromise between the other two options so that we can at least
  account for duplicate objects. It's faster than using
  git-rev-list(1), but still slower than du(1).
While the latter two options were implemented quite a while ago, we
still haven't managed to roll them out due to performance reasons. By
now it is clear though that we can only choose between the correct and
flexible approach or the fast and dirty approach, which means that in
the long run, our target architecture is going to be the approach based
on git-rev-list(1).
This leaves the third implementation based on git-cat-file(1) on the
chopping block. It was implemented as a middle ground after we saw how
slow the git-rev-list(1)-based approach is, so that we could at least
discount duplicate objects while being reasonably fast. But it did not
deliver on that promise, as it still wasn't fast enough without also
changing the overall architecture of how repository sizes are
calculated. So this approach is disfavoured nowadays.
Remove the git-cat-file(1) based implementation. We are not going to use
it anyway.
Changelog: removed
|
|
|
|
go: Update module gocloud.dev to v0.29.0
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5486
Merged-by: John Cai <jcai@gitlab.com>
Approved-by: Pavlo Strokov <pstrokov@gitlab.com>
Approved-by: John Cai <jcai@gitlab.com>
Co-authored-by: GitLab Renovate Bot <gitlab-bot@gitlab.com>
|
|
|
|
go: Update module gitlab.com/gitlab-org/labkit to v1.18.0
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5492
Merged-by: John Cai <jcai@gitlab.com>
Approved-by: James Fargher <proglottis@gmail.com>
Approved-by: John Cai <jcai@gitlab.com>
Co-authored-by: GitLab Renovate Bot <gitlab-bot@gitlab.com>
|
|
go: Update golang.org/x/exp digest to 9ff063c
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5458
Merged-by: John Cai <jcai@gitlab.com>
Approved-by: Will Chandler <wchandler@gitlab.com>
Approved-by: John Cai <jcai@gitlab.com>
Co-authored-by: GitLab Renovate Bot <gitlab-bot@gitlab.com>
|
|
|
|
housekeeping: Keep timestamp of last full repack
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5527
Merged-by: Justin Tobler <jtobler@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Approved-by: Justin Tobler <jtobler@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
statushandler: Don't wrap context errors
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5530
Merged-by: Justin Tobler <jtobler@gitlab.com>
Approved-by: John Cai <jcai@gitlab.com>
Approved-by: Justin Tobler <jtobler@gitlab.com>
Co-authored-by: Will Chandler <wchandler@gitlab.com>
|
|
When returning a `structerr`, we will check for wrapped errors and
return the inner gRPC code when present. However, with context errors we
want to ensure that the `Canceled`/`DeadlineExceeded` statuses override
any previously generated codes.
Wrapping the errors caused elevated error rates on `InfoRefsUploadPack`
as the inner error of canceled processes was `Internal`.
Use '%v' in our format strings for context-related errors so that we
return the expected status.
Changelog: fixed
|
|
repository: Refactor RepositorySize to not rely on du(1)
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5517
Merged-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Reviewed-by: James Fargher <proglottis@gmail.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
Add a new metric to the housekeeping manager that tracks the time since
the last full repack. This metric will give us a better picture of how
often we're currently doing full repacks. Equipped with this information
we can eventually come up with a good estimate of how often we should be
doing full repacks in a geometric-repacking world.
|
|
We have started writing the timestamp of the last full repack into
repositories. Let's start tracking it via the git/stats package so that
this information becomes available via logs.
|
|
At the moment, we keep track of the last time a full repack happened
in a repository by simply taking the timestamp of the oldest packfile.
This only really works because we have a split between full and
incremental repacks.
This is about to change soon though, once we implement support for
geometric repacking of repositories. In geometric repacking, we no
longer control whether old packfiles get rewritten, but instead shift
that burden to Git. So the oldest packfile in the repository may have
been written either by a full repack, or by a geometric repack that
decided to rewrite the oldest packfile.
Now why do we care about this? The problem is that once we move towards
geometric repacking, we still need to make sure that we perform a full
repack every once in a while. This full repack will then be responsible
for moving unreachable objects into a separate cruft pack so that we can
still properly prune objects from repositories. But as we will have no
direct control over the number of packfiles in a repository anymore, the
current heuristic that uses the number of existing packfiles won't work
anymore to decide whether we need to perform a full repack or not.
Instead, we'll be moving to a time-based heuristic where we decide to do
a full repack every once in a while, e.g. daily. This will ensure that
we perform geometric repacks most of the time but still move unreachable
objects out into cruft packs on a schedule that is easy to understand
and explain.
As mentioned though, we are not in a position to derive the last time
such a full repack has happened. To fix this, introduce a new timestamp
file ".gitaly-full-repack-timestamp" that we write into the repository
every time we are about to perform a full repack.
Changelog: added
|
|
gitaly: Fix transaction manager test flake
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5469
Merged-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Reviewed-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Co-authored-by: Justin Tobler <jtobler@gitlab.com>
|
|
One of the ways to calculate the repository size is to simply invoke
the du(1) executable to calculate the size for us. This does not make a
ton of sense though:
- It adds a dependency on an external program.
- This external program may behave differently depending on the
platform.
- We need to spawn an executable, which can be expensive at times.
- Go already provides us with an easy way to implement the
functionality ourselves via `filepath.WalkDir()`.
Refactor the code to implement the functionality ourselves. It is more
efficient as we don't need to spawn an external process, easier to
understand, has less platform-specific behaviour, and has fewer
external dependencies. Furthermore, it is also faster, especially in
cases where the repository is small and the overhead of spawning an
external process can be felt.
The following benchmarks demonstrate this:
```
BenchmarkRepositorySize/disk-usage_with_du
BenchmarkRepositorySize/disk-usage_with_du/empty_repository
BenchmarkRepositorySize/disk-usage_with_du/empty_repository-12 552 3110944 ns/op
BenchmarkRepositorySize/disk-usage_with_du/benchmark_repository
BenchmarkRepositorySize/disk-usage_with_du/benchmark_repository-12 1 1389975933 ns/op
BenchmarkRepositorySize/disk-usage_with_walk
BenchmarkRepositorySize/disk-usage_with_walk/empty_repository
BenchmarkRepositorySize/disk-usage_with_walk/empty_repository-12 4200 302115 ns/op
BenchmarkRepositorySize/disk-usage_with_walk/benchmark_repository
BenchmarkRepositorySize/disk-usage_with_walk/benchmark_repository-12 1 1393811464 ns/op
```
As noted, the empty-repository case in particular is a lot faster
with the new implementation, but the returns start to diminish once
repositories get bigger. I also very much expect that at some point, the
du(1)-based implementation would again take the lead. But as we strive
to keep repositories well-maintained, they shouldn't have too many files
in the general case anyway. And even if the du(1)-based implementation
was faster at some point, this does not yet take into account global
contention created around the command spawning token that we need to
acquire in order to spawn du(1).
Note that due to different rounding behaviour between du(1) and our own
implementation, we have to increase the size of test data we are writing
in one of our tests.
Changelog: changed
|
|
structerr: Convert `New()` to return `Unknown` error codes
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5521
Merged-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
testhelper: Fix certificate being generated for the wrong entity
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5519
Merged-by: karthik nayak <knayak@gitlab.com>
Approved-by: Pavlo Strokov <pstrokov@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
We're about to change the implementation of the disk-usage-based
RepositorySize calculations from spawning du(1) to in-process. Implement
a benchmark so that we can measure the impact of this.
|
|
gitaly: Do not start gitaly-ruby
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5497
Merged-by: Justin Tobler <jtobler@gitlab.com>
Approved-by: Will Chandler <wchandler@gitlab.com>
Approved-by: Justin Tobler <jtobler@gitlab.com>
Co-authored-by: Toon Claes <toon@gitlab.com>
|
|
Using a hardcoded package path would mean that this has to be manually
updated when the gitaly major version is upgraded.
|
|
These comments will otherwise become out-of-date every time we update
the major version.
|
|
|
|
We use hardcoded package paths in various places, and these paths
would otherwise have to be manually updated when the gitaly major
version is updated. So instead we use reflection to determine the
current package path.
|
|
Previously there was a lot of file iteration logic mixed with error
handling. Here we extract a helper so that the logic is clearly
separated.
|
|
git/stats: Fix flaky timings in HTTP push tests
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5524
Merged-by: Justin Tobler <jtobler@gitlab.com>
Approved-by: Justin Tobler <jtobler@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
featureflag: Default `tx_restore_custom_hooks` on
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5522
Merged-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Approved-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Co-authored-by: Justin Tobler <jtobler@gitlab.com>
|
|
We're manually plugging together different parts of the repository
service server even though we have a helper function that performs all
of the necessary steps for us already. Convert the tests to use it
instead to simplify the setup.
|
|
Remove the `!gitaly_test_sha256` build tag so that we can start testing
the CreateFork RPC with the SHA256 object format.
|