Age | Commit message (Collapse) | Author |
|
|
|
|
|
|
|
This simplifies streamcache by getting rid of one of its cleanup
goroutines. After this change there is one goroutine left, that does
file walks and deletes cache files. This is so slow that it makes
sense to do it in a goroutine. Kicking keys out of the in memory index
now happens during Fetch calls.
Changelog: other
|
|
Enable trace2 pack-objects metrics for git-pack-objects
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5591
Merged-by: karthik nayak <knayak@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Co-authored-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
|
|
In some prior commits, Gitaly allows hooking into pack-objects command
to extract relative metrics. The current implementation enables that
hook for all Git commands, not just git-pack-objects. While it does not
disrupt the activities of other commands, enabling it there is a waste
of resources. It's very fortunate for us that it's behind a feature
flag.
This commit fixes this situation by filtering the sub command that can
enable this hook.
|
|
docs: Remove gitaly-ruby from docs
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5586
Merged-by: Evan Read <eread@gitlab.com>
Approved-by: Evan Read <eread@gitlab.com>
Co-authored-by: Toon Claes <toon@gitlab.com>
|
|
'renovate-tools/golangci-lint/github.com-golangci-golangci-lint-1.x' into 'master'
tools/golangci-lint: Update module github.com/golangci/golangci-lint to v1.52.0
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5566
Merged-by: Will Chandler <wchandler@gitlab.com>
Approved-by: Justin Tobler <jtobler@gitlab.com>
Co-authored-by: GitLab Renovate Bot <gitlab-bot@gitlab.com>
|
|
into 'master'
go: Update module github.com/grpc-ecosystem/go-grpc-middleware to v1.4.0
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5548
Merged-by: Will Chandler <wchandler@gitlab.com>
Approved-by: Christian Couder <chriscool@tuxfamily.org>
Co-authored-by: James Fargher <jfargher@gitlab.com>
Co-authored-by: GitLab Renovate Bot <gitlab-bot@gitlab.com>
|
|
storage: Introduce `GetRepoPathConfig`
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5573
Merged-by: Will Chandler <wchandler@gitlab.com>
Approved-by: Will Chandler <wchandler@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Justin Tobler <jtobler@gitlab.com>
Co-authored-by: Justin Tobler <jtobler@gitlab.com>
|
|
repository: Implement support for SHA256 in the GetArchive RPC
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5580
Merged-by: Will Chandler <wchandler@gitlab.com>
Approved-by: Will Chandler <wchandler@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Reviewed-by: Will Chandler <wchandler@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
go: Update module google.golang.org/protobuf to v1.30.0
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5557
Merged-by: Will Chandler <wchandler@gitlab.com>
Approved-by: Will Chandler <wchandler@gitlab.com>
Approved-by: Toon Claes <toon@gitlab.com>
Co-authored-by: GitLab Renovate Bot <gitlab-bot@gitlab.com>
|
|
Trace2: Expose internal pack objects metrics with trace2
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5442
Merged-by: John Cai <jcai@gitlab.com>
Approved-by: John Cai <jcai@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Co-authored-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
|
|
Currently `storage.Locator` specifies both `GetPath` and `GetRepoPath`
for fetching the path of a repository. The only difference between these
two is that `GetRepoPath` also validates that the path to specified
repository is a valid Git directory.
The semantics do not clearly indicate the difference between these
mostly similar functions. Instead, this functionality can be
consolidated and a configuration option added to toggle validation.
This change allows `GetRepoPath` to accept `GetRepoPathOption` arguments
and introduces the `WithRepositoryVerificationSkipped` option. This
option will skip the Git directory validation of the repository path.
|
|
There is no sidecar no more, so no need to have docs about it.
|
|
git/housekeeping: Always enable writing cruft packs
Closes #4829
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5568
Merged-by: Sami Hiltunen <shiltunen@gitlab.com>
Approved-by: Sami Hiltunen <shiltunen@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
go: Update github.com/ProtonMail/go-crypto digest to 9a39f25
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5563
Merged-by: Sami Hiltunen <shiltunen@gitlab.com>
Approved-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Approved-by: Sami Hiltunen <shiltunen@gitlab.com>
Co-authored-by: GitLab Renovate Bot <gitlab-bot@gitlab.com>
|
|
featureflag: Remove `tx_restore_custom_hooks`
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5562
Merged-by: Sami Hiltunen <shiltunen@gitlab.com>
Approved-by: Sami Hiltunen <shiltunen@gitlab.com>
Co-authored-by: Justin Tobler <jtobler@gitlab.com>
|
|
In e4a9ded8e (housekeeping: Wire up logic to conditionally generate
cruft packs, 2023-03-03), we have wired up the logic to generate cruft
packs during repository housekeeping. Cruft packs are a rather recent
addition to Git: instead of keeping unreachable objects as loose objects
until they expire and get pruned, they are instead tracked in a separate
packfile. This both reduces the overhead of having to store the loose
objects as Git can make use of improved compression, and it speeds up
access times for objects as Git does not have to search through so many
loose objects anymore.
We have since rolled out this new feature to production successfully. So
let's remove the feature flag and thus unconditionally enable writing
cruft packs.
Changelog: changed
|
|
'master'
repository: Always use walk-based implementation for RepositorySize
Closes #4976
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5569
Merged-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Approved-by: Justin Tobler <jtobler@gitlab.com>
|
|
git/housekeeping: Implement support for geometric repacking
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5559
Merged-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Will Chandler <wchandler@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
In ac600d02a (repository: Refactor RepositorySize to not rely on du(1),
2023-03-16), we have refactored the RepositorySize RPC to stop shelling
out to du(1) in favor of a new `filepath.Walk()`-based implementation.
We have since rolled out this change to production and fixed an initial
race condition where the new implementation failed when directories are
deleted concurrently.
Since that bug has been fixed things are going smooth with the new
implementation. As expected, the median and P50 dropped given that the
new implementation avoids the overhead of having to spawn a separate
process. On the other hand, P99 increased slightly as expected as the
implementation of du(1) is more efficient. In practice, this tradeoff is
acceptable and even beneficial as the new code both avoids resource
contention around the command spawn queue and is simpler to extend.
Remove the feature flag and unconditionally enable the new
implementation.
Changelog: changed
|
|
golangci-lint: Enable additional linters
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5576
Merged-by: James Fargher <proglottis@gmail.com>
Approved-by: James Fargher <proglottis@gmail.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
Currently the happy path test case for `GetRepoPath` is run apart from
the other test cases. There does not seem to be a good reason for this
so this change refactors `TestConfigLocator_GetRepoPath` to include all
of its cases in a single table.
|
|
Start using the `errname` linter, which enforces idiomatic names for
error types:
- Error types end in `Error`, e.g. `DecodeError`.
- Error variables start with `Err`, e.g. `ErrUnsupportedAlgorithm`.
Let's fully adopt this idiomatic convention by enabling the linter.
Fix code that currently violates these rules.
|
|
Since the introduction of the `t.Setenv()` helpers in Go 1.17 it has
become discourages to use `os.Setenv()` in tests directly. This is
because `t.Setenv()` has additional safeguards that cause the test to
fail if called in a parallel test as setting environment variables is
not thread-safe.
We enforce this rule via a set of custom forbidigo rules. But since
adding those rules, a new `tenv` linter has been added to golangci-lint
that supports exactly that usecase.
Use the new `tenv` linter and drop our own custom rules.
|
|
Ideally, objects in repositories would only ever be contained in a
single packfile so that they can be efficiently searched and served
during normal operations. Unfortunately, writing packfiles becomes more
and more expensive the larger the repository becomes. As a consequence,
packing objects needs to be based on a compromise between keeping the
number of overall packfiles low and keeping the overhead of the repack
itself low.
The current approach implemented by Gitaly is to accept a certain number
of packfiles that scales with the repository size: the larger the repo,
the more packfiles we accept. But once the threshold has been reached,
we perform a full repack that moves all objects into a single packfile.
This heuristic is better than what we had a year ago, but it is still
suboptimal in the context of large monorepositories where the number of
packfiles will grow very fast. We thus still end up repacking such
repositories quite frequently.
To help with this exact scenario, git-repack(1) has gained support for
geometric repacking in Git v2.32.0. With this mode, git-repack(1) will
make sure that packfiles form a geometric sequence: the next-larger
packfile must have at least factor `r` times as many objects as the
previous packfile. If this condition is violated, Git will pick the
smallest set of packfiles that need to be merged in order to restore the
geometric sequence. Like this, we typically only have to merge a much
smaller set of objects into a single packfile compared to the previous
mode where we had to perform a full repack regularly.
Gitaly will soon start to make use of geometric repacking as well. As a
preparatory step, implement the infrastructure in the form of a new
repacking strategy and a bunch of tests to nail down that the new
command really behaves as we expect it to behave.
The new infrastructure is not yet wired up and will thus not get hit via
production traffic.
Changelog: added
|
|
We support different strategies to repack a repository: incremental,
full and full with cruft packs. These are mutually exclusive and work
differently under the hood. We're also about to add a fourth mode with
geometric repacking that makes the infrastructure even more complex.
Let's refactor the code to make explicit that these are different
strategies and to make it easier to add new ones or remove old ones.
|
|
The `GetArchive()` RPC call works with the SHA256 object format now
that `catfile.TreeEntries()` has become object format agnostic. Remove
the build tag to start testing with SHA256.
|
|
We use a hardcoded object hash size when parsing tree entries to decode
the tree entry's object ID. This of course fails to work in repositories
with the SHA256 object format.
Fix the code to instead use the same object hash as is used by the
parsed tree object.
|
|
While both the catfile ObjectContentReader and ObjectInfoReader know
about the object format that is currently in use, that information is
not exposed to the user of those interfaces.
Expose the object format via the `ObjectInfo` structure so that callers
can use this information. Note that we explicitly don't embed the full
`git.ObjectHash` structure, but only the name of the object format. This
is done to keep the size of the structure as small as possible as we
often process thousands of objects via the gitpipe package.
|
|
Deseed remaining tests for GetArchive so that we can start testing with
SHA256.
While at it, stop skipping one of the tests when running with Praefect.
The reason for skipping is that we use an intercepted command factory,
which in turns breaks creating repositories via the CreateRepository
RPC. But that's not true, as we only intercept calls to git-archive(1).
|
|
Merge tests for behaviour of GetArchive with LFS pointers into the
general tests for GetArchive. There is no reason to have them in a
separate test.
|
|
Merge validation tests for the GetArchive RPC into the general tests for
GetArchive. There is no reason to have them in a separate test.
|
|
We're using seed repositories in the GetArchive tests. While discouraged
for SHA256 compatibility, it also means that it's hard to use exact
matches for the contents of the generated archive as the list of files
is comparatively large.
Refactor the test to generate test data at runtime. This gives us full
control over the actual files and thus allows us to convert the tests to
do exact matches for expected archive contents.
|
|
In our GetArchive tests we're executing tar(1) and unzip(1) in order to
verify contents of the generated archive. This is inflexible as we don't
want to parse their outputs and creates a dependency on system-provided
binaries.
Refactor the `compressedFileContents()` helper function to instead use
Go's own packages to parse the archives. This allows us to easily have
it return a map of paths and their respective contents, which we'll use
in a subsequent commit to make the tests stricter.
|
|
The testcases for the GetArchive RPC replicate the structure of the
request itself only to fill in the request with that self-implemented
structure. This is quite roundabout, duplicates logic and is not as
flexible as if we just had the request in the testcase itself.
Refactor the code to expect a GetArchiveRequest in the testcases instead
of replicating its structure.
|
|
Refactor the GetArchive tests to be better parallelized. While at it,
move the repository format to the outer loop. This will be used by
subsequent refactorings so that we can easily move the complete request
definition into the testcases.
|
|
go: Update golang.org/x/exp digest to 10a5072
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5536
Merged-by: karthik nayak <knayak@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Approved-by: Toon Claes <toon@gitlab.com>
Co-authored-by: GitLab Renovate Bot <gitlab-bot@gitlab.com>
|
|
Provide means to open a transaction through TransactionManager
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5503
Merged-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Approved-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Sami Hiltunen <shiltunen@gitlab.com>
Reviewed-by: karthik nayak <knayak@gitlab.com>
Co-authored-by: Sami Hiltunen <shiltunen@gitlab.com>
|
|
repository: Define and implement new RepositoryInfo RPC
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5556
Merged-by: karthik nayak <knayak@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
Bump protoc-gen-go to v1.30.0
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5575
Merged-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
|
|
go: Fix incompatibility of gopsutil dependency with macOS
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5570
Merged-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Will Chandler <wchandler@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Lukas 'Eipi' Eipert <leipert@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
|
|
This commit integrates Trace2 tracing expoter from some prior commits to
Git comand factory. This hook is controlled by feature flag
`gitaly_export_trace2_pack_objects_metrics`. After this flag is on, this
hook is unlikely to affect the overall performance because the rate of
this command is not high.
|
|
This commit implement trace2 hook that exports pack-object metrics. This
hook uses the existing trace2 infrastructure. It hooks into the
trace2 tree for the following events:
- data:pack-objects:write_pack_file/wrote
- data:pack-objects:loosen_unused_packed_objects/loosened
- data:pack-objects:stdin_packs_found
- data:pack-objects:stdin_packs_hints
- pack-objects:enumerate-objects
- pack-objects:prepare-pack
- pack-objects:write-pack-file
When run into such traces, the hook increases the corresponding
Prometheus' histograms. In addition, it also dumps equivalent fields to
command stats. Eventually, Gitaly outputs logs having associated fields.
|
|
Dangerfile: Add group labels
Closes #5026
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5574
Merged-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Approved-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Co-authored-by: John Cai <jcai@gitlab.com>
|
|
We have added Protobuf definitions for the new RepositoryInfo RPC in the
preceding commit. This commit provides the implementation.
|
|
Over the last few releases, we have tried to adapt the RepositorySize
RPC to encode the policy of what exactly repository size is. This has
proven to not be a great design though: the definition of repository
sizes is constantly changing and furthermore depends on the actual
usecase at hand.
We have thus decided to change our approach: instead of providing a
one-size-fits-all RPC in the form of RepositorySize, we instead provide
an RPC call that provides more detailed information about a repository.
Like this, callers can decide by themselves what the repository size
should actually include.
This commit introduces the definitions for the new RepositoryInfo RPC
call. Right now, it only includes the most important information:
- The total repository size.
- The number of loose references and size of the packed references
file.
- The size of objects split up into different categories: recent,
stale and kept.
This is sufficient in order to implement all the usecases in the context
of repository size calculations that we know of right now, but may get
extended in the future to include more information. Eventually, it can
also be used to expose a dashboard that provides various bits of info to
the administrator in order to increase visibility of the actual on-disk
state.
Changelog: added
|
|
The documentation we have in place for both the count and the size of
packfiles mentions that this counter also includes stale packfiles. This
behaviour is also the intended one so that callers can easily tell how
big all packfiles together are even when additional categories of them
are added over time. Furthermore, this also matches the way we count
loose objects.
But even though it's quite clear that this is the intended behaviour it
is not the actual behaviour: neither cruft packs nor kept packs are
counted into the overall number of packfiles.
Fix this inconsistency.
Changelog: fixed
|