Age | Commit message (Collapse) | Author |
|
We have a concurrency limit for critical resource-hungry RPCs. When a
request exceeds the limit, it is rejected with ResourceExhausted status
code. This commit adds `grpc-retry-pushback-ms` header to rejection
responses.
This header is only effective if there is a retry policy configuration
for the RPC. The retry policy must declare MaxAttempts and
RetryableStatusCodes. This pushback duration has higher precedence than
retry settings in the retry policy.
When a client receives this header, it is forced to sleep. As this
feature is supported in gRPC library implementations, clients can't
refuse. It's recommended to add this header to protect most critical,
resource-hungry RPCs, such as UploadPack, PackObject.
Pushback duration is exponential. It is calculated from the previous
retry attempts fo the same request. The retry attempt is extrated from
`grpc-previous-rpc-attempts` request header.
|
|
protoregistry maintains a registry containing the list of RPC methods
and their reflection. protoregistry.methodInfo returns the full name of
a method. This commit adds one more function to get the service name
(including package name) and method name separately.
|
|
localrepo: Revert refactorings of tree entries helpers
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5763
Merged-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
|
|
This reverts 319f05939 (Merge branch 'jc/tree-improvements' into
'master', 2023-05-09), which has introduced a bunch of refactorings for
our tree entries helpers in the localrepo package. These refactorings
caused failures in Rails:
1) API::Submodules PUT /projects/:id/repository/submodule/:submodule when authenticated as a developer returns the commit
Failure/Error: expect(response).to have_gitlab_http_status(:ok)
expected the response to have status code :ok but it was 400. The response was: {"message":"13:submodule subcommand: write tree: writing tree: exit status 128."}
# ./spec/requests/api/submodules_spec.rb:72:in `block (4 levels) in <main>'
# ./spec/spec_helper.rb:423:in `block (3 levels) in <top (required)>'
# ./spec/support/sidekiq_middleware.rb:18:in `with_sidekiq_server_middleware'
# ./spec/spec_helper.rb:415:in `block (2 levels) in <top (required)>'
# ./spec/spec_helper.rb:411:in `block (3 levels) in <top (required)>'
# ./lib/gitlab/application_context.rb:61:in `with_raw_context'
# ./spec/spec_helper.rb:411:in `block (2 levels) in <top (required)>'
# ./spec/spec_helper.rb:242:in `block (2 levels) in <top (required)>'
# ./spec/support/system_exit_detected.rb:7:in `block (2 levels) in <main>'
# ./spec/support/fast_quarantine.rb:39:in `block (2 levels) in <main>'
# ./spec/support/database/prevent_cross_joins.rb:108:in `block (3 levels) in <main>'
# ./spec/support/database/prevent_cross_joins.rb:62:in `with_cross_joins_prevented'
# ./spec/support/database/prevent_cross_joins.rb:108:in `block (2 levels) in <main>'
2) API::Submodules PUT /projects/:id/repository/submodule/:submodule when authenticated as a developer when the submodule name is urlencoded returns the commit
Failure/Error: expect(response).to have_gitlab_http_status(:ok)
expected the response to have status code :ok but it was 400. The response was: {"message":"13:submodule subcommand: write tree: writing tree: exit status 128."}
# ./spec/requests/api/submodules_spec.rb:92:in `block (5 levels) in <main>'
# ./spec/spec_helper.rb:423:in `block (3 levels) in <top (required)>'
# ./spec/support/sidekiq_middleware.rb:18:in `with_sidekiq_server_middleware'
# ./spec/spec_helper.rb:415:in `block (2 levels) in <top (required)>'
# ./spec/spec_helper.rb:411:in `block (3 levels) in <top (required)>'
# ./lib/gitlab/application_context.rb:61:in `with_raw_context'
# ./spec/spec_helper.rb:411:in `block (2 levels) in <top (required)>'
# ./spec/spec_helper.rb:242:in `block (2 levels) in <top (required)>'
# ./spec/support/system_exit_detected.rb:7:in `block (2 levels) in <main>'
# ./spec/support/fast_quarantine.rb:39:in `block (2 levels) in <main>'
# ./spec/support/database/prevent_cross_joins.rb:108:in `block (3 levels) in <main>'
# ./spec/support/database/prevent_cross_joins.rb:62:in `with_cross_joins_prevented'
# ./spec/support/database/prevent_cross_joins.rb:108:in `block (2 levels) in <main>'
|
|
|
|
|
|
go: Update module golang.org/x/sys to v0.8.0
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5736
Merged-by: Will Chandler <wchandler@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Approved-by: Will Chandler <wchandler@gitlab.com>
Co-authored-by: GitLab Renovate Bot <gitlab-bot@gitlab.com>
|
|
|
|
into 'master'
Remove pack-objects limiting by user_id and repository
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5728
Merged-by: Will Chandler <wchandler@gitlab.com>
Approved-by: John Cai <jcai@gitlab.com>
Approved-by: Will Chandler <wchandler@gitlab.com>
Co-authored-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
|
|
|
|
go: Bump module version from v15 to v16
Closes #5085
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5756
Merged-by: karthik nayak <knayak@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
We're about to release Gitaly v16.0. As we've landed a bunch of
previously announced removals it's thus time to bump our Go module
version from v15 to v16.
|
|
|
|
[ci skip]
|
|
[ci skip]
|
|
Implement write-ahead logging for objects
Closes #4790
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5638
Merged-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Sami Hiltunen <shiltunen@gitlab.com>
Reviewed-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Co-authored-by: Sami Hiltunen <shiltunen@gitlab.com>
|
|
[ci skip]
|
|
|
|
Preparatory refactorings for linking to pool repos in CreateFork
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5718
Merged-by: Pavlo Strokov <pstrokov@gitlab.com>
Approved-by: Pavlo Strokov <pstrokov@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Pavlo Strokov <pstrokov@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
ruby: Remove last pieces of code
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5512
Merged-by: Toon Claes <toon@gitlab.com>
Approved-by: James Fargher <proglottis@gmail.com>
|
|
|
|
cli: Disable help commands to fix test races
Closes #5104
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5747
Merged-by: Toon Claes <toon@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Approved-by: Toon Claes <toon@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
|
|
Use google.golang.org/grpc/interop/grpc_testing
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5730
Merged-by: Sami Hiltunen <shiltunen@gitlab.com>
Approved-by: Sami Hiltunen <shiltunen@gitlab.com>
Co-authored-by: Mikhail Mazurskiy <mmazurskiy@gitlab.com>
|
|
|
|
housekeeping: Default-enable geometric repacking
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5745
Merged-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Approved-by: Christian Couder <chriscool@tuxfamily.org>
|
|
|
|
Dangerfile: Add group::gitaly::git and group::gitaly::cluster
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5639
Merged-by: John Cai <jcai@gitlab.com>
Approved-by: Andras Horvath <ahorvath@gitlab.com>
Reviewed-by: Andras Horvath <ahorvath@gitlab.com>
|
|
|
|
datastore: Refactor consistent storages cache
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5729
Merged-by: Toon Claes <toon@gitlab.com>
Approved-by: Toon Claes <toon@gitlab.com>
Reviewed-by: Toon Claes <toon@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
|
|
localrepo: Refactor and improvements to tree functions
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5642
Merged-by: Justin Tobler <jtobler@gitlab.com>
Approved-by: Justin Tobler <jtobler@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: karthik nayak <knayak@gitlab.com>
Co-authored-by: John Cai <jcai@gitlab.com>
|
|
When the lookup of cached consistent storages fails we need to lock the
cache in order to update it. The lock itself is supposed to only span
over a single key that consists of the relative path of the repository
we're about to look up so that concurrent callers can fetch consistent
storages for multiple repositories at once.
The lock is still too coarse though: while we indeed only create the
lock for the relative path, we lock on the level of the virtual
storages. In other words, if the same relative path exists on two
different virtual storages then the lock would noww span across both
virtual storages. This is needlessly broad and not really what the
code intends to do.
Fix this by refactoring the syncer to be per-virtual-storage.
Changelog: performance
|
|
The cache for consistent storages will only fetch consistent storages
when it got an invalidation notification. It thus got two modes: either
it can look up values from the cache, or it needs to fetch them anew.
This logic is split up across three different functions with some code
flow that is not exactly easy to follow as locking code is transgressing
across multiple calling contexts.
Refactor the code to instead be contained in a single function, only.
This makes it easier to follow and makes it more obvious how locking is
supposed to work.
|
|
The cache for consistent storages uses a `cacheValue` type to represent
cached information. This name is too generic to be useful as it only
tells the reader that it has something to do with caching, but it does
not give any hint at what kind of information is actually getting
cached.
Rename the type to `cachedReplicaInfo` to further clarify its intent.
|
|
In v2 of the upstream `golang-lru.Cache` type, the structure was
refactored to make proper usage of generics. And while we have adapted
the code to that change, the result still relies on using `interface{}`
instead of using properly typed values. This decreases type safety and
increases the mental burden when reading the code.
Refactor the cache to instead use a properly data type.
|
|
The repository store interface uses plain maps as sets. Refactor them to
instead use our new set type. This reduces the mental overhead it takes
to first understand what the map is actually representing.
|
|
The idiomatic way to implement sets in Go is to use maps that have an
empty structure as value. While this works alright, it can be hard to
read and doesn't provide a lot of niceties that sets can typically
provide.
With the recent addition of generics though the way generic data
structures can be implemented has been significantly improved. This also
changes what is idiomatic in favor of better encapsulated generic data
types that provide a higher-level abstraction level.
Introduce a new generic set type as a first internal implementation of
such data structures.
|
|
Since WriteTree() operates on trees, change it to Write() and make it a
function for TreeEntry. It will do a depth first walk of all the Entries,
writing new trees as necessary. If a TreeEntry's OID is empty, then
Write() will attempt to write a new tree with all of its Entries.
|
|
Currently, a TreeEntry is a flat data structure in the sense that it
represents a single entry but does not have any children. A git tree
object has children though.
Change this by adding an Entries member to the TreeEntry struct. This
way, we can represent a real tree data structure.
Also, add a GetTree() function that can get an entire tree. We can
consolidate ListEntries() into GetTree().
|
|
All writes must be write-ahead logged prior to being applied to
the repository. Objects are currently not being logged, which can
be a source of inconsistencies and also performance problems. This commit
implements logging support for objects that are needed by a transaction.
Objects can be included in a transaction by storing them in the tranasction's
quarantine directory. When the transaction is being committed, the
TransactionManager computes a pack file from the new reference tips set in
the transaction.
The pack file includes objects that are unreachable from the current
set of references, so it includes both new objects from the quarantine
directory and objects that are already present in the repository but are
unreachable. This ensures that the pack file contains all objects that
are needed to go from the current set of references to the new set of
references after the transaction. This is important as the unreachable
objects needed could be otherwise pruned, leading to the pack file no
longer applying to the repository.
As objects always flow through the log, this also means that only commited
objects end up in the repository. This is an important property for backups.
The repository will get into a consistent state by applying the write-ahead
log. If objects could end up in the repository without being logged, some
logged reference changes could fail once a repository is being recovered
from a snapshot + log as neither the snapshot nor the log would be guaranteed
to include the objects being newly referenced in a log entry.
For replicated setups later, the fact that only committed objects end up
in the repository means that all replicas are guaranteed to have received
the same objects at some point. If objects from failed writes could end up
in the repository, the leader could have a different set of objects from the
replicas due to these objects which are not replicated. As the pack files are
computed to include also unreachable objects, the pack file is guaranteed to
apply on another replica regardless if it has garbage collected the objects.
The pack files will apply even if the unreachable objects are pruned while
they are sitting in the log. However, the current approach is not enough if
there are concurrent transactions there is nothing holding on to old tips of
references the pack file was computed against. This will be fixed in a follow
up by maintaining internal references to the old tips of references until
all dependent pack files have been applied.
The pack file computation is computationally expensive but should be behaviorally
correct. This is an iteration for now that allows us to proceed. We'll later
need to update the approach to a less computationally heavy one, for example
by just packing the quarantined objects and holding internal references to the
objects the pack file depends on in the repository.
|
|
We have recently started to observe frequent failures in our CI due to
racy writes in our subcommand tests:
WARNING: DATA RACE
Write at 0x000005e5d978 by goroutine 1023:
github.com/urfave/cli/v2.(*Command).setup()
$GOPATH/pkg/mod/github.com/urfave/cli/v2@v2.25.1/command.go:138 +0xa8e
github.com/urfave/cli/v2.(*Command).Run()
$GOPATH/pkg/mod/github.com/urfave/cli/v2@v2.25.1/command.go:151 +0xa5
github.com/urfave/cli/v2.(*Command).Run()
$GOPATH/pkg/mod/github.com/urfave/cli/v2@v2.25.1/command.go:267 +0x126b
github.com/urfave/cli/v2.(*App).RunContext()
$GOPATH/pkg/mod/github.com/urfave/cli/v2@v2.25.1/app.go:332 +0x11b2
github.com/urfave/cli/v2.(*App).Run()
$GOPATH/pkg/mod/github.com/urfave/cli/v2@v2.25.1/app.go:309 +0x6e6
gitlab.com/gitlab-org/gitaly/v15/internal/cli/praefect.TestDatalossSubcommand.func1()
$GOPATH/cli/praefect/subcmd_dataloss_test.go:198 +0x531
testing.tRunner()
/usr/share/go/src/testing/testing.go:1446 +0x216
testing.(*T).Run.func1()
/usr/share/go/src/testing/testing.go:1493 +0x47
Previous write at 0x000005e5d978 by goroutine 1021:
github.com/urfave/cli/v2.(*Command).setup()
$GOPATH/pkg/mod/github.com/urfave/cli/v2@v2.25.1/command.go:138 +0xa8e
github.com/urfave/cli/v2.(*Command).Run()
$GOPATH/pkg/mod/github.com/urfave/cli/v2@v2.25.1/command.go:151 +0xa5
github.com/urfave/cli/v2.(*Command).Run()
$GOPATH/pkg/mod/github.com/urfave/cli/v2@v2.25.1/command.go:267 +0x126b
github.com/urfave/cli/v2.(*App).RunContext()
$GOPATH/pkg/mod/github.com/urfave/cli/v2@v2.25.1/app.go:332 +0x11b2
github.com/urfave/cli/v2.(*App).Run()
$GOPATH/pkg/mod/github.com/urfave/cli/v2@v2.25.1/app.go:309 +0x59d
gitlab.com/gitlab-org/gitaly/v15/internal/cli/praefect.TestAcceptDatalossSubcommand.func4()
$GOPATH/cli/praefect/subcmd_accept_dataloss_test.go:156 +0x3f4
testing.tRunner()
/usr/share/go/src/testing/testing.go:1446 +0x216
testing.(*T).Run.func1()
/usr/share/go/src/testing/testing.go:1493 +0x47
The root cause of this is that the `urfave/cli` library adds a "help"
subcommand to the app, commands and subcommands when none is provided by
the developer already. This "help" subcommand is a pointer stored in a
global variable, so all of those structures would point to the same
"help" command. But as the `urfave/cli` library actually modifies all
subcommands in order propagate some options down from the top-level
application to the commands and subcommands the result is that we have
racy writes as soon as we run apps concurrently that have a fallback
"help" subcommand.
The `urfave/cli` library provides two default ways to access help by
either calling `command subcommand help` or by calling `command
subcommand --help`. The latter is likely the more common way to invoke
the help menu, and this is _not_ the one that is causing the racy
access. So let's fix the race by disabling the "help" subcommand in
favor of the `--help` option.
|
|
When a `cli.App` has no reader, writer or error-writer set then the
library will fall back to the standard input, standard output and
standard error streams. As we don't set these streams consistently in
our tests the result is that any output will in fact be logged to those
streams, thus cluttering the output.
Fix this issue by always setting all three variables. While we should
ideally verify that their state matches our expectations, for now we
only use an empty standard input buffer and discard any output.
|
|
We have rolled out geometric repacking into production systems two weeks
ago now. Key observations:
- On a global scale, we see a decrease in time spent optimizing
repositories of 15-30%.
- On a smaller scale, we see a decrease in time spent optimizing
repositories in the gitlab-org and gitlab-com groups of 70-85%.
As geometric repacking is especially thought to be a win in large
monorepositories that have a lot of activity these observations match
the expectations that we have. Our gitlab-org and gitlab-com groups host
two of the most active repositories that are also quite large with the
gitlab and www-gitlab-com repositories, so it is expected that we see a
large win there.
On the global scale the win is expected to be less significant as we
tend to do full repacks more frequently in repositories with less
activity now. But given that we still see a win there this seems to not
be an issue.
Let's default-enable the feature flag so that we can remove it in the
next release.
Changelog: performance
|
|
|
|
'5027-flaky-test-testserver_postuploadpacksidechannel_gitconfigoptions-no_config_options' into 'master'
smarthttp: Skip flaky test in testServerPostUploadPackGitConfigOptions
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5740
Merged-by: karthik nayak <knayak@gitlab.com>
Approved-by: John Cai <jcai@gitlab.com>
|
|
|
|
operations: Drop unused param repoPath for merge
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5739
Merged-by: John Cai <jcai@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Approved-by: John Cai <jcai@gitlab.com>
Co-authored-by: Xing Xin <xingxin.xx@bytedance.com>
|
|
|
|
Cgroups: enable feature flag by default
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5738
Merged-by: John Cai <jcai@gitlab.com>
Approved-by: John Cai <jcai@gitlab.com>
Co-authored-by: Steve Azzopardi <sazzopardi@gitlab.com>
|