Age | Commit message (Collapse) | Author |
|
TransactionManager is currently looking into the quarantine directory
explicitly to see if there are new objects. As we're now walking the
new reference tips in the quarantine directory, we know whether we
need to pack objects or not based on whether we get object IDs back
from the walk. Let's thus simplify and remove the additional check.
As an optimization, we only launch `pack-objects` and `index-pack` if
we receive an object to pack. This avoids launching the commands for
transactions that don't need any new objects from the quarantine to
be committed.
There are minor differences in behavior as visible in tests:
1. All required objects are now verified to exist in the same place in
`verifyObjectDependencies`. Errors change in few test cases to match
as we no longer rely on `update-ref` to error out in order to verify
the objects the references have been pointed to exist.
2. Git has special cased the empty tree OID. Git can read the object from
the object database even if it hasn't been explicitly written there.
This is visible in one of the test cases that points a tag to an empty
tree. `update-ref` previously allowed the update to go through even if
we didn't write out the object as it's considered to always exist. As
we now pack all objects returned from the object walk, we create a pack
that contains the empty tree oid. In practice, this should not have any
impact. If necessary, we can change the behavior to match later. For now,
I've let it be since it's added behavior for little benefit.
|
|
WalkUnreachableObjects has been replaced by WalkObjects. Remove the
now unused method and the error types associated with it.
|
|
The TransactionManager is currently packing all new and unreachable
objects into the packfile that gets committed as part of the
transaction. This is expensive as it does a full graph walk. It also
was only a temporary solution until we have the required functionality
in Git to only pack the new objects, and figure out the existing
objects a transaction depends on existing in the repository.
Git v2.43 includes the necessary functionality in
`rev-list --missing=print`. Previously it failed on missing commits, now
it prints them out. This commit wires the changes into TransactionManager
by doing the following.
When a transaction is about to commit, the TransactionManager does an
object walk in the quarantine directory only. It starts the walk from
the new reference tips in the transaction and the included objects. The
included objects are objects that are not reachable from the references
but we want to commit them into the repository nonetheless. This walks
the new objects in the quarantine and prints all of them out for packing.
As we perform the walk with only the quarantine configured, any objects
that are not in the quarantine are printed as missing objects. These objects
are considered the transaction's dependencies and they are verified to
exist in the repository prior to committing the transaction. The dependency
information can also be later used to prevent concurrent pruning operations
from pruning unreachable objects that are needed by some transactions.
This approach is significantly faster than the earlier approach as it
only walks and packs the new objects in the quarantine, and thus scales by
the size of the change, not by the size of the history.
|
|
The TransactionManager needs to walk objects in the quarantine to
identify which objects to commit with the transaction in a packfile
and which objects that already exist in the repository the transaction
depends on.
This commit adds WalkObjects to facilitate that. It walks the object
graph from given start points and prints out all objects IDs that it
encounters. If applicable, the objects path is also printed out to aid
packing them later.
Objects that are encountered but do not exist are printed out with a '?'
to mark them missing. When the WalkObjects is invoked with just the
quarantine directory without the alternates, this leads to the object walk
identifying the objects from the quarantine that we need to pack with the
missing objects being the transaction's dependencies that we must exist in
the repository already.
'git rev-list --missing=print' has an issue where it errors out if a start
point of a walk does not exist. This is a problem as the heads we feed as
the starting point of the walk may be objects that already exist in the
repository. In that case, they should be treated as the transaction's
dependencies and simply be printed out as missing objects. There is a
workaround in place to do so by checking whether the object exists before
feeding it to the walk. If it doesn't, we simply print it out as missing
after the walk. This allows us to plug in the functionality in
TransactionManager and later just remove the workaround from here once the
issue has been fixed in Git.
|
|
Repository quarantining works by configuring a temporary object directory
via the `GIT_OBJECTS_DIRECTORY` environment variable, and configuring the
actual object directory as an alternate via `GIT_ALTERNATE_OBJECT_DIRECTORIES`.
This ensures that new objects are written into the quarantine but the old
objects are still readable through the alternate.
There are some use cases where we don't want to have the alternate configured.
For example, if we want to list only the new objects in the quarantine, we can
invoke Git without the alternate configured. This way Git only sees the new
objects.
We're soon going to walk the new objects a transaction introduced to pack
them in TransactionManager. To do so, this commit implements a helper method
to get a quarantined localrepo.Repo instance without the alternates configured.
|
|
TransactionManager has a test asserting that the pack file includes
unreachable objects. This was done previously as we didn't have a
good way to pack and hold on to all needed objects to ensure the packs
will always apply.
We've since introduced new functionality in Git that allows us to pack
the objects from the quarantine and figure out at the same time which
objects the quarantined objects depend on in the repository. We'll thus
prevent pruning the unreachable objects that a transaction depended upon
instead of packing them in the pack.
As we're about to change the behavior, remove the test that is about to
be obsoleted.
|
|
[ci skip]
|
|
[ci skip]
|
|
[ci skip]
|
|
|
|
|
|
Unify transaction manager's references assertion
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6592
Merged-by: James Liu <jliu@gitlab.com>
Approved-by: James Liu <jliu@gitlab.com>
Reviewed-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Reviewed-by: James Liu <jliu@gitlab.com>
Co-authored-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
|
|
In some prior commits, we added pack-refs housekeeping task support to
the transaction manager. We introduced a new assertion helper to verify
the content of packed-refs file. That helper reads the content and
compares with the expected packed-refs text. That approach makes the
test setup hard because the expected packed-refs must match the actual
content of packed-refs on disk, line by line.
This commit enhances that helper. It now parses the content of
packed-refs file. The expected value becomes a map of ref name to its
object ID.
|
|
[ci skip]
|
|
[ci skip]
|
|
[ci skip]
|
|
[ci skip]
|
|
[ci skip]
|
|
[ci skip]
|
|
[ci skip]
|
|
|
|
|
|
|
|
'master'
proto: Add structured error proto message definitions for FindCommits grpc
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6589
Merged-by: Eric Ju <eju@gitlab.com>
Approved-by: Justin Tobler <jtobler@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Will Chandler <wchandler@gitlab.com>
Reviewed-by: Sami Hiltunen <shiltunen@gitlab.com>
Reviewed-by: Justin Tobler <jtobler@gitlab.com>
Reviewed-by: Eric Ju <eju@gitlab.com>
|
|
|
|
smarthttp: Fix upload-pack using bundle-URI with missing backup
Closes #5740
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6591
Merged-by: Toon Claes <toon@gitlab.com>
Approved-by: James Liu <jliu@gitlab.com>
Reviewed-by: James Liu <jliu@gitlab.com>
|
|
|
|
featureflag: Remove `AtomicFetchRemote`
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6605
Merged-by: Justin Tobler <jtobler@gitlab.com>
Approved-by: James Liu <jliu@gitlab.com>
|
|
When `bundleuri.UploadPackGitConfig()` returns an empty slice, we see
the `PostUploadPackWithSidechannel` RPC fail with `FailedPrecondition`.
The git-upload-pack(1) process started here quits with the error:
fatal: invalid command 'bundle-uri'
Looking at the source code of Git, we see when git-upload-pack(1)
receives the 'bundle-uri' command, it checks if the that capability
should be advertised. To query for this, Git checks if the config
`uploadpack.advertiseBundleURIs` is set to `true`. When Gitaly could not
build a bundle URI to pass to the git-upload-pack(1), this config was
not set, and Git did not accept this command.
For the SmartHTTP protocol Gitaly uses separate processes to handle
'GET info/refs' and 'POST upload-pack', using option `--stateless-rpc`.
Due to this, the server advertised to the client it supports
'bundle-uri' in the 'GET info/refs' response, but in the consecutive
requests we might spawn git-upload-pack(1) without this config, and Git
will not accept the 'bundle-uri' command.
Inject the config `uploadpack.advertiseBundleURIs=true` into all calls
to git-upload-pack(1), even when the bundle-URI cannot be built.
Label: bug::availability
|
|
Add new proto definition for the structured errors of FindCommits.
|
|
Revert "Merge branch 'jliu-track-restored-repos-2' into 'master'"
See merge request https://gitlab.com/gitlab-org/security/gitaly/-/merge_requests/95
Merged-by: Ahmad Tolba <atolba@gitlab.com>
Approved-by: James Fargher <jfargher@gitlab.com>
Approved-by: Justin Tobler <jtobler@gitlab.com>
Co-authored-by: James Liu <jliu@gitlab.com>
|
|
The `AtomicFetchRemote` feature flag enables atomic fetches for a
repository. Since the feature flag is default enabled, remove the flag.
|
|
This reverts commit 7de65fabe6eaf803771cf0dd5d53dbaf1e628d56, reversing
changes made to 0ffc495be9bb5a8f4ff60764b545443d54210e56.
|
|
Revert "Merge branch 'fix-user-revert-return-structured-error' into 'master'"
See merge request https://gitlab.com/gitlab-org/security/gitaly/-/merge_requests/94
Merged-by: Jenny Kim <yjeankim@gitlab.com>
Approved-by: Will Chandler <wchandler@gitlab.com>
Approved-by: John Cai <jcai@gitlab.com>
Co-authored-by: Justin Tobler <jtobler@gitlab.com>
|
|
This reverts commit 7f019337fb51305215538a793cf80171111b6e29, reversing
changes made to 82dcc6a7a5692244932c4f6d4d92b3fa615aeb85.
|
|
commit: Fix bug in commit signature parsing
See merge request https://gitlab.com/gitlab-org/security/gitaly/-/merge_requests/92
Merged-by: Jenny Kim <yjeankim@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Co-authored-by: Karthik Nayak <knayak@gitlab.com>
|
|
|
|
'master'
Add pack-refs housekeeping task support to the transaction manager
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6551
Merged-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Sami Hiltunen <shiltunen@gitlab.com>
Reviewed-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
|
|
|
|
ruby: Update to 3.2.2
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6601
Merged-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Co-authored-by: Will Chandler <wchandler@gitlab.com>
|
|
|
|
backup: Only delete repositories missing from a restore
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6579
Merged-by: James Liu <jliu@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Reviewed-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Reviewed-by: karthik nayak <knayak@gitlab.com>
|
|
|
|
Report stat failures correctly when creating a repository
Closes #5751
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6597
Merged-by: James Liu <jliu@gitlab.com>
Approved-by: Toon Claes <toon@gitlab.com>
Approved-by: James Liu <jliu@gitlab.com>
Co-authored-by: Sami Hiltunen <shiltunen@gitlab.com>
|
|
Update to Ruby v3.2.2 to match GitLab.
|
|
|
|
Default enable CLONE_INTO_CGROUP support
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6598
Merged-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Will Chandler <wchandler@gitlab.com>
Reviewed-by: Will Chandler <wchandler@gitlab.com>
Co-authored-by: Sami Hiltunen <shiltunen@gitlab.com>
|
|
|
|
Adds a handler to Praefect to intercept calls to the WalkRepos RPC. The
handler provides an alternate implementation of listing repositories in
a storage, which queries the Praefect DB rather than walking the
filesystem on disk. This is required so when the RPC is invoked via
Praefect, the DB is used as the source of truth rather than a random
Gitaly node.
The only user-facing difference between this and the original
implementation is that the `modification_time` attribute of the response
message is left empty, as this cannot be determined via the DB.
|
|
Now that we've adjusted the restore mechanism to delete individual
repos as needed, this RPC is no longer required. See the following
issues for more context:
- https://gitlab.com/gitlab-org/gitaly/-/issues/5357
- https://gitlab.com/gitlab-org/gitaly/-/issues/5269
Changelog: deprecated
|