Age | Commit message | Author |
|
This commit changes the major version in the package name from v14 to
v15
Updating go.mod & go.sum with new module name v15
Update Makefile to bump major version to v15
Update the gitaly package name in the Makefile. Also update
gitaly-git2go-v14 -> gitaly-git2go-v15. However, we need to keep
gitaly-git2go-v14 for one more release to allow zero-downtime upgrades.
It is pulled directly from a SHA that is still v14.
Update package name from v14->v15 for auth, client, cmd, internal packages
This commit changes the package name from v14 to v15 in the Go and proto
files of the internal, auth, client, and cmd packages.
proto: Update major package number in package name
tools: Change major version number in package name from v14 to v15
gitaly-git2go: Change the package name from v14 to v15
update module updater for v15
Update the documentation for the module updater to reflect v15
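The import rewrite that such a major-version bump implies can be sketched as follows. This is only an illustration, not the project's module updater: the helper name `bumpImportPath` is hypothetical, and the sketch assumes the module path `gitlab.com/gitlab-org/gitaly/v14`.

```go
package main

import (
	"fmt"
	"strings"
)

// Module paths before and after the bump (assumed for illustration).
const (
	oldModule = "gitlab.com/gitlab-org/gitaly/v14"
	newModule = "gitlab.com/gitlab-org/gitaly/v15"
)

// bumpImportPath is a hypothetical helper. It rewrites an import path that
// belongs to the old module and leaves every other path untouched.
func bumpImportPath(path string) string {
	if path == oldModule || strings.HasPrefix(path, oldModule+"/") {
		return newModule + strings.TrimPrefix(path, oldModule)
	}
	return path
}

func main() {
	fmt.Println(bumpImportPath("gitlab.com/gitlab-org/gitaly/v14/internal/git"))
	fmt.Println(bumpImportPath("google.golang.org/grpc"))
}
```

Matching on `oldModule + "/"` rather than on the bare prefix keeps unrelated paths that merely start with the same characters from being rewritten.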
|
|
Enforce that message fields must have a comment and add a placeholder
for all instances where such a comment is missing.
|
|
Enforce that message definitions must have a comment and add a
placeholder for all instances where such a comment is missing.
|
|
Enforce that RPC definitions must have a comment and add a placeholder
for all instances where such a comment is missing.
|
|
Enforce that services must have a comment and add a placeholder for all
instances where such a comment is missing.
|
|
|
|
One reason why the relatively new `ListBlobs()` RPC cannot yet replace
`ListNewBlobs()` is that the latter also returns paths of found blobs
while the former doesn't.
Add a new `with_paths` field to the `ListBlobsRequest` and return paths
in case they were requested by the caller.
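As a rough sketch of what returning paths on request could look like, assume the blobs come from output shaped like that of `git rev-list --objects`, where each line is an object ID optionally followed by a space and a path. The `object` type and `parseObjectLine` helper are hypothetical and not taken from the actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// object is a hypothetical result type: an object ID plus an optional path.
type object struct {
	OID  string
	Path string
}

// parseObjectLine splits a line of the assumed form "<oid>[ <path>]". The
// path is only kept when the caller asked for it via withPaths.
func parseObjectLine(line string, withPaths bool) object {
	oid, path := line, ""
	if i := strings.IndexByte(line, ' '); i >= 0 {
		oid, path = line[:i], line[i+1:]
	}
	if !withPaths {
		path = ""
	}
	return object{OID: oid, Path: path}
}

func main() {
	line := "8ab686eafeb1f44702738c8b0f24f2567c36da6d docs/README.md"
	fmt.Printf("%+v\n", parseObjectLine(line, true))
	fmt.Printf("%+v\n", parseObjectLine(line, false))
}
```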
Changelog: added
|
|
Add a new ListAllBlobs RPC which lists all blobs of the repository, no
matter whether they're reachable via any of the references or not. The
design is the same as for ListAllLFSPointers.
|
|
We already have multiple RPCs which request one or more blobs via a
revision:
- `GetBlob()` allows finding a single blob via an object ID.
- `GetBlobs()` allows finding blobs via a set of revisions with
  paths. The path is mandatory here, so it's not possible to e.g.
  list all blobs of a given non-treeish object.
- `ListNewBlobs()` is a specialized RPC which lists all blobs
  introduced with a specific revision which were not yet part of the
  repository.
None of these RPCs allow the caller to list all blobs reachable from a
set of revisions. This lack of functionality is hurting us in the Rails access
checks though: we cannot create a batched check determining all new
blobs of all changes at once, but instead need to call `ListNewBlobs()`
for each change separately. Given that it uses `--not --all`, this is
excessively expensive for some repository shapes with lots of
references.
In order to not have this access check scale with `O(refs * changes)`,
this commit introduces a fourth RPC `ListBlobs()`. This RPC takes a set
of revisions and returns all blobs which are reachable via a graph walk
starting from those revisions. In addition to normal revisions, this RPC
also allows the pseudo-revisions "--all" and "--not".
This new RPC deprecates `ListNewBlobs()` given that it's a clear
superset of its provided functionality.
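To illustrate the batched check, here is a minimal sketch of how a set of revisions, including the two pseudo-revisions, might be turned into arguments for a single graph walk. It assumes the walk is done with `git rev-list --objects`; the helper `revListArgs` and its validation rules are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// revListArgs is a hypothetical helper that builds the arguments for one
// `git rev-list --objects` invocation covering all given revisions. Only
// the pseudo-revisions "--all" and "--not" may start with a dash, so that
// callers cannot smuggle in arbitrary options.
func revListArgs(revisions []string) ([]string, error) {
	if len(revisions) == 0 {
		return nil, errors.New("missing revisions")
	}
	args := []string{"rev-list", "--objects"}
	for _, rev := range revisions {
		if strings.HasPrefix(rev, "-") && rev != "--all" && rev != "--not" {
			return nil, fmt.Errorf("invalid revision: %q", rev)
		}
		args = append(args, rev)
	}
	return args, nil
}

func main() {
	// All new tips of a push in one walk, excluding everything that is
	// already reachable from existing references.
	args, err := revListArgs([]string{"refs/heads/a", "refs/heads/b", "--not", "--all"})
	if err != nil {
		panic(err)
	}
	fmt.Println(args)
}
```

With this shape, the expensive `--not --all` part is evaluated once per batch instead of once per change.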
Changelog: added
|
|
The new "v14" version of the Gitaly module is named to match
the next GitLab release. Module versioning is needed so that
other projects can pull in Gitaly as a dependency. This
change updates all imports to include the v14 version. The go.mod
file was updated by running `go mod tidy`, and the resulting
changes in dependency licenses are reflected in the NOTICE
file.
Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3177
|
|
With the introduction of ListLFSPointers, it was thought that it could
also replace usage of GetLFSPointers. But as it turns out, this is not
easily doable: callers rely on the fact that no graph walk happens, so
that passing for example the object ID of a tree containing LFS pointers
yields no result. This cannot be implemented via git-rev-list(1),
given that its "--no-walk" flag just keeps us from walking down commit
parents, but it will still traverse down the root tree of each commit.
So instead of awkwardly trying to somehow retrofit it into
ListLFSPointers, let's just un-deprecate GetLFSPointers.
|
|
The GetAllLFSPointers RPC has been deprecated and all of its callers
converted to ListLFSPointers in v13.11. Given that no users remain, we
can now remove this RPC altogether.
Changelog: removed
|
|
The GetNewLFSPointers RPC has been deprecated and all of its callers
converted to ListLFSPointers in v13.11. Given that no users remain, we
can now remove this RPC altogether.
Changelog: removed
|
|
In order to verify whether a push is allowed or not, we do a call to
Rails' `/allowed` endpoint. This endpoint does multiple checks: next to
determining whether the reference updates are allowed in the first
place, there are also several checks which inspect all new objects which
are part of the push.
One of these checks is the check for LFS pointers. For each push, we get
a call to `GetNewLFSPointers` which computes the set of new objects and
then extracts all new LFS pointers from this set. If any of the new LFS
pointers does not have a corresponding LFS object in the repository,
then we refuse the push.
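For illustration, recognizing an LFS pointer among candidate blobs can be sketched as below. This is not the check Gitaly performs: it assumes the pointer layout of the Git LFS v1 format (a `version` line first, an `oid sha256:` line, pointer files smaller than 1024 bytes), and the helper name is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// Assumed properties of Git LFS v1 pointer files.
var (
	lfsVersionLine = []byte("version https://git-lfs.github.com/spec/v1\n")
	lfsOIDKey      = []byte("\noid sha256:")
)

const maxPointerSize = 1024

// looksLikeLFSPointer is a hypothetical, deliberately cheap filter. It only
// checks the size bound, the leading version line, and the presence of an
// oid line. It does not validate the pointer in full.
func looksLikeLFSPointer(blob []byte) bool {
	return len(blob) < maxPointerSize &&
		bytes.HasPrefix(blob, lfsVersionLine) &&
		bytes.Contains(blob, lfsOIDKey)
}

func main() {
	pointer := []byte("version https://git-lfs.github.com/spec/v1\n" +
		"oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393\n" +
		"size 12345\n")
	fmt.Println(looksLikeLFSPointer(pointer))
	fmt.Println(looksLikeLFSPointer([]byte("just a regular file\n")))
}
```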
Computation of new objects can be very expensive though, depending on
the repository's size: we need to do a complete graph walk to correctly
determine preexisting objects and new objects. For big repositories with
lots of references and commits, this can take several seconds and in the
most extreme cases lead to context cancellations as the walks exceed the
30 seconds allowed for those checks. The user cannot do anything about
this, except restricting repository size (which we definitely don't want
to recommend) or disabling LFS pointer checks altogether (potentially
compromising repository consistency).
There is one realization to be had though: when doing pushes into git,
git will first accept all objects into a quarantine environment. As
such, there is a single place which contains all new objects which have
been part of the push. So if we were able to just single out pushed
objects and check these instead of doing a graph walk, then we'd start
to scale with push size, not with repository size.
There is an easy way to do this via `git cat-file --batch-all-objects`,
which prints out all of the ODB's objects no matter whether reachable or
not. Given that git spawns processes with the main object directory set
to the quarantine environment and the normal object directory part of
the alternate object directories, the only thing we need to do to
single out only pushed objects is to unset the alternate object
directories: `env --unset=GIT_ALTERNATE_OBJECT_DIRECTORIES git
cat-file --batch-all-objects`.
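A minimal sketch of this idea in Go: drop the alternates variable from the environment before spawning `git cat-file`. The helper names are hypothetical. The variable name used here is the one Git documents, `GIT_ALTERNATE_OBJECT_DIRECTORIES`, and `--batch-check` is added on the assumption that `--batch-all-objects` needs to be combined with a batch mode.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// alternatesVar is the environment variable listing alternate object
// directories, as documented by Git.
const alternatesVar = "GIT_ALTERNATE_OBJECT_DIRECTORIES"

// withoutAlternates returns a copy of env with the alternates variable
// removed, so that only the main object directory is visible to git.
func withoutAlternates(env []string) []string {
	filtered := make([]string, 0, len(env))
	for _, kv := range env {
		if strings.HasPrefix(kv, alternatesVar+"=") {
			continue
		}
		filtered = append(filtered, kv)
	}
	return filtered
}

// quarantineObjectsCmd is a hypothetical helper that prepares, but does not
// run, a command listing every object of the main object directory.
func quarantineObjectsCmd(env []string) *exec.Cmd {
	cmd := exec.Command("git", "cat-file", "--batch-all-objects", "--batch-check")
	cmd.Env = withoutAlternates(env)
	return cmd
}

func main() {
	env := []string{
		"GIT_OBJECT_DIRECTORY=/repo/objects/incoming-123", // example path
		alternatesVar + "=/repo/objects",
	}
	cmd := quarantineObjectsCmd(env)
	fmt.Println(cmd.Args, cmd.Env)
}
```

When run inside a pre-receive hook, the remaining `GIT_OBJECT_DIRECTORY` would point at the quarantine directory, so the listing covers only the pushed objects.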
A quick benchmark with gitlab-org/gitlab.git shows that this is much
faster. The following tests have been done by pushing into the target
repository which had the LFS pointer checks as pre-receive hook. Output
has been formatted such that it becomes more readable.
    # 1000 commits with one change each
    $ git push origin master
    Benchmark #1: LFS pointers via rev-list
      Time (mean ± σ): 554.3 ms ± 20.6 ms [User: 527.5 ms, System: 27.0 ms]
      Range (min … max): 521.9 ms … 590.5 ms 10 runs
    Benchmark #2: LFS pointers via --batch--all-objects
      Time (mean ± σ): 3.8 ms ± 1.6 ms [User: 5.8 ms, System: 2.5 ms]
      Range (min … max): 2.4 ms … 23.0 ms 555 runs
    Summary
      'LFS pointers via --batch--all-objects' ran
        145.14 ± 59.30 times faster than 'LFS pointers via rev-list'

    # push 100 branches, where each has the same 1000 commits plus one that is different per branch
    $ git push origin $(seq -f 'branch-%g' 100)
    Benchmark #1: LFS pointers via rev-list
      Time (mean ± σ): 620.9 ms ± 7.0 ms [User: 584.8 ms, System: 36.0 ms]
      Range (min … max): 613.3 ms … 633.1 ms 10 runs
    Benchmark #2: LFS pointers via --batch--all-objects
      Time (mean ± σ): 4.4 ms ± 1.6 ms [User: 6.3 ms, System: 3.1 ms]
      Range (min … max): 0.2 ms … 26.5 ms 636 runs
    Summary
      'LFS pointers via --batch--all-objects' ran
        140.34 ± 49.49 times faster than 'LFS pointers via rev-list'

    # push of unrelated history to emulate lots of objects (pushing Gitaly into the GitLab repo)
    $ git push origin gitaly/master:refs/heads/gitaly
    Benchmark #1: LFS pointers via rev-list
      Time (mean ± σ): 625.5 ms ± 10.1 ms [User: 590.0 ms, System: 35.5 ms]
      Range (min … max): 615.3 ms … 651.2 ms 10 runs
    Benchmark #2: LFS pointers via --batch--all-objects
      Time (mean ± σ): 6.4 ms ± 1.5 ms [User: 7.9 ms, System: 3.8 ms]
      Range (min … max): 2.2 ms … 14.8 ms 467 runs
    Summary
      'LFS pointers via --batch--all-objects' ran
        98.11 ± 23.32 times faster than 'LFS pointers via rev-list'
So even for biggish pushes, `--batch-all-objects` is about 50x faster
than doing the graph walk.
In order to allow Rails to make use of this new way of doing things,
this commit implements a new interface `ListAllLFSPointers()`. In
contrast to the existing-but-deprecated `GetAllLFSPointers()` RPC, it
will return all LFS pointers regardless of their reachability. In order
to only make use of quarantined objects, the caller will then have to
modify the `Repository` message to unset alternate object directories.
|
|
We currently have three different functions to retrieve LFS pointers:
one which checks object IDs directly, one which retrieves all reachable
LFS pointers and a third one which does a limited graph walk. In short,
all three of these search for LFS pointers by iterating a set of revisions.
The current interface is thus unnecessarily complex given that it has
three limited ways to do the same thing instead of providing one general
implementation which allows both our own API to be more concise as well
as allowing users of the API to be more flexible.
This commit thus implements a replacement interface `ListLFSPointers`.
Instead of restricting users, we simply accept a set of revisions which
we traverse in order to find LFS pointers. Because `GetNewLFSPointers`
allowed users to restrict the graph walk via a set of negative refs, we
also accept the pseudo-revisions `--not` and `--all`.
With this new and simple interface, we can replace all existing use cases
and thus mark the three other RPCs as deprecated.
|
|
The LFS pointer RPCs are currently undocumented. This commit adds
documentation for all associated functions and their messages.
|
|
Extract lint-related declarations into a separate proto file. This
is required for the proto linter to work properly: previously
it used compiled files for verification, which fails in some
cases (https://gitlab.com/gitlab-org/gitaly/-/jobs/459024976).
lint.proto is extracted from shared.proto and contains the
lint-related declarations. A new `proto-lint` task is added to
compile the source code that is required by `protoc-gen-gitaly`.
`protoc-gen-gitaly` is fixed to use the proper proto source data.
All proto-related files are regenerated.
|
|
Instead of setting the OID in the RPC method, use an annotation on the
field (`target_repository` and `additional_repository`). Having only
these two annotations created a problem with messages that can be either
a target or an additional repository (for example `ObjectPool`). Those
are marked with the `repository` annotation, and `target_repository` and
`additional_repository` are used in the parent messages.
Signed-off-by: Mateusz Nowotyński <maxmati4@gmail.com>
Signed-off-by: jramsay <maxmati4@gmail.com>
|
|
|
|
|
|
Since we need a batch finder for tree entries, we can use the GetBlobs
RPC to return any object that is at a revision:path.
|
|
|
|
|