Age | Commit message (Collapse) | Author |
|
'master'"
This reverts merge request !6455
|
|
global: Replace use of ctxlogrus with injected logger (pt.3)
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6446
Merged-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
Bump versions of Go in .tool-versions file
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6455
Merged-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Approved-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: James Liu <jliu@gitlab.com>
Co-authored-by: Evan Read <eread@gitlab.com>
|
|
|
|
Add GetFileAttributes RPC
Closes #5569 and #5346
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6360
Merged-by: Toon Claes <toon@gitlab.com>
Approved-by: Toon Claes <toon@gitlab.com>
Reviewed-by: Sincheol (David) Kim <dkim@gitlab.com>
Reviewed-by: Justin Tobler <jtobler@gitlab.com>
Co-authored-by: David Kim <dkim@gitlab.com>
|
|
of attributes for a given set of files at a given revision.
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/5346
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/5569
Changelog: added
|
|
Prepare TransactionManager for multi-repo support
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6395
Merged-by: karthik nayak <knayak@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Reviewed-by: Sami Hiltunen <shiltunen@gitlab.com>
Reviewed-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Co-authored-by: Sami Hiltunen <shiltunen@gitlab.com>
|
|
into 'master'
tools/protolint: Update module github.com/yoheimuta/protolint to v0.46.1
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6454
Merged-by: karthik nayak <knayak@gitlab.com>
Approved-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Co-authored-by: GitLab Renovate Bot <gitlab-bot@gitlab.com>
|
|
TransactionManager's tests are writing quarantined objects always
into the same repository. This is fine so far as the TransactionManager
still works on just a single repository. The transaction now contains
the target repository's relative path. To prepare for operating and
testing on multiple repositories, change the tests to write the
quarantined objects into the transaction's target repository.
|
|
The TransactionManager will soon support multiple repositories in a
partition. While we've removed reliance on a given relative path in
most places, there are still few location left. This commit adds a
relative path field in the transaction in order store the target
repository of a transaction. The relativePath field of the manager
is now only used to set the target repository of the transaction.
Internally the manager would now support running transactions
targeting different repositories in the partition. Its Begin()
method however still doesn't allow for specifying the target
repsoitory but we're now set up to make that possible.
|
|
TransactionManager is currently expecting to operate on a single
repository and is using its relative path as database key. As the
partitions need to support multiple repositories, this will have
to change. This commit uses the partition's ID to construct its
database keys. The state the TransactionManager is currently storing
in the database is partition specific, not repository specific.
|
|
The TransactionManager is currently using the relative path of the
repository it is operating on to build database keys. This won't
work in the near future as we'll have to support multiple repositories
in a partition. Inject the partition ID into the TransactionManager so
we can use it to later build the database keys in a partition specific
manner without relying on a given repository's relative path.
|
|
Possible stale locks covering 'packed-refs' are currently always cleaned
from a given repository in the TransactionManager. As we need to support
multiple repositories in a partition, this commit changes the lock clean
up to take the relative path from the localrepo instance. That way the
clean up doesn't rely on the relativePath field of the manager as we
move towards removing it.
|
|
The TransactionManager is currently applying a log entry always
against the same repository. We're soon about to support multiple
repositories in a partition and we're storing the relative path of
the target repository in the log entry already. This commit changes
ApplyLogEntry to target the application against the relative path
stored in the log entry.
|
|
The TransactionManager will soon support multiple repositories in
a single partition. Right now there are assumptions in multiple places
that the TransactionManager only ever operates on a single repository.
One of these places is the LogEntry where we don't store the repository
these changes should target. Given the changes may target any of the
repositories in the partition, we need to store the relative path of
the repository to specify the target. This commit does so, and includes
the relative path in the log entry when committing it.
|
|
TransactionManager's repository field is a localrepo instance of the
repository the manager is operating on. As we'll have to support
multiple repositories in a partition, we can't rely on such a field
but must build a localrepo instance for the repository we're operating
on. As a step towards that direction, remove the repository field from
TransactionManager and build a localrepo where we need one. Currently
we still use the relativePath stored in the manager but we'll soon pipe
the relative path of the repository with the transaction.
|
|
The TransactionManager will soon learn to operate on multiple
repositories in a partition. It currently contains a repositoryPath
field that is used in various locations. As a step towards multi-repo
support, remove this field and replace it with getAbsolutePath that
can be used to compute a repository's location based on its relative
path.
|
|
TransactionManager currently assumes it operates on a single
repository. This is about to change as partitions need to support
multiple repositories to accommodate for object pools and their
members.
Repository existence checking is one location where this assumption
is made. The repository's existence is determined when initializing
and kept up to date as transactions create and delete the repository.
This becomes more difficult with multiple repositories. We don't want
to check the existence of possibly thousands of repositories in the
partition when initialzing for a read from a single repository. Holding
the state in memory also becomes more complicated.
This commit solves the problem by figuring out whether a repository
exists by checking for it in the filesystem when needed.
This only modifies how this is managed internally in TransactionManager.
External behavior should be unchanged.
|
|
gitaly/config: allow negotiation timeouts to be configured
Closes #5574
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6406
Merged-by: Sami Hiltunen <shiltunen@gitlab.com>
Approved-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Approved-by: Sami Hiltunen <shiltunen@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: James Liu <jliu@gitlab.com>
Co-authored-by: James Liu <jliu@gitlab.com>
|
|
|
|
proto: Add linter rules for comments
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6439
Merged-by: James Liu <jliu@gitlab.com>
Approved-by: James Liu <jliu@gitlab.com>
Approved-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Co-authored-by: Karthik Nayak <knayak@gitlab.com>
|
|
Update template for security-target label
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6436
Merged-by: Steve Abrams <sabrams@gitlab.com>
Approved-by: James Fargher <jfargher@gitlab.com>
|
|
|
|
Currently we don't have any linting rules for comments in proto files.
This has led to various styled being used in the comments and lack of
consistency. Enforce the styling for comments in proto files.
This commit also includes manual changes to comments in proto files to
adhere to these new rules. The changes are lazy in nature, mostly
minimal changes to simply conform to the new rules. This is to reduce
review load and also writing good comments is out of the purpose of this
commit.
|
|
The testproto directory contains proto files used for internal testing,
as such there are no comments here. Let's exclude these files from the
linter for now.
|
|
Convert all comments to small-case using a simple fish shell script:
for field in (cat log | grep -v Enum | grep Field | cut -d " " -f 3 | uniq | cut -d "\"" -f 2)
set -l pascal (echo $field | sed -r 's/(^|_)([a-z])/\U\2/g')
find . -type f -name "*.proto" | xargs sed -i "s/\/\/ $pascal/\/\/ $field/g"
end
This does produce some unwanted changes where fields which were expected
to be captial are reduced to smallcase, but reduces the overall surface
are for manual editing.
|
|
We want to add some linting rules to our proto files, but it seems like
there are a bunch of comments which use the filler line "// This comment
is left unintentionally blank.".
Let's use a python script to replace them with the field, message, rpc,
service or enum name.
Script:
import re
import sys
import os
from glob import glob
result = [y for x in os.walk(".") for y in glob(os.path.join(x[0], '*.proto'))]
for filename in result:
print("doing", filename)
file = open(filename, "r")
lines = file.readlines()
for i, line in enumerate(lines):
if "// This comment is left unintentionally blank." in line:
next_line = lines[i+1]
sub = ""
if re.search("^\s*rpc", next_line):
x = re.findall(r"rpc (\w*)\s?\(", next_line)
if len(x) != 1:
print("error finding rpc name:", next_line, "match:", x)
sys.exit()
sub = x[0]
elif re.search("^\s*message", next_line):
x = re.findall(r"message (\w*)\s?\{", next_line)
if len(x) != 1:
print("error finding message name:", next_line, "match:", x)
sys.exit()
sub = x[0]
elif re.search("^\s*service", next_line):
x = re.findall(r"service (\w*)\s?\{", next_line)
if len(x) != 1:
print("error finding service name:", next_line, "match:", x)
sys.exit()
sub = x[0]
elif re.search("^\s*enum", next_line):
x = re.findall(r"enum (\w*)\s?\{", next_line)
if len(x) != 1:
print("error enum service name:", next_line, "match:", x)
sys.exit()
sub = x[0]
else:
x = re.findall(r"([\w_]*) = \d+", next_line)
if len(x) != 1:
print("error finding field name:", next_line, "match:", x)
sys.exit()
sub = x[0]
lines[i] = line.replace("This comment is left unintentionally blank.", sub + " ...")
file.close()
file = open(filename, "w")
file.writelines(lines)
file.close()
|
|
It's perfectly legit for the `os.FileInfo` that is passed to the
callback function in `filepath.Walk()` to be `nil`. The most common
scenario where this happens is when the file has been concurrently
removed, which is a benign edge case.
Let's drop the error message. It doesn't add any meaningful information
and the logged errors would be benign anyway.
|
|
The ctxlogrus package is going away with the replacement being log
fields extracted from the context via `log.DebugContext()` et al.
Refactor the code to stop using ctxlogrus by injecting a logger and
using the new context-based logging methods.
|
|
The ctxlogrus package is going away with the replacement being log
fields extracted from the context via `log.DebugContext()` et al.
Refactor the code to stop using ctxlogrus by injecting a logger and
using the new context-based logging methods.
|
|
The ctxlogrus package is going away with the replacement being log
fields extracted from the context via `log.DebugContext()` et al.
Refactor the code to stop using ctxlogrus by injecting a logger and
using the new context-based logging methods.
|
|
The ctxlogrus package is going away with the replacement being log
fields extracted from the context via `log.DebugContext()` et al.
Refactor the code to stop using ctxlogrus by injecting a logger and
using the new context-based logging methods.
|
|
The ctxlogrus package is going away with the replacement being log
fields extracted from the context via `log.DebugContext()` et al.
Refactor the code to stop using ctxlogrus by injecting a logger and
using the new context-based logging methods.
|
|
The ctxlogrus package is going away with the replacement being log
fields extracted from the context via `log.DebugContext()` et al.
Refactor the code to stop using ctxlogrus by injecting a logger and
using the new context-based logging methods.
|
|
Fix a typo in a patch_id_test.go
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6434
Merged-by: Christian Couder <chriscool@tuxfamily.org>
Approved-by: Christian Couder <chriscool@tuxfamily.org>
Co-authored-by: David Kim <dkim@gitlab.com>
|
|
into 'master'
grpc/middleware: Absorb functionality of the field extractor into the metadatahandler
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6440
Merged-by: James Liu <jliu@gitlab.com>
Approved-by: James Liu <jliu@gitlab.com>
Reviewed-by: karthik nayak <knayak@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
Fix linting errors detected locally
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6438
Merged-by: James Liu <jliu@gitlab.com>
Approved-by: James Liu <jliu@gitlab.com>
Co-authored-by: Evan Read <eread@gitlab.com>
|
|
'master'
limiter: Integrate adaptive limit to per-RPC limiter
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6418
Merged-by: James Liu <jliu@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Approved-by: James Liu <jliu@gitlab.com>
Reviewed-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Reviewed-by: James Liu <jliu@gitlab.com>
Co-authored-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
|
|
Skip restore when a backup is not found
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6433
Merged-by: Justin Tobler <jtobler@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Justin Tobler <jtobler@gitlab.com>
Co-authored-by: James Fargher <jfargher@gitlab.com>
|
|
git: Speed up git-upload-pack(1) via new bitmap boundary traversal
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6428
Merged-by: Justin Tobler <jtobler@gitlab.com>
Approved-by: Justin Tobler <jtobler@gitlab.com>
Reviewed-by: Justin Tobler <jtobler@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
During packfile negotiation, git-upload-pack(1) needs to figure out
which objects to send to the client side. This is done by separating the
object graph into two sets:
- "Haves" are objects that the client already has available locally
and that the server thus does not need to send.
- "Wants" are objects that the clients wants to fetch.
The set of objects that git-upload-pack(1) needs to send is then the set
of wants with all haves excluded. The naive way to figure this out is to
walk through both of these sets and paint the objects accordingly. This
can be quite expensive though as history and thus the amount of objects
that need to be painted grows.
Git has thus introduced bitmaps which are generated for a subset of the
commits. For any such commit, the bitmap will immediately identify all
objects that are transitively reachable from that specific commit. So as
soon as we hit such a commit that has an associated bitmap, we can abort
the graph walk as we already know transatively reachable objects from
thereon.
The current algorithm that's used by Git does not perform well though in
repositories that have sparse bitmap coverage of commits, which is the
case especially in large monorepositories with many references. The
chance that any such bitmap is going to be beneficial diminishes as it
becomes less likely that we eventually hit the bitmapped objects. The
consequence is that the bitmap walk is oftentimes slower than the normal
walk in such cases.
To address the issue, Git has introduced a new algorithm in commit
b0afdce5da (pack-bitmap.c: use commit boundary during bitmap traversal,
2023-05-08). Instead of biulding up a complete bitmap first, the new
algorithm only starts to build up bitmaps when the boundary between
haves and wants is hit. This reduces the overhead of bitmaps by just
building up bitmap coverage when required.
This new algorithm has the downside that it's not exact anymore and may
thus lead to too many objects being sent over the wire. This is not a
huge problem in general though, as the normal non-bitmapped walk already
has the same property. This is traded in for increased performance
though especially in the case where there are many haves and only a few
wants.
This commit enables the new traversal for git-upload-pack(1), which does
exactly such a large reachability check via `git rev-list --not --all`
in its connectivity check. It's not yet clear whether this will indeed
result in a speedup, but by putting this behind a feature flag we should
be in a good position to test.
[1]: https://github.blog/2023-08-21-highlights-from-git-2-42/#faster-object-traversals-with-bitmaps
|
|
We're about to introduce a feature flag that adds a Git configuration
key to just a single command. We don't really have a nice way to do this
right now though because the per-command options we have don't accept a
context.
Refactor the code so that we have a context available and can make use
of feature flags.
|
|
Merge the functionality that the field extractor provides into our
requestinfohandler logic. This unifies two related functionalities and
will eventually make it easier to move away from the tags mechanism that
is going away in go-grpc-middleware v2.
Add tests to make sure that the requestinfohandler interceptors work as
expected.
|
|
The fieldextractors take the initial gRPC request, extract important
information like the repository scope or the object pool, and then
inject this information into our tags. Like this, the information will
be picked up by our logging infrastructure when used with the ctxlogrus
package.
Part of what we report is related to namespace requests. As we are about
to merge the fieldextractors into our requestinfohandler, we'd have to
also carry over these kinds of requests. The namespace service is not in
use nowadays though anymore and will soon be removed, so porting over
this logic would be a waste of time.
Let's remove the logic and be done with it.
|
|
The way we inject tags into the context is a bit convoluted.
Furthermore, we'll soon need to re-inject tags a second time for
streaming RPCs. Let's prepare for this by moving the logic into a
separate function.
|
|
Convert the function that reports Prometheus metrics a receiver function
of the `requestInfo`. This makes the logic more self-contained as we
have to pass less variables into it and can readily reuse what we store
inside of the struct already anyway.
|
|
We have two different mechanisms to inject information about the current
gRPC call into the context:
- The metadatahandler injects data about the metadata, e.g. the peer
information and whether a call is an accessor or a mutator.
- The field extractor that gets set up via go-grpc-middleware's tags
interceptor and that injects data about the request.
The tagging mechanism used by the field extractor is going away in the
go-grpc-middleware dependency though without replacement. Furthermore,
both functionalities are related to each other and can thus easily share
the same infrastructure. We are thus about to merge them.
As a first step, let's rename the metadatahandler to requestinfohandler
such that it better reflects the new reponsibility when it will start to
handle both gRPC call metadata and request information.
|
|
In order to retain correlation IDs of replication jobs between the time
they are added to the database and the time when they are processed, we
store the correlation ID as part of the database entry. We store this
field generically in the `params` section, which is backend by a JSON
encoded column, where we use key-value pairs.
The way we use the key here is a bit of a mess though, as we reuse the
same key that our metadatahandler uses to inject the correlation ID for
logging purposes. It goes without saying though that it is not a good
idea to conflate a string for representation (logging) and for storing
data consistently (the replication parameters). While the key is in fact
the same right now, it could totally be that it might change for our
logging infrastructure at some point or even go away.
Separate the concerns by introducing a new constant in the `datastore`
package that holds the key we want to use in the database. When logging
data, we now consistently use the key from `correlation.FieldName`,
which is shared across GitLab. And for the database, we consistently use
the newly introduced constant.
|
|
sentryhandler: Refactor skipping of logs to not use tags
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6424
Merged-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Reviewed-by: Quang-Minh Nguyen <qmnguyen@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
|
|
In v2 of grpc-go-middleware, the tags functionality that allowed
middleware to add arbitrary information to the request context has been
removed without replacement. In our sentryhandler though we use this
mechanism in order to mark RPCs that should not be logged via Sentry.
Refactor the code to use our own context key that we inject into the
context via which we track whether logging should be skipped. This is
another step towards the upgrade to grpc-go-middleware v2.
|