Age | Commit message (Collapse) | Author |
|
|
|
|
|
|
|
|
|
Issue: https://gitlab.com/gitlab-org/gitaly/-/issues/4335
|
|
linguist: Implement Stats in pure Go
See merge request gitlab-org/gitaly!4580
|
|
git: Fix commit-graph corruption caused by corrected committer dates
Closes #4327 and gitlab#365903
See merge request gitlab-org/gitaly!4677
|
|
testhelper: Replace `testhelper.ModifyEnvironment()`
See merge request gitlab-org/gitaly!4685
|
|
In staging systems we have observed corruption in commit-graphs with the
following error message:
fatal: commit-graph requires overflow generation data but has none
This bug is caused by the rollout of Git v2.36.0, which has fixed a set
of bugs with reading corrected committer dates in commit-graphs [1].
Unfortunately, these fixes surface a corruption in commit-graphs that
can happen when upgrading a commit-graph written by Git v2.35.0 with a
Git version of v2.36.0 or later with `--changed-paths` enabled [2].
Disable use of corrected committer dates for now. Due to the bug that
existed in Git v2.35.0 and earlier we haven't ever read them anyway, so
this is not a performance regression for us. Instead, we'll continue to
use topological generation numbers to still speed up certain queries.
We should reenable them when the bug has been fixed upstream.
[1]: http://public-inbox.org/git/pull.1163.git.1645735117.gitgitgadget@gmail.com/
[2]: https://public-inbox.org/git/DD88D523-0ECA-4474-9AA5-1D4A431E532A@wfchandler.org/
Changelog: fixed
|
|
[ci skip]
|
|
Add an option to skip flat paths for tree entries
See merge request gitlab-org/gitaly!4693
|
|
To make sure we're not breaking things when we'll switch to go-enry for
the language detection, compare the known languages of the linguist gem
with the Go package.
|
|
The go-enry package is based of github-linguist v7.20.0. To be able to
compare the set of languages they both know, we update the Gem to match
the version where the go package got the mustard from.
|
|
The main reason we wrote the Go implementation for linguist is getting
rid of Ruby. But we don't want this implementation to be slower. So to
compare the performance, this change adds a benchmark of both
implementations.
These are the results running it on my computer:
$ go test -run=^$ -bench=. -benchtime=4x ./internal/gitaly/linguist
goos: linux
goarch: amd64
pkg: gitlab.com/gitlab-org/gitaly/v15/internal/gitaly/linguist
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
BenchmarkInstance_Stats/go_language_stats=false/from_scratch-8 4 56371415917 ns/op
BenchmarkInstance_Stats/go_language_stats=false/incremental-8 4 25310716778 ns/op
BenchmarkInstance_Stats/go_language_stats=true/from_scratch-8 4 3149285756 ns/op
BenchmarkInstance_Stats/go_language_stats=true/incremental-8 4 1402539266 ns/op
Getting the stats from scratch drops from roughly 56s to 3.1s, which is
impressive. Getting stats incrementally, drops from about 31s to 1.4s,
which is also pretty nice.
On the topic of cache size, these are the file size for both
implementations:
* Ruby: 473993 bytes
* Golang: 436335 bytes
These are comparable in size and it's a relief to see it's not
significantly larger in the new implementation.
|
|
This change adds an alternative implementation of linguist.Stats using
go-enry as a pure Go solution. The code is behind a default disabled
feature flag 'go_language_stats'.
Issue: https://gitlab.com/gitlab-org/gitaly/-/issues/2571
Changelog: performance
|
|
We're about to introduce an implementation of getting the language
statistics in Go. For this we need a way to collect and store the
results in a cache. This cache will be used to incrementally calculate the
stats between commits.
This change introduces the linguist.languageStats struct which will deal
with all this.
|
|
In some cases you might only have access to a localrepo.Repo instance,
and directly to the locator. To make it more convenient to find the
TempDir for the storage where the repo is on, add a helper method to
Repo that will use it's storage.Locator to determine the temporary dir
for the storage.
It also ensures this temp dir exists.
|
|
This change adds a function to get a RevisionIterator from the output of
git-ls-tree. This iterator will loop through all files that are
reachable from the given revision.
|
|
This change adds a function to get a RevisionIterator from the output of
git-diff-tree. This iterator will loop through all the new objects that
has been introduced between the two given revisions.
|
|
We're about to add an alternative implementation for the Stats method,
written in Go, and for that we need a few different things. This change
prepares for that.
|
|
It was a little bit hard to debug the incorrectness of the expected
languages and their attributes. With this change we use
`testhelper.ProtoEqual` to compare the expected with the actual language
statistics. This will provide a better error message when there is a
mismatch.
|
|
We're about to make some changes in the handling of the CommitLanguages
rpc, and because I noticed this RPC did not have any documentation, I've
decided to add it.
|
|
Populating flat paths for large trees may be expensive
Let's add an option to skip it (to preserve the default behaviour)
in order to be able to avoid this expensive operation when it's
not necessary
Changelog: added
|
|
'master'
testcfg: Fix building binaries as unprivileged user with Go 1.18+
See merge request gitlab-org/gitaly!4689
|
|
featureflag: Remove raw feature flags
See merge request gitlab-org/gitaly!4675
|
|
Makefile: Add target to debug go tests in Delve
See merge request gitlab-org/gitaly!4662
|
|
ci: Upgrade build images to most recent ones
See merge request gitlab-org/gitaly!4688
|
|
We have changed the Postgres client version due to our update to a more
recent GitLab Build Image, which caused some minor changes in the
Praefect schema. Update the schema to match.
|
|
Our CI pipelines execute tests as unprivileged user so that we can
verify that all tests pass without any additional capabilities and that
tests don't write any data into the source tree. This broke for our
nightly jobs though with the recent update to Go 1.18 because we now
fail to build the auxiliary Go binaries.
The root cause is a combination of two things:
- Go 1.18 started to query the repository for VCS information so
that it can embed that information into the resulting Go binaries.
- Git has addressed CVE-2022-24765 and won't open repositories by
default anymore that aren't owned by the current user.
So when testing with recent Go and Git versions as unprivileged user
then Go will try to extract the VCS information, but Git will refuse to
operate.
Fix this by declaring the source directory as "safe". While this is not
necessarily the case, it doesn't compromise our security any more than
before: if an adversary was able to write to the `.git/config` file,
then the very same adversary would also able to just touch up the source
code of Gitaly itself. So that adversary could obtain arbitrary code
execution by just changing whatever is executed as part of our tests.
|
|
Allow manually executing nightly jobs that test against Git's `master`
and `next` branches to easily allow testing changes against these Git
versions.
|
|
go: Update module gocloud.dev to v0.25.0
See merge request gitlab-org/gitaly!4683
|
|
The GitLab Build Images created for Gitaly have been simplified to
neither include Postgres nor PgBouncer given that both are not used by
our CI pipelines anymore. Furthermore, while not really important to us,
the Git version was bumped to v2.36.0.
Update our default build image to switch to the most recent ones.
|
|
Remove the `testhelper.ModifyEnvironment()` helper. One should instead
either use `t.Setenv()` or `testhelper.Unsetenv()`.
|
|
Rewrite callers of `testhelper.ModifyEnviroment()`, which is about to be
removed in favor of either `t.Setenv()` or `testhelper.Unsetenv()`.
|
|
The tests for our environment helpers aren't quite matching our modern
best practices. Furthermore, they're missing tests for the case where
environment variables are not set at all.
Refactor these tests to adapt. While at it, convert them to use the new
`t.Setenv()` and `testhelper.Unsetenv()` helper functions.
|
|
With Go 1.17 a new `t.Setenv()` helper was introduced that automatically
restores the previous environment variable and that verifies that the
current test is not labelled as parallel. We're about to migrate all
callers to use it instead of `testhelper.ModifyEnvironment()`.
One part that `testhelper.ModifyEnvironment()` does though is to unset
an environment variable in case the given value is the empty string. And
unfortunately, Go didn't introduce a `t.Unsetenv()` helper at the same
time.
Implement a new function `testhelper.Unsetenv()` that behaves the same
as `t.Setenv()`, except that it unsets the environment variable.
|
|
|
|
go: Update module google.golang.org/grpc to v1.47.0
See merge request gitlab-org/gitaly!4651
|
|
repository: Verify behaviour when fetching into pooled repos
Closes #4304
See merge request gitlab-org/gitaly!4671
|
|
gittest: Convert most tests that use worktrees
See merge request gitlab-org/gitaly!4666
|
|
Adjust infrastructure to use Go v1.17
See merge request gitlab-org/gitaly!4684
|
|
The difference between raw and non-raw feature flags has been really
confusing at times: while a non-raw flag has the format "some_a", a raw
flag has the format "gitaly-feature-some-a". "Raw" in this context thus
refers to the metadata as used in the gRPC context.
To avoid this confusion and make the difference more distinguished and
to improve type safety the preceding commits have replaced all usage of
raw feature flags with actual `FeatureFlag` structures. In case one
needs access to the raw flag one would thus now use `MetadataKey()` to
obtain the raw metadata key.
Hopefully, this should clarify things when we use `FeatureFlags` in all
of our codebase in contrast to using raw ones in some places and non-raw
ones in others.
|
|
We're currently injecting raw feature flags into the hooks payload so
that these flags can be set in the outgoing context that's used to
connect back to Gitaly again. We're reducing the use of raw feature
flags though in favor of using the typed `FeatureFlag` structure.
Convert the hooks payload to instead inject `FeatureFlag`s directly with
their respective value so that we can get rid of raw feature flags.
This is technically-seen a breaking change because any old server that
is running would still inject the old set of feature flags which the new
binary doesn't understand anymore. As a consequence, it wouldn't use any
feature flags at all. This is fine though: none of our hook RPCs is
using feature flags right now, so this doesn't change anything.
|
|
When injecting default values for feature flags into the outgoing gRPC
context of Gitaly peers we're using `RawFlags()` to extract currently
set feature flags. This function is going away in favor of the type-safe
`FromContext()` function.
Migrate the coordinator to use `FromContext()`.
|
|
In order to guarantee a persistent view of enabled and disabled feature
flags the coordinator makes sure to inject all feature flags that aren't
explicitly set in the gRPC context with their default flags as seen by
Praefect. This fixes a race where the default of any feature flag may
change during an upgrade that can lead to the flag being default-enabled
on some Gitaly nodes, while it's default-disabled on others.
This mechanism has a bug though: we first convert the incoming context
to an outgoing one and thus retain all metadata. We then compute the
full set of feature flags we wish to inject, including already-set ones.
And then finally amend this metadata to the outgoing context. The end
result is that any feature flag that has been explicitly set in the
incoming context will be set twice in the outgoing context.
Fortunately, this bug is benign because we always use the same value in
both cases, and our feature flagging code only ever checks the first
value. So this bug doesn't cause any adverse effects.
Fix it regardless by only setting feature flags in the outgoing context
that haven't yet been set before.
Changelog: fixed
|
|
Convert the logic to inject and extract feature flags in `gitaly-git2go`
to not use raw feature flags anymore. Instead, use `FromMetadataKey()`
and `FromContext()` to work on typed `FeatureFlag`s.
This prepares for the removal of `featureflag.Raw`.
|
|
The featureflag subcommand is only used for testing purposes to assert
that we can indeed pass feature flags from the `git2go.Executor()` to
the process we're about to spawn. We're using raw feature flags for this
purpose, but we're about to remove them in favor of using the type-safe
`FeatureFlag` structure.
Convert the subcommand to return a new `FeatureFlag` structure that we
can use to more thoroughly check whether things behave as expected and
remove usage of `RawFromContext()` to prepare for its removal.
Note that while this is a breaking change to the calling interface of
the `gitaly-git2go` executable, this command is only ever used in our
tests. So in practice, this doesn't matter.
|
|
Extract the logic to construct a feature flag from its gRPC metadata key
from `FromContext()` into a new `FromMetadataKey()` function. This will
be used at a later point.
|
|
To extract feature flags from the current context you can call the
`AllFlags()` function. Just from looking at that function's name it's
not clear though what it returns. As it turns out, it returns the flag
as plain strings in the format of "${flag_name}:${flag_value}". This
format is quite inaccessible to callers given that they'd have to
manually split name and value and then also know to parse the value.
This makes for an easy-to-misuse interface.
Replace `AllFlags()` with a new type-safe `FromContext()` function that
returns a map of `FeatureFlag`s mapped to their respective value. Like
this it's immediately clear what you obtain and enjoy all the comfort of
using the various functions on the feature flag itself.
Refactor callers to use this new function.
|
|
Feature flags must follow a specific format that is of the form
`some_feature_flag`. Most importantly, flags must not contain some
special characters:
- They must not contain dashes because we need to replace dashes
with underscores and the other way round when converting between
the gRPC metadata key and the actual feature flag name. So if the
name contained a dash, then the name wouldn't round-trip.
- The must not contain colons because we use colons to separate the
feature flag name from its value.
Disallow creating feature flags like this and convert tests that violate
this rule.
|