Age | Commit message (Collapse) | Author |
|
To make sure we're not breaking things when we'll switch to go-enry for
the language detection, compare the known languages of the linguist gem
with the Go package.
|
|
The go-enry package is based of github-linguist v7.20.0. To be able to
compare the set of languages they both know, we update the Gem to match
the version where the go package got the mustard from.
|
|
The main reason we wrote the Go implementation for linguist is getting
rid of Ruby. But we don't want this implementation to be slower. So to
compare the performance, this change adds a benchmark of both
implementations.
These are the results running it on my computer:
$ go test -run=^$ -bench=. -benchtime=4x ./internal/gitaly/linguist
goos: linux
goarch: amd64
pkg: gitlab.com/gitlab-org/gitaly/v15/internal/gitaly/linguist
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
BenchmarkInstance_Stats/go_language_stats=false/from_scratch-8 4 56371415917 ns/op
BenchmarkInstance_Stats/go_language_stats=false/incremental-8 4 25310716778 ns/op
BenchmarkInstance_Stats/go_language_stats=true/from_scratch-8 4 3149285756 ns/op
BenchmarkInstance_Stats/go_language_stats=true/incremental-8 4 1402539266 ns/op
Getting the stats from scratch drops from roughly 56s to 3.1s, which is
impressive. Getting stats incrementally, drops from about 31s to 1.4s,
which is also pretty nice.
On the topic of cache size, these are the file size for both
implementations:
* Ruby: 473993 bytes
* Golang: 436335 bytes
These are comparable in size and it's a relief to see it's not
significantly larger in the new implementation.
|
|
This change adds an alternative implementation of linguist.Stats using
go-enry as a pure Go solution. The code is behind a default disabled
feature flag 'go_language_stats'.
Issue: https://gitlab.com/gitlab-org/gitaly/-/issues/2571
Changelog: performance
|
|
We're about to introduce an implementation of getting the language
statistics in Go. For this we need a way to collect and store the
results in a cache. This cache will be used to incrementally calculate the
stats between commits.
This change introduces the linguist.languageStats struct which will deal
with all this.
|
|
In some cases you might only have access to a localrepo.Repo instance,
and directly to the locator. To make it more convenient to find the
TempDir for the storage where the repo is on, add a helper method to
Repo that will use it's storage.Locator to determine the temporary dir
for the storage.
It also ensures this temp dir exists.
|
|
This change adds a function to get a RevisionIterator from the output of
git-ls-tree. This iterator will loop through all files that are
reachable from the given revision.
|
|
This change adds a function to get a RevisionIterator from the output of
git-diff-tree. This iterator will loop through all the new objects that
has been introduced between the two given revisions.
|
|
We're about to add an alternative implementation for the Stats method,
written in Go, and for that we need a few different things. This change
prepares for that.
|
|
It was a little bit hard to debug the incorrectness of the expected
languages and their attributes. With this change we use
`testhelper.ProtoEqual` to compare the expected with the actual language
statistics. This will provide a better error message when there is a
mismatch.
|
|
We're about to make some changes in the handling of the CommitLanguages
rpc, and because I noticed this RPC did not have any documentation, I've
decided to add it.
|
|
featureflags: Require version and rollout issue URL for flags
Closes #4312
See merge request gitlab-org/gitaly!4676
|
|
[ci skip]
|
|
go: Update module github.com/uber/jaeger-client-go to v2.30.0
See merge request gitlab-org/gitaly!4681
|
|
|
|
go: Update module github.com/kelseyhightower/envconfig to v1.4.0
See merge request gitlab-org/gitaly!4649
|
|
When creating a new feature flag it is mandatory to create a feature
flag rollout issue that keeps track of the current status of any flag.
This issue is used both to not forget about rolling out a flag, and also
for the sake of documenting whole rollout process. We have nothing in
our process though that enforces creation of any such issue, and thus it
happens from time to time that we forget to create it.
Add two new parameters to `NewFeatureFlag()` that document the version a
flag has been introduced in and the URL of the issue that's tracking the
rollout. This brings us several benefits:
- It's hard to forget about creating the issue given that you now
have to specify the link when creating the flag.
- Reviewers immediately spot a missing link and can verify that it's
referring to the correct flag. Furthermore, they may review the
rollout issue itself to see whether it documents important
details.
- When checking up on a feature flag after it has been merged it
becomes trivial to find the rollout issue as a developer.
Furthermore, this commit also adds a new version parameter to keep track
of the version a flag has been introduced in. This also makes it easier
for reviewers to see whether a flag can really already be removed or
not.
Adjust all existing feature flag definitions to have this information.
|
|
Add FullPath RPC
See merge request gitlab-org/gitaly!4642
|
|
FullPath reads the path from the repository's gitconfig under the key
"gitlab.fullpath".
|
|
FullPath will allow fetching the value previously set by the SetFullPath
RPC. FullPath assists admins to determine which repo on disk is
associated with which repository on gitlab.
Changelog: added
|
|
|
|
build: drop support for Go 1.16
See merge request gitlab-org/gitaly!4522
|
|
|
|
|
|
[ci skip]
|
|
[ci skip]
|
|
[ci skip]
|
|
'master'
command: Export environment variable to force-enable all feature flags
Closes #4318
See merge request gitlab-org/gitaly!4672
|
|
repository: Fix clone credentials leaking via command line arguments
See merge request gitlab-org/gitaly!4668
|
|
In Rails we're setting an environment variable that force-enables all
feature flags in Gitaly so that we can ensure that newly introduced code
hidden behind any flag is still compatible with the old way of doing
things in Rails. With a recent change though we have started to export
feature flags to the `gitaly-git2go` child process so that it knows
which flags have explicitly been enabled and/or disabled. In combination
both these features result to inconsistent state: we don't export the
environment variable to the child process, and neither do we inject the
force-enabled feature flags. This means that we'll now see all feature
flags as enabled in Gitaly, while we see the default values in the child
process.
Fix this by exporting the environment variable to child processes. It
should only ever be set for tests, so that should be a fine thing to do.
And by exporting it we now have the same and thus consistent view of
overridden feature flags.
Changelog: fixed
|
|
Keep context cancel suppression logic consistent
See merge request gitlab-org/gitaly!4656
|
|
operations: Honor feature flag for structured errors in UserCherryPick
See merge request gitlab-org/gitaly!4669
|
|
The gitaly-git2go command knows to return two different styles of
conflict errors depending on whether the `CherryPickStructuderErrors`
feature flag is set or not. This is done to help zero-downtime upgrades:
we can't yet return the new-style error when the old Gitaly server is
still running.
Because the server controls the binary it shouldn't ever happen that the
binary doesn't see the feature flag while the server does. It's thus
pointless to handle the old-style `HasConflictsError` on the server side
if the flag is enabled: this rather indicates an unexpected programming
error, and we should treat it as such.
Remove special handling of this error condition to clarify this.
|
|
With b253d7c82 (operations: Convert UserCherryPick to return structured
errors, 2022-06-16), we have converted the UserCherryPick RPC to return
structured errors instead of errors that are embedded into a successful
response. While some of the new errors are covered by a feature flag,
this flag currently really only covers the case where the RPC ran into a
merge conflict. All the other newly introduced structured errors don't
honor to the feature flag at all and are returned unconditionally.
This breaks backwards compatibility during zero-downtime upgrades where
Rails may not yet be prepared to handle our new structured errors, and
thus it may start to misbehave when running an old version with a new
Gitaly server. So let's fix this by properly paying attention to the
feature flag so that we continue to return the old-style errors embedded
inside of the successful response in case the flag is disabled.
Changelog: fixed
|
|
On Linux systems, users can by default observe all command line
arguments of processes that may not even be owned by the same user by
inspecting `/proc/${PID}/cmdline`. This means that we mustn't ever put
any credentials in command line arguments given that unprivileged users
can easily sniff them out. Instead, we typically use the environment to
place credentials, which by default is not readable by any other users.
While we do this in most cases, we don't in `CreateRepositoryFromURL()`.
This RPC uses `cloneFromURLCommand()`, which seemingly knows that it's
bad to put credentials on the command line and thus strips them from the
URL already. But afterwards it goes out of its way to put them on the
command line anyway by creating a set of `http.extraHeader` config
options that are passed to git-clone(1) via the `-c` switch.
Fix this by using `git.WithConfigEnv()` instead of `git.WithConfig()`,
which puts the credentials into the environment instead of the command
line.
Changelog: fixed
|
|
Simplify the code flow of `cloneFromURLCommand()` a bit so that things
that are related are located in direct succession. Also, rename the
`pwd` variable to make it harder to confuse with the parent working
directory.
|
|
The test to verify that `cloneFromURLCommand()` behaves as expected is
spawning a git-clone(1) command, but doesn't make sure that this command
terminates as expected. This has resulted in failed pipelines due to the
process leak checker detecting this escaped process.
Fix this flake by killing the process by cancelling the context and then
waiting for the process to return.
|
|
When cloning a repository via `cloneFromURLCommand()`, we strip
credentials from the URL in order to not leak them. The test we have
that verifies that the credentials are not part of the resulting
command's arguments is wrong though: it checks that the array of args
doesn't have the complete user information, but it only checks for an
exact match. What we want to check instead is that none of the args
carries the credentials.
Fix the check by looping through all arguments and verifying that none
of them contain the credentials. Also, rename the username from the
rather generic `"user"` to something that is less likely to clash with
any host information like e.g. paths.
|
|
ci: Fix FIPS jobs failing in forks of Gitaly
See merge request gitlab-org/gitaly!4665
|
|
We have two test cases for `cloneFromURLCommand()` that roughly do the
same thing, except that the way they set up credentials is different.
Merge them into a single, table-driven test.
|
|
With 72497fc37 (ci: Add jobs which exercise Gitaly in FIPS mode,
2022-06-14), we have added a set of jobs which exercise Gitaly in FIPS
mode. These jobs require a special runner that has booted into FIPS mode
itself. Gitaly has been manually assigned such runners, but they are not
generally available for any of Gitaly's forks. Consequentially, trying
to run these jobs in any of our forks will eventually cause them to time
out because no runner could be acquired. And because our first job rule
for the FIPS jobs will automatically run when changes are merged to the
default branch, this causes pipelines to fail in our forks.
Fix this issue by only automatically creating these jobs when run in the
"official" repository. In case we're not running in that repository
we'll instead just create these jobs with a manual trigger.
|
|
We already have helper.SuppressCancellation to suppress the cancel of a
given context and keep context Values and everything else unchanged. So
it is better to reuse that.
Signed-off-by: blanet <moweng.xx@alibaba-inc.com>
|
|
Signed-off-by: blanet <moweng.xx@alibaba-inc.com>
|
|
go: Update Postgres dependencies
See merge request gitlab-org/gitaly!4647
|
|
ssh: Extend tests to emulate client-side failure conditions with sidechannel
See merge request gitlab-org/gitaly!4615
|
|
repository: Always use long-running LFS filter process in GetArchive
Closes #4282
See merge request gitlab-org/gitaly!4661
|
|
operations: Always enable structured errors in UserDeleteBranch
Closes #4286
See merge request gitlab-org/gitaly!4660
|
|
The command package is not only used to spawn Git commands, but may also
spawn any other arbitrary command. It's thus quite misleading that some
of the error messages are mentioninng "GitCommand"s as context.
Reword these errors to be command-agnostic. While at it, let's also use
the `%w` formatter to chain errors together.
|
|
The read monitor is implemented via a Unix pipe, which means that its
file descriptors need to always be closed by us or otherwise we have the
potential for a resource leak. This is done by spawning a Goroutine that
waits for the context to be cancelled. Unfortunately, when the context
is cancelled when spawning a command that is about to use these file
descriptors then we now run into a race: `exec.Cmd.Start()` may just be
about to duplicate those file descriptors to pass them to the child
process while we're in the process of closing them.
Fix this race by cleaning them up inside of the command's finalizer
instead. Like this we ensure that the descriptors are only closed after
we have already reaped the child process.
Changelog: fixed
|
|
We have cases where we need to clean up resources as soon as a command
has finished. Add the ability to set up a finalizer that runs exactly
once after the command has been reaped to make it easier to implement
this in a race-free manner.
|