Age | Commit message (Collapse) | Author |
|
We have finished the migration to bundled Git v2.35.1.gl1 in v14.10. Due
to concerns with zero-downtime upgrades we couldn't yet remove the old
version though. But now that we have waited for a release we can finally
remove the old version.
Remove the infrastructure to build and install bundled Git v2.33.1.gl3.
Changelog: removed
|
|
The background verifier sets a lease time on a replica when it picks
it up for verification. If the worker dies for some reason, the lease
will remain in place and no other worker will pick up the replica for
verification again until the lease is cleared. The lease itself tells
the maximum time the worker itself would be working on the replica.
After it has been passed, it would be safe for another worker to pick
up the replica for verification again. This commit adds a background
goroutine that periodically releases expired leases so other workers
can take up the work if the original worker failed and did not release
the lease. The 'verificaton_leases' index is added so the query can
efficiently find the replicas with leases acquired to find the stale
ones.
|
|
This commit adds the necessary schema changes for the metadata
background verification. Each replica receives two new columns:
1. 'verified_at' which contains the timestamp of the last successful
verification of the replica. This effectively allows for identifying
replicas that are in need of reverification.
2. 'verification_leased_until' which contains a timestamp until which
a worker has acquired a lease to reverify the repository. This prevents
multiple workers from picking the same repository for reverification at
the same time.
'verification_queue' index is added to index replicas which have not been
acquired by any worker. This allows for efficientl querying replicas that
are in need of reverification later.
Changelog: other
|
|
The current set of Git patches got quite big, and consequentially it's
hard to see which patches belong to what version. Reorder them into a
per-version subdirectory so that the grouping is clear. Furthermore,
this allows us to find all patches via wildcards instead of having to
manually list them in our Makefile.
|
|
|
|
The test-boot script doesn't currently support bundled Git, so it will
only be able to bootstrap a server when a Git distribution is available.
Adapt the script to also support bundled Git so that we can use it in
contexts where it is the only Git execution environment available.
|
|
The rollout of Git v2.35.1.gl1 is currently blocked in production
because we have discovered various repositories where broken references
cause errors with the new version. So right now, the new Git version is
still disabled in all deployments, also because the feature flag is
default-disabled, as well.
This blockade gives us a last chance to sneak in some patches for Git
v2.35.1.gl1 which have landed in `next` meanwhile. Missing this chance
would mean that we'd have to wait a few more releases until we can land
another revision, or otherwise we'd be running with three different Git
versions in production at the same time: the current Git v2.33.1.gl3,
then Git v2.35.1.gl1, and finally the new version we'd be about to
introduce. So let's take this chance and slightly bend the rollout
process of new Git versions.
Apply commits from 60aae8731c (Merge branch 'ps/fetch-mirror-optim' into
next, 2022-03-08) into Git v2.35.1.gl1. These patches optimize fetches
by:
- Patch 1/5 starts to look up up "want" lines via the commit-graph,
resulting in a 7% speedup.
- Patch 2/5 avoids repeated lookups of commits that were only
required when writing to FETCH_HEAD, providing a 8% speedup.
- Patches 3-5/5 start to skip reading packed-refs when searching for
symbolic refs in the context of a fetch. This provides a 13%
speedup.
All benchmarks have been performed in www-gitlab-com.
Changelog: performance
|
|
We're currently still stuck with Git v2.33.1.gl2 because we haven't yet
been able to roll out bundled Git to production, and we do want to make
use of feature flags to switch to newly introduced Git versions. Bundled
Git should soon be landing in production though, so this commit starts
to build and install Git v2.35.1.gl1 such that we're prepared to quickly
roll out bundled Git and then the new Git version.
Note that this also backports two patch series from the upcoming Git
v2.36.0 release:
- Patches 21-26 fix an issue with the files refs backend, which
would execute the reference-transaction hook for references that
were about to be deleted twice: once for the loose ref itself, and
once for the packed backend. This is a performance issue, but also
required us to have a workaround that filters out any reference
transactions which only consisted of deletes.
- Patches 27-34 fix the `--atomic` flag of git-fetch(1) to also
cover deletion of references and backfilling of tags. This has
caused excessive voting via reference transactions in production
because the reference-transaction hook was executed twice for each
reference modified via those mechanisms. It is thus a correctness
fix because we don't commit changes to disk anymore if any of the
other ref updates fails, and a performance fix for mirror fetches
because we perform less voting.
|
|
git: Remove support for the Ruby hooks directory
Closes #4006
See merge request gitlab-org/gitaly!4356
|
|
The test-boot script verifies that Gitaly comes up correctly and is used
as a very basic smoke test in our CI. As part of the bringup it writes
the gitlab-shell secret into the `ruby/git-hooks` directory such that
Gitaly is able to find the secret, or otherwise it would fail to boot.
This directory is going away, so we need to find a better location. And
we do in fact already have one: we can just write it into a subdirectory
of the temporary directory which already contains all the other runtime
data required for a successful boot. Convert the script to do so.
|
|
We have upstreamed two patches for git-fetch(1) which speed up mirror
fetches in repositories with hundreds of millions of references. The
first patch makes better use of the commit-graph when computing common
objects, and the second patch will cause us to skip computing the output
width when fetching with the `--quiet` flag.
The patches were tested with our notorious problem repo `www-gitlab-com`
with speedups of about 25% when doing mirror-fetches.
Backport them into our own Git version now that they have been merged
via 2b331293fb (Merge branch 'ps/fetch-optim-with-commit-graph' into
next, 2022-02-14).
Changelog: performance
|
|
Previously attempting to destroy a cluster would result in the following errors:
```
Error: Error, failed to deleteuser praefect in instance <instance name>
```
```
Error: Error, failed to delete instance because deletion_protection
is set to true. Set it to false to proceed with instance deletion
```
For the `deleteuser` error,
https://github.com/hashicorp/terraform-provider-google/issues/7677 has
added the `ABANDON` flag, which according to
https://github.com/GoogleCloudPlatform/magic-modules/pull/4231:
```
Setting ABANDON allows the resource to be abandoned rather than
deleted. This is useful for Postgres, where users cannot be deleted from
the API if they have been granted SQL roles.
```
For the `delete_protection` error, as described in
https://stackoverflow.com/a/64911997, recent versions of Terraform sets
`delete_protection` on by default. We need to disable this to be able to
destroy this Cloud SQL instance.
Changelog: fixed
|
|
7b20a6045e8d2a25c86633461c03b13353915643 introduced support for SSH
signatures to the `catfile` package. The GetCommitSignatures RPC uses a
different implementation. This implementation is already compatible with
SSH sigantures, but is not tested with them. This change adds tests
which use SSH signatures.
Changelog: other
|
|
Changelog: performance
|
|
Given that our Praefect database schema is seeded by migrations, there
is no single place which gives an overview over the complete schema.
While this is information that can be easily obtained by connecting to a
Praefect database, this workaround doesn't really work across different
Praefect versions given that one would have to re-seed the database for
each version one is about to investigate. This makes it hard to get info
about the current database schema and to compare changes to the schema
between different versions.
Introduce a new script "generate-praefect-schema" to fix this issue. The
script connects to a Postgres server, creates a new database, seeds the
database by executing `praefect sql-migrate` and then dumps the
resulting schema. Furthermore, a new Makefile target automates this and
writes the dump into our `_support` directory.
The intent is that whenever the database changes, the author of those
changes must update the schema in our `_support` folder such that it's
easily possible to diff the schema across different versions in our Git
history.
To ensure that the schema is always up to date, a new CI job is added
which performs the dump and then checks that no changes occurred.
|
|
The Praefect smoke test job sets up the Postgres service on its own,
even though we have the `postgres_definition` which would already go
most of the way of setting it up correctly.
Convert the test to use the shared Postgres service definition instead
to avoid code duplication. The only thing that needs to change for the
definition is that we didn't have a database name defined at all, which
is added as part of this commit.
|
|
demo: Update Cloud SQL database to PostgreSQL 12
See merge request gitlab-org/gitaly!4074
|
|
This commit will cause PostgreSQL to log any database statements taking
longer than 500 ms. This is useful to flag any slow queries running in
Praefect.
|
|
We are shipping and using PostgreSQL 12 on GitLab.com, so let's use this
version to match production.
|
|
The backwards_compatibility CI job runs the tests with the latest
migrations against the test suite from the previous version. While
testing the backwards compatibility of the migrations is important,
the current approach has problems that make it unfeasible in practice.
1. The CI job figures out the previous version by reading the VERSION
file from Gitaly's repo. This version file appears to be unupdated
in Gitaly's master branch. As it is, it points to 14.3.0-rc2 when
in fact it should point to some release candidate of 14.4. This means
the tests are not against the latest release but the one before that.
2. Some assertions the tests make fail with newer migrations even if
everything would be fine with a real setup. Some tests haven't setup
their test state to fully match the production setup, and thus fail
when the newer migrations are applied. In the same fashion, some tests
fail with newer migrations even if they've setup their test state
correctly. They may however make assertions on how the state looks like
after running some code, yet the new migrations cause the state to look
different from the asserted state. As an example, Praefect's deletion
handling was changed in gitlab-org/gitaly!3906. The new code removes all
records of a repository when the repository is deleted. The old code
testing deletion fails as it still expects the records to be in place.
The new code is backwards compatible though and would process deletion
jobs scheduled by the old code.
While testing for backwards compatibility is important, this commit removes
the job as it's not clear how to solve the problems. This unblocks further
fixes from being merged.
|
|
Git v2.31.1 has been released on October 12th with a big range of fixes.
Most relevant to us, it also contains fixes for git-update-ref(1) to
properly flush output when having a conversation with `--stdin`, which
we're currently carrying a custom patch for.
Bump our Git version to v2.31.1 and drop the custom patch. Furthermore,
adjust `FlushesUpdaterefStatus()` to reflect that v2.31.1 now support
flushing.
|
|
We use the test-boot script to determine whether the Gitaly server
successfully boots up with a given Gitaly configuration. This config
currently not contain the path to the Git binary though, which is why it
will instead pick up the system-provided Git binary. This wouldn't
typically be the correct version though, given that we'd likely want to
boot Gitaly with the Git executable produced by `make git`. The result
is that Gitaly would refuse to boot if the system-provided version of
Git does not meet the minimum required version.
Fix the issue by setting the Git binary path.
|
|
With Go 1.16, the ioutil package was deprecated. Replace our usage of
`ioutil.ReadFile()` with `os.ReadFile()` to adapt accordingly.
|
|
Prior to these patches, Git used unbuffered writes for upload-pack ref
advertisements, with one or two write syscalls per ref. In
repositories with many refs that adds up to a lot of syscalls. These
patches add stdio buffering for the ref advertisement writes.
Also see gitlab-com/gl-infra/scalability#1257.
Changelog: performance
|
|
When executing git-update-ref(1) with the `--stdin` flag, then the user
can queue updates and, since e48cf33b61 (update-ref: implement
interactive transaction handling, 2020-04-02), interactively drive the
transaction's state via a set of transactional verbs. This interactivity
is somewhat broken though: while the caller can use these verbs to drive
the transaction's state, the status messages which confirm that a verb
has been processed is not flushed. The caller may thus be left hanging
waiting for the acknowledgement.
This bug keeps us from using those status updates in the updateref
package, which effectively means that we cannot be sure whether the
state transition was successful until we try to commit changes. Add the
baseline for fixing this bug by applying the patch to our Git version.
|
|
For quite some time we're aware of the fact that mirror-fetches into
repositories with many refs are exceedingly slow. Most importantly, this
issue poses problems for our replication strategy where replication jobs
take so much time that replication targets are likely to be out of date
immediately after they have received a replication jobs because the
primary node has received additional mutators while the replication
target was fetching changes.
To address this problem, we have upstreamed a patch series into git.git
which speeds fetches up somewhat. Most importantly, this patch series
optimizes the way git-fetch(1) enumerates refs by making better use of
the commit-graph. The result is that mirror-fetches in the benchmarking
repository gitlab-org/gitlab have been sped up from originally 56s to
25s.
While it is unlikely that this speedup alone will fix our replication
issue, it is definitely an important step towards improving the
situation.
Changelog: performance
|
|
When a user pushes commits into a Git repository, then Git will do
a connectivity check to see whether it has all objects needed to satisfy
the ref updates. This check can be quite expensive depending on how many
refs the target repo has given that it is implemented as `git rev-list
--not --all`, which loads all objects pointed to by preexisting refs. In
repositories like gitlab-org/gitlab with about 2.3M refs, this typically
takes about 8 seconds. But the worst is that the user typically doesn't
even know what's going on given that there is no progress bar being
displayed.
We have thus upstreamed a set of patches which speed up the connectivity
checks. There are two major performance optimizations:
1. We stop sorting inputs in the connectivity check, which gives a
30% speedup.
2. Instead of loading referenced objects via the object database, we
use the commit-graph. This is another 30% speedup.
In total, this improves the time required from nearly 8 to 3 seconds.
Patches have been merged to upstream's "next" branch and are thus likely
going to be part of the next release. Given that this is still a few
months out, this commit backports those patches such that we can reap
the benefits earlier.
Changelog: performance
|
|
Recently, we've hit such a large amount of references in one of our
repos that `FetchSourceBranch()` always times out. The root cause of
this is that Git has to load all commits of the source repo in order to
negotiate refs which are in common with the remote reposiory when we
execute git-fetch(1). This is currently always hitting the object
database up to the point where git-fetch(1) is heavily dominated by
decompressing and parsing objects from disk.
To fix this, I'm currently upstreaming a patch to git.git which
optimizes the lookup to make use of the commit-graph: if a commit is
part of the commit-graph, then we don't need to hit the object database
but can instead parse it via this graph, which is a lot more efficient.
Benchmarks in the repo which is creating the problems for us has shown
that this brings down the time to fetch from 44 seconds to 19 seconds,
which is sufficient to unblock `FetchSourceBranch()` again.
While our process states that we don't apply any patches to Git before
they have hit "next", the problem is significant enough to make an
exception here given that community contributors cannot create merge
requests against gitlab-org/gitlab anymore. Furthermore, the patch
itself is simple enough and has only received positive feedback from
maintainers so far [1].
Apply the patch to Git to speed up git-fetch(1) in such repos with many
refs and unblock community contributors again.
[1]: https://public-inbox.org/git/08519b8ab6f395cffbcd5e530bfba6aaf64241a2.1628085347.git.ps@pks.im/
Changelog: performance
|
|
Commit 8d4884c01 (Makefile: Upgrade Git to v2.32.0, 2021-06-09) removed
usage of patch pack-bitmap-avoid-traversal-of-uninteresting-tag.patch
in the build, but forgot to remove the patch itself.
Let's do this now.
|
|
To have less places where the version of the gitaly module
needs to be changed, the module path is defined on the fly
for 'notice' task.
Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3177
|
|
The new "v14" version of the Gitaly module is named to match
the next GitLab release. The module versioning is needed in
order to pull gitaly as a dependency in other projects. The
change updates all imports to include v14 version. The go.mod
file was modified as well after go mod tidy execution. And
the changes in dependency licenses are reflected in the NOTICE
file.
Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3177
|
|
The path re-writer is the go script to re-write imports
in the go source code files, proto files and go.mod file.
The script accepts path to the project dir where go.mod file
locates, current module version and desired module version.
Upgrading a module requires re-generating the gRPC stubs
from proto file that is why the code of the path re-writer
script is imported in a new 'upgrade-module' task which covers
that need.
Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3177
|
|
Similar to https://gitlab.com/gitlab-org/gitlab/-/merge_requests/62012,
this updates Gitaly's CI tooling for the new changelog workflow.
|
|
Signed-off-by: Takuya Noguchi <takninnovationresearch@gmail.com>
|
|
We're about to drop installing Gitaly binaries into the source
directory, where current users should all use binaries from the build
directory instead. Our test-boot script is still using the old directory
though, so let's switch it over.
|
|
For the test repositories in _build/testrepos a fixed list of refs is
defined. This list will ensure tests remain to succeed, even if upstream
new refs are pushed.
This change adds two more refs which will be used in rebase tests in
future commits.
|
|
With commit eccda331e (blob: Enable use of bitmap indices when searching
LFS pointers, 2021-03-11), we have added a feature flag which allows the
use of bitmap indices to accelerate searching for LFS pointers. As it
turns out, our usage of git-rev-list(1) triggered a pathological edge
case when adding negative tags to its invocation when bitmap indices are
in use. This issue was fixed upstream with commit 540cdc11ad
(pack-bitmap: avoid traversal of objects referenced by uninteresting
tag, 2021-03-22), which has recently been merged to git's `next` branch.
Add the patch such that we can reenable the use of bitmap indices and
add more tests to verify that the since then newly added ListLFSPointers
and ListAllLFSPointers RPCs behave well with bitmap indices.
|
|
_support/test-boot checks that 'gitaly -version' reports the correct
version denoted by the VERSION file. If the versions mismatch, the
error message currently just says 'gitaly -version failed'. This is
not very helpful so this commit improves the situation by actually
printing out the expected and actual content.
One potential source of failure is when a merge request is made from
a fork that does not contain the latest tags, causing `git describe`
to produce different output. This commit also adds a message about this
to help community contributors working from forks.
|
|
This commit removes a few files that were still in the tree, but gone
unused. Changelog generation and updating downstream dependencies is no
longer our teams concern, but done by the Delivery team.
|
|
With multistage deployments we have both old and a new versions
of the praefect service running and sharing the same database.
We must be sure the changes done to the database schema doesn't
affect the old version of the praefect functionality and it
operates without issues.
The new job was added to the pipeline to verify compatibility of
the previous release and latest changes done to schema migration.
The pipeline runs only if a new migration was added/modified.
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3338
|
|
The Docker images are generated for developers, but none of the Gitaly
developers use it. In fact, the images were broken for a while. With
this change the Gitaly project will no longer create our own docker
images, but we can still use the CNG images.
|
|
With introduction of database notifications listener we need
to configure direct connection between praefect and postgres
as PgBouncer can't serve LISTEN operation with transaction
scoped configuration.
The change adds praefect instances IPs to the list of allowed IPs
of the Postgres database. So each praefect is allowed to connect
directly to it.
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3354
|
|
Now that we have upgraded to Ruby 2.7.2, we can just use bundler v2 and
drop the use of bundler v1.17.3.
|
|
gitlab-shell: Remove vendored files
See merge request gitlab-org/gitaly!2618
|
|
GitLab-Shell used to implement the Git hooks that performed
authorization and authentication. These were ported to Go, and since
61dcdf969231954c06f16a6222a7540460f4b4f0 these are merged, meaning that
the code this change removes isn't ever called. As such, we should
remove it.
Removing these files required updates to the Makefile as some make
targets are no longer needed as the gitlab-shell tests aren't executed
anymore.
|
|
Part of https://gitlab.com/groups/gitlab-org/-/epics/2380
|
|
The terraform install uses `bin/check` from the vendorred `gitlab-shell`
code to check the hooks are configured correctly. By changing this to
`gitaly-hooks` we can prepare for a later situation where `gitlab-shell`
is removed and as such not available.
Currently the `gitaly-hooks check` command executes `gitlab-shell` under
the hood, so the results are identical.
|
|
After having deployed the cluster, the first connection to deployed
machines is currently going to fail because of unknown host keys. Let's
improve this situation by scanning deployed hosts and adding their keys
to the known_hosts file automatically.
|
|
With cluster deployment now being done via Ansible, let's change the
README to reflect this change.
|
|
Right now, creation of the demo cluster is performed via a set of Ruby
scripts. Let's convert them to use Ansible, too, so tasks become
idempotent and we have less of a mixture between Ansible and custom Ruby
logic.
|