Age | Commit message (Collapse) | Author |
|
Changelog: performance
|
|
Given that our Praefect database schema is seeded by migrations, there
is no single place which gives an overview over the complete schema.
While this is information that can be easily obtained by connecting to a
Praefect database, this workaround doesn't really work across different
Praefect versions given that one would have to re-seed the database for
each version one is about to investigate. This makes it hard to get info
about the current database schema and to compare changes to the schema
between different versions.
Introduce a new script "generate-praefect-schema" to fix this issue. The
script connects to a Postgres server, creates a new database, seeds the
database by executing `praefect sql-migrate` and then dumps the
resulting schema. Furthermore, a new Makefile target automates this and
writes the dump into our `_support` directory.
The intent is that whenever the database changes, the author of those
changes must update the schema in our `_support` folder such that it's
easily possible to diff the schema across different versions in our Git
history.
To ensure that the schema is always up to date, a new CI job is added
which performs the dump and then checks that no changes occurred.
|
|
The Praefect smoke test job sets up the Postgres service on its own,
even though we have the `postgres_definition` which would already go
most of the way of setting it up correctly.
Convert the test to use the shared Postgres service definition instead
to avoid code duplication. The only thing that needs to change for the
definition is that we didn't have a database name defined at all, which
is added as part of this commit.
|
|
demo: Update Cloud SQL database to PostgreSQL 12
See merge request gitlab-org/gitaly!4074
|
|
This commit will cause PostgreSQL to log any database statements taking
longer than 500 ms. This is useful to flag any slow queries running in
Praefect.
|
|
We are shipping and using PostgreSQL 12 on GitLab.com, so let's use this
version to match production.
|
|
The backwards_compatibility CI job runs the tests with the latest
migrations against the test suite from the previous version. While
testing the backwards compatibility of the migrations is important,
the current approach has problems that make it unfeasible in practice.
1. The CI job figures out the previous version by reading the VERSION
file from Gitaly's repo. This version file appears to be unupdated
in Gitaly's master branch. As it is, it points to 14.3.0-rc2 when
in fact it should point to some release candidate of 14.4. This means
the tests are not against the latest release but the one before that.
2. Some assertions the tests make fail with newer migrations even if
everything would be fine with a real setup. Some tests haven't setup
their test state to fully match the production setup, and thus fail
when the newer migrations are applied. In the same fashion, some tests
fail with newer migrations even if they've setup their test state
correctly. They may however make assertions on how the state looks like
after running some code, yet the new migrations cause the state to look
different from the asserted state. As an example, Praefect's deletion
handling was changed in gitlab-org/gitaly!3906. The new code removes all
records of a repository when the repository is deleted. The old code
testing deletion fails as it still expects the records to be in place.
The new code is backwards compatible though and would process deletion
jobs scheduled by the old code.
While testing for backwards compatibility is important, this commit removes
the job as it's not clear how to solve the problems. This unblocks further
fixes from being merged.
|
|
Git v2.31.1 has been released on October 12th with a big range of fixes.
Most relevant to us, it also contains fixes for git-update-ref(1) to
properly flush output when having a conversation with `--stdin`, which
we're currently carrying a custom patch for.
Bump our Git version to v2.31.1 and drop the custom patch. Furthermore,
adjust `FlushesUpdaterefStatus()` to reflect that v2.31.1 now support
flushing.
|
|
We use the test-boot script to determine whether the Gitaly server
successfully boots up with a given Gitaly configuration. This config
currently not contain the path to the Git binary though, which is why it
will instead pick up the system-provided Git binary. This wouldn't
typically be the correct version though, given that we'd likely want to
boot Gitaly with the Git executable produced by `make git`. The result
is that Gitaly would refuse to boot if the system-provided version of
Git does not meet the minimum required version.
Fix the issue by setting the Git binary path.
|
|
With Go 1.16, the ioutil package was deprecated. Replace our usage of
`ioutil.ReadFile()` with `os.ReadFile()` to adapt accordingly.
|
|
Prior to these patches, Git used unbuffered writes for upload-pack ref
advertisements, with one or two write syscalls per ref. In
repositories with many refs that adds up to a lot of syscalls. These
patches add stdio buffering for the ref advertisement writes.
Also see gitlab-com/gl-infra/scalability#1257.
Changelog: performance
|
|
When executing git-update-ref(1) with the `--stdin` flag, then the user
can queue updates and, since e48cf33b61 (update-ref: implement
interactive transaction handling, 2020-04-02), interactively drive the
transaction's state via a set of transactional verbs. This interactivity
is somewhat broken though: while the caller can use these verbs to drive
the transaction's state, the status messages which confirm that a verb
has been processed is not flushed. The caller may thus be left hanging
waiting for the acknowledgement.
This bug keeps us from using those status updates in the updateref
package, which effectively means that we cannot be sure whether the
state transition was successful until we try to commit changes. Add the
baseline for fixing this bug by applying the patch to our Git version.
|
|
For quite some time we're aware of the fact that mirror-fetches into
repositories with many refs are exceedingly slow. Most importantly, this
issue poses problems for our replication strategy where replication jobs
take so much time that replication targets are likely to be out of date
immediately after they have received a replication jobs because the
primary node has received additional mutators while the replication
target was fetching changes.
To address this problem, we have upstreamed a patch series into git.git
which speeds fetches up somewhat. Most importantly, this patch series
optimizes the way git-fetch(1) enumerates refs by making better use of
the commit-graph. The result is that mirror-fetches in the benchmarking
repository gitlab-org/gitlab have been sped up from originally 56s to
25s.
While it is unlikely that this speedup alone will fix our replication
issue, it is definitely an important step towards improving the
situation.
Changelog: performance
|
|
When a user pushes commits into a Git repository, then Git will do
a connectivity check to see whether it has all objects needed to satisfy
the ref updates. This check can be quite expensive depending on how many
refs the target repo has given that it is implemented as `git rev-list
--not --all`, which loads all objects pointed to by preexisting refs. In
repositories like gitlab-org/gitlab with about 2.3M refs, this typically
takes about 8 seconds. But the worst is that the user typically doesn't
even know what's going on given that there is no progress bar being
displayed.
We have thus upstreamed a set of patches which speed up the connectivity
checks. There are two major performance optimizations:
1. We stop sorting inputs in the connectivity check, which gives a
30% speedup.
2. Instead of loading referenced objects via the object database, we
use the commit-graph. This is another 30% speedup.
In total, this improves the time required from nearly 8 to 3 seconds.
Patches have been merged to upstream's "next" branch and are thus likely
going to be part of the next release. Given that this is still a few
months out, this commit backports those patches such that we can reap
the benefits earlier.
Changelog: performance
|
|
Recently, we've hit such a large amount of references in one of our
repos that `FetchSourceBranch()` always times out. The root cause of
this is that Git has to load all commits of the source repo in order to
negotiate refs which are in common with the remote reposiory when we
execute git-fetch(1). This is currently always hitting the object
database up to the point where git-fetch(1) is heavily dominated by
decompressing and parsing objects from disk.
To fix this, I'm currently upstreaming a patch to git.git which
optimizes the lookup to make use of the commit-graph: if a commit is
part of the commit-graph, then we don't need to hit the object database
but can instead parse it via this graph, which is a lot more efficient.
Benchmarks in the repo which is creating the problems for us has shown
that this brings down the time to fetch from 44 seconds to 19 seconds,
which is sufficient to unblock `FetchSourceBranch()` again.
While our process states that we don't apply any patches to Git before
they have hit "next", the problem is significant enough to make an
exception here given that community contributors cannot create merge
requests against gitlab-org/gitlab anymore. Furthermore, the patch
itself is simple enough and has only received positive feedback from
maintainers so far [1].
Apply the patch to Git to speed up git-fetch(1) in such repos with many
refs and unblock community contributors again.
[1]: https://public-inbox.org/git/08519b8ab6f395cffbcd5e530bfba6aaf64241a2.1628085347.git.ps@pks.im/
Changelog: performance
|
|
Commit 8d4884c01 (Makefile: Upgrade Git to v2.32.0, 2021-06-09) removed
usage of patch pack-bitmap-avoid-traversal-of-uninteresting-tag.patch
in the build, but forgot to remove the patch itself.
Let's do this now.
|
|
To have less places where the version of the gitaly module
needs to be changed, the module path is defined on the fly
for 'notice' task.
Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3177
|
|
The new "v14" version of the Gitaly module is named to match
the next GitLab release. The module versioning is needed in
order to pull gitaly as a dependency in other projects. The
change updates all imports to include v14 version. The go.mod
file was modified as well after go mod tidy execution. And
the changes in dependency licenses are reflected in the NOTICE
file.
Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3177
|
|
The path re-writer is the go script to re-write imports
in the go source code files, proto files and go.mod file.
The script accepts path to the project dir where go.mod file
locates, current module version and desired module version.
Upgrading a module requires re-generating the gRPC stubs
from proto file that is why the code of the path re-writer
script is imported in a new 'upgrade-module' task which covers
that need.
Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3177
|
|
Similar to https://gitlab.com/gitlab-org/gitlab/-/merge_requests/62012,
this updates Gitaly's CI tooling for the new changelog workflow.
|
|
Signed-off-by: Takuya Noguchi <takninnovationresearch@gmail.com>
|
|
We're about to drop installing Gitaly binaries into the source
directory, where current users should all use binaries from the build
directory instead. Our test-boot script is still using the old directory
though, so let's switch it over.
|
|
For the test repositories in _build/testrepos a fixed list of refs is
defined. This list will ensure tests remain to succeed, even if upstream
new refs are pushed.
This change adds two more refs which will be used in rebase tests in
future commits.
|
|
With commit eccda331e (blob: Enable use of bitmap indices when searching
LFS pointers, 2021-03-11), we have added a feature flag which allows the
use of bitmap indices to accelerate searching for LFS pointers. As it
turns out, our usage of git-rev-list(1) triggered a pathological edge
case when adding negative tags to its invocation when bitmap indices are
in use. This issue was fixed upstream with commit 540cdc11ad
(pack-bitmap: avoid traversal of objects referenced by uninteresting
tag, 2021-03-22), which has recently been merged to git's `next` branch.
Add the patch such that we can reenable the use of bitmap indices and
add more tests to verify that the since then newly added ListLFSPointers
and ListAllLFSPointers RPCs behave well with bitmap indices.
|
|
_support/test-boot checks that 'gitaly -version' reports the correct
version denoted by the VERSION file. If the versions mismatch, the
error message currently just says 'gitaly -version failed'. This is
not very helpful so this commit improves the situation by actually
printing out the expected and actual content.
One potential source of failure is when a merge request is made from
a fork that does not contain the latest tags, causing `git describe`
to produce different output. This commit also adds a message about this
to help community contributors working from forks.
|
|
This commit removes a few files that were still in the tree, but gone
unused. Changelog generation and updating downstream dependencies is no
longer our teams concern, but done by the Delivery team.
|
|
With multistage deployments we have both old and a new versions
of the praefect service running and sharing the same database.
We must be sure the changes done to the database schema doesn't
affect the old version of the praefect functionality and it
operates without issues.
The new job was added to the pipeline to verify compatibility of
the previous release and latest changes done to schema migration.
The pipeline runs only if a new migration was added/modified.
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3338
|
|
The Docker images are generated for developers, but none of the Gitaly
developers use it. In fact, the images were broken for a while. With
this change the Gitaly project will no longer create our own docker
images, but we can still use the CNG images.
|
|
With introduction of database notifications listener we need
to configure direct connection between praefect and postgres
as PgBouncer can't serve LISTEN operation with transaction
scoped configuration.
The change adds praefect instances IPs to the list of allowed IPs
of the Postgres database. So each praefect is allowed to connect
directly to it.
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3354
|
|
Now that we have upgraded to Ruby 2.7.2, we can just use bundler v2 and
drop the use of bundler v1.17.3.
|
|
gitlab-shell: Remove vendored files
See merge request gitlab-org/gitaly!2618
|
|
GitLab-Shell used to implement the Git hooks that performed
authorization and authentication. These were ported to Go, and since
61dcdf969231954c06f16a6222a7540460f4b4f0 these are merged, meaning that
the code this change removes isn't ever called. As such, we should
remove it.
Removing these files required updates to the Makefile as some make
targets are no longer needed as the gitlab-shell tests aren't executed
anymore.
|
|
Part of https://gitlab.com/groups/gitlab-org/-/epics/2380
|
|
The terraform install uses `bin/check` from the vendorred `gitlab-shell`
code to check the hooks are configured correctly. By changing this to
`gitaly-hooks` we can prepare for a later situation where `gitlab-shell`
is removed and as such not available.
Currently the `gitaly-hooks check` command executes `gitlab-shell` under
the hood, so the results are identical.
|
|
After having deployed the cluster, the first connection to deployed
machines is currently going to fail because of unknown host keys. Let's
improve this situation by scanning deployed hosts and adding their keys
to the known_hosts file automatically.
|
|
With cluster deployment now being done via Ansible, let's change the
README to reflect this change.
|
|
Right now, creation of the demo cluster is performed via a set of Ruby
scripts. Let's convert them to use Ansible, too, so tasks become
idempotent and we have less of a mixture between Ansible and custom Ruby
logic.
|
|
We're about to introduce a second playbook for cluster creation, so
let's rename the generic "playbook.yml" to "configure.yml".
|
|
Both `destroy-demo-cluster` and `print-info` are simple wrappers around
terraform. Using Ruby for that is overkill.
Let's convert both scripts to shell onliners.
|
|
Setup the omnibus gitaly as a storage "internal"
|
|
With PATH now containing a broken version of Git by design, we need to
adjust all callers of Git to use the real executable instead of the
broken one. The script `check-mod-tidy` is one of those which needs
adjusting.
Given it's such a tiny script, let's just inline it into our Makefile
and have it use the `${GIT}` variable.
|
|
Starting with commit 36437f18 (PgBouncer deployment with terraform,
2020-07-29), we've introduced a new dependency on the pgbouncer module
for Terraform. To download the module, the commit in question added a
call to `terraform get`, which causes us to download the module. But as
we're only executing `terraform init` in case no `.terraform` directory
exists and `terraform get` will unconditionally create that directory,
the result is that `terraform init` will never be executed.
Fix the issue by just removing the call to `terraform get`.
Initialization will fetch the pgbouncer module anyway, so there's no
need to do it explicitly.
|
|
Right now, our Terraform scripts only handle creation of machines in
GCP, but not their respective configuration. As it's a task we're doing
rather frequently which takes some time, this commit creates an Ansible
playbook which automates this task.
On creation of the cluster, our scripts now automatically generate a
`hosts.ini` file containing all necessary connection information. With
this file, the admin may now run `./configure-demo-cluster`, which will
invoke Ansible and automatically generate and apply configuration for
each of the nodes, restarting services as required. The task can be run
repeatedly in order to update configuration on the target nodes.
|
|
PgBouncer deployment with terraform
Closes #2975
See merge request gitlab-org/gitaly!2418
|
|
Resulting name of the IP for the SQL connection renamed
to praefect_pgbouncer_ip.
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/2975
|
|
Assignment of the dedicated IP address to PgBouncer instance
in order to narrow the set of public IP addresses allowed to
connect to PostgreSQL instance.
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/2975
|
|
Terraform requires a new module to be installed: pgbouncer
In order to install all required modules automatically
the `terraform get` command added to main script.
The command installs all missed dependencies recursively.
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/2975
|
|
As the target PostgeSQL version currently is 11 we should
use this version for tests and demo.
|
|
In order to verify usage of PgBouncer in front of Postgres
database PgBouncer included into terraform deployment.
It uses separate machine with internal IP that is accessible
to Praefect instances.
Cloud SQL authorized networks changed to '0.0.0.0/0' because
it is not possible to use PgBouncer IP for it, as PgBouncer
requires IP of Cloud SQL instance in setup (circular dependency).
The output of 'praefect_postgresql_ip' is a private IP of the
PgBouncer instance that should be used instead of a public Cloud
SQL instance to proxy SQL requests.
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/2975
|
|
|