Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitaly.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-01-04git: add upload-pack buffer size patchJacob Vosmaer
Changelog: performance
2021-11-30praefect: Track database schema in GitPatrick Steinhardt
Given that our Praefect database schema is seeded by migrations, there is no single place which gives an overview over the complete schema. While this is information that can be easily obtained by connecting to a Praefect database, this workaround doesn't really work across different Praefect versions given that one would have to re-seed the database for each version one is about to investigate. This makes it hard to get info about the current database schema and to compare changes to the schema between different versions. Introduce a new script "generate-praefect-schema" to fix this issue. The script connects to a Postgres server, creates a new database, seeds the database by executing `praefect sql-migrate` and then dumps the resulting schema. Furthermore, a new Makefile target automates this and writes the dump into our `_support` directory. The intent is that whenever the database changes, the author of those changes must update the schema in our `_support` folder such that it's easily possible to diff the schema across different versions in our Git history. To ensure that the schema is always up to date, a new CI job is added which performs the dump and then checks that no changes occurred.
2021-11-25ci: Use postgres_definition in Praefect smoke test jobPatrick Steinhardt
The Praefect smoke test job sets up the Postgres service on its own, even though we have the `postgres_definition` which would already go most of the way of setting it up correctly. Convert the test to use the shared Postgres service definition instead to avoid code duplication. The only thing that needs to change for the definition is that we didn't have a database name defined at all, which is added as part of this commit.
2021-11-19Merge branch 'sh-update-postgresql-12' into 'master'Sami Hiltunen
demo: Update Cloud SQL database to PostgreSQL 12 See merge request gitlab-org/gitaly!4074
2021-11-17demo: Activate PostgreSQL slow logsStan Hu
This commit will cause PostgreSQL to log any database statements taking longer than 500 ms. This is useful to flag any slow queries running in Praefect.
2021-11-15demo: Update Cloud SQL database to PostgreSQL 12Stan Hu
We are shipping and using PostgreSQL 12 on GitLab.com, so let's use this version to match production.
2021-10-19Remove backwards_compatibility CI jobSami Hiltunen
The backwards_compatibility CI job runs the tests with the latest migrations against the test suite from the previous version. While testing the backwards compatibility of the migrations is important, the current approach has problems that make it unfeasible in practice. 1. The CI job figures out the previous version by reading the VERSION file from Gitaly's repo. This version file appears to be unupdated in Gitaly's master branch. As it is, it points to 14.3.0-rc2 when in fact it should point to some release candidate of 14.4. This means the tests are not against the latest release but the one before that. 2. Some assertions the tests make fail with newer migrations even if everything would be fine with a real setup. Some tests haven't setup their test state to fully match the production setup, and thus fail when the newer migrations are applied. In the same fashion, some tests fail with newer migrations even if they've setup their test state correctly. They may however make assertions on how the state looks like after running some code, yet the new migrations cause the state to look different from the asserted state. As an example, Praefect's deletion handling was changed in gitlab-org/gitaly!3906. The new code removes all records of a repository when the repository is deleted. The old code testing deletion fails as it still expects the records to be in place. The new code is backwards compatible though and would process deletion jobs scheduled by the old code. While testing for backwards compatibility is important, this commit removes the job as it's not clear how to solve the problems. This unblocks further fixes from being merged.
2021-10-13git: Update to v2.31.1Patrick Steinhardt
Git v2.31.1 has been released on October 12th with a big range of fixes. Most relevant to us, it also contains fixes for git-update-ref(1) to properly flush output when having a conversation with `--stdin`, which we're currently carrying a custom patch for. Bump our Git version to v2.31.1 and drop the custom patch. Furthermore, adjust `FlushesUpdaterefStatus()` to reflect that v2.31.1 now support flushing.
2021-10-11test-boot: Set up Git binary pathPatrick Steinhardt
We use the test-boot script to determine whether the Gitaly server successfully boots up with a given Gitaly configuration. This config currently not contain the path to the Git binary though, which is why it will instead pick up the system-provided Git binary. This wouldn't typically be the correct version though, given that we'd likely want to boot Gitaly with the Git executable produced by `make git`. The result is that Gitaly would refuse to boot if the system-provided version of Git does not meet the minimum required version. Fix the issue by setting the Git binary path.
2021-09-20global: Replace deprecated usage of `ioutil.ReadFile()`Patrick Steinhardt
With Go 1.16, the ioutil package was deprecated. Replace our usage of `ioutil.ReadFile()` with `os.ReadFile()` to adapt accordingly.
2021-09-15Makefile: apply Git ref advertisement buffering patchesJacob Vosmaer
Prior to these patches, Git used unbuffered writes for upload-pack ref advertisements, with one or two write syscalls per ref. In repositories with many refs that adds up to a lot of syscalls. These patches add stdio buffering for the ref advertisement writes. Also see gitlab-com/gl-infra/scalability#1257. Changelog: performance
2021-09-14Makefile: Fix git-update-ref(1) not flushing status updatesPatrick Steinhardt
When executing git-update-ref(1) with the `--stdin` flag, then the user can queue updates and, since e48cf33b61 (update-ref: implement interactive transaction handling, 2020-04-02), interactively drive the transaction's state via a set of transactional verbs. This interactivity is somewhat broken though: while the caller can use these verbs to drive the transaction's state, the status messages which confirm that a verb has been processed is not flushed. The caller may thus be left hanging waiting for the acknowledgement. This bug keeps us from using those status updates in the updateref package, which effectively means that we cannot be sure whether the state transition was successful until we try to commit changes. Add the baseline for fixing this bug by applying the patch to our Git version.
2021-09-09Makefile: Apply Git patches to speed up fetchesPatrick Steinhardt
For quite some time we're aware of the fact that mirror-fetches into repositories with many refs are exceedingly slow. Most importantly, this issue poses problems for our replication strategy where replication jobs take so much time that replication targets are likely to be out of date immediately after they have received a replication jobs because the primary node has received additional mutators while the replication target was fetching changes. To address this problem, we have upstreamed a patch series into git.git which speeds fetches up somewhat. Most importantly, this patch series optimizes the way git-fetch(1) enumerates refs by making better use of the commit-graph. The result is that mirror-fetches in the benchmarking repository gitlab-org/gitlab have been sped up from originally 56s to 25s. While it is unlikely that this speedup alone will fix our replication issue, it is definitely an important step towards improving the situation. Changelog: performance
2021-08-30git: Speed up connectivity checksPatrick Steinhardt
When a user pushes commits into a Git repository, then Git will do a connectivity check to see whether it has all objects needed to satisfy the ref updates. This check can be quite expensive depending on how many refs the target repo has given that it is implemented as `git rev-list --not --all`, which loads all objects pointed to by preexisting refs. In repositories like gitlab-org/gitlab with about 2.3M refs, this typically takes about 8 seconds. But the worst is that the user typically doesn't even know what's going on given that there is no progress bar being displayed. We have thus upstreamed a set of patches which speed up the connectivity checks. There are two major performance optimizations: 1. We stop sorting inputs in the connectivity check, which gives a 30% speedup. 2. Instead of loading referenced objects via the object database, we use the commit-graph. This is another 30% speedup. In total, this improves the time required from nearly 8 to 3 seconds. Patches have been merged to upstream's "next" branch and are thus likely going to be part of the next release. Given that this is still a few months out, this commit backports those patches such that we can reap the benefits earlier. Changelog: performance
2021-08-05git: Speed up fetches in repos with many refsPatrick Steinhardt
Recently, we've hit such a large amount of references in one of our repos that `FetchSourceBranch()` always times out. The root cause of this is that Git has to load all commits of the source repo in order to negotiate refs which are in common with the remote reposiory when we execute git-fetch(1). This is currently always hitting the object database up to the point where git-fetch(1) is heavily dominated by decompressing and parsing objects from disk. To fix this, I'm currently upstreaming a patch to git.git which optimizes the lookup to make use of the commit-graph: if a commit is part of the commit-graph, then we don't need to hit the object database but can instead parse it via this graph, which is a lot more efficient. Benchmarks in the repo which is creating the problems for us has shown that this brings down the time to fetch from 44 seconds to 19 seconds, which is sufficient to unblock `FetchSourceBranch()` again. While our process states that we don't apply any patches to Git before they have hit "next", the problem is significant enough to make an exception here given that community contributors cannot create merge requests against gitlab-org/gitlab anymore. Furthermore, the patch itself is simple enough and has only received positive feedback from maintainers so far [1]. Apply the patch to Git to speed up git-fetch(1) in such repos with many refs and unblock community contributors again. [1]: https://public-inbox.org/git/08519b8ab6f395cffbcd5e530bfba6aaf64241a2.1628085347.git.ps@pks.im/ Changelog: performance
2021-06-10git-patches: Remove leftover patchChristian Couder
Commit 8d4884c01 (Makefile: Upgrade Git to v2.32.0, 2021-06-09) removed usage of patch pack-bitmap-avoid-traversal-of-uninteresting-tag.patch in the build, but forgot to remove the patch itself. Let's do this now.
2021-05-27Dynamically define module path to exclude from licence scanPavlo Strokov
To have less places where the version of the gitaly module needs to be changed, the module path is defined on the fly for 'notice' task. Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3177
2021-05-27Create module v14 gitaly versionPavlo Strokov
The new "v14" version of the Gitaly module is named to match the next GitLab release. The module versioning is needed in order to pull gitaly as a dependency in other projects. The change updates all imports to include v14 version. The go.mod file was modified as well after go mod tidy execution. And the changes in dependency licenses are reflected in the NOTICE file. Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3177
2021-05-27Import path re-writerPavlo Strokov
The path re-writer is the go script to re-write imports in the go source code files, proto files and go.mod file. The script accepts path to the project dir where go.mod file locates, current module version and desired module version. Upgrading a module requires re-generating the gRPC stubs from proto file that is why the code of the path re-writer script is imported in a new 'upgrade-module' task which covers that need. Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3177
2021-05-21Update tooling for the new changelog workflowYorick Peterse
Similar to https://gitlab.com/gitlab-org/gitlab/-/merge_requests/62012, this updates Gitaly's CI tooling for the new changelog workflow.
2021-05-12Use text/template to generate NOTICETakuya Noguchi
Signed-off-by: Takuya Noguchi <takninnovationresearch@gmail.com>
2021-05-07test-boot: Switch to use Gitaly from build directoryPatrick Steinhardt
We're about to drop installing Gitaly binaries into the source directory, where current users should all use binaries from the build directory instead. Our test-boot script is still using the old directory though, so let's switch it over.
2021-04-27gitlab-test: Add a few more refsToon Claes
For the test repositories in _build/testrepos a fixed list of refs is defined. This list will ensure tests remain to succeed, even if upstream new refs are pushed. This change adds two more refs which will be used in rebase tests in future commits.
2021-04-20Makefile: Add git patch to fix pathological perf with bitmap indicesPatrick Steinhardt
With commit eccda331e (blob: Enable use of bitmap indices when searching LFS pointers, 2021-03-11), we have added a feature flag which allows the use of bitmap indices to accelerate searching for LFS pointers. As it turns out, our usage of git-rev-list(1) triggered a pathological edge case when adding negative tags to its invocation when bitmap indices are in use. This issue was fixed upstream with commit 540cdc11ad (pack-bitmap: avoid traversal of objects referenced by uninteresting tag, 2021-03-22), which has recently been merged to git's `next` branch. Add the patch such that we can reenable the use of bitmap indices and add more tests to verify that the since then newly added ListLFSPointers and ListAllLFSPointers RPCs behave well with bitmap indices.
2021-04-14improve version check error messageSami Hiltunen
_support/test-boot checks that 'gitaly -version' reports the correct version denoted by the VERSION file. If the versions mismatch, the error message currently just says 'gitaly -version failed'. This is not very helpful so this commit improves the situation by actually printing out the expected and actual content. One potential source of failure is when a merge request is made from a fork that does not contain the latest tags, causing `git describe` to produce different output. This commit also adds a message about this to help community contributors working from forks.
2021-02-22_support: Remove unused filesZeger-Jan van de Weg
This commit removes a few files that were still in the tree, but gone unused. Changelog generation and updating downstream dependencies is no longer our teams concern, but done by the Delivery team.
2021-02-03Integration test to verify backwards compatibilityPavlo Strokov
With multistage deployments we have both old and a new versions of the praefect service running and sharing the same database. We must be sure the changes done to the database schema doesn't affect the old version of the praefect functionality and it operates without issues. The new job was added to the pipeline to verify compatibility of the previous release and latest changes done to schema migration. The pipeline runs only if a new migration was added/modified. Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3338
2021-01-22docker: Remove image generationZeger-Jan van de Weg
The Docker images are generated for developers, but none of the Gitaly developers use it. In fact, the images were broken for a while. With this change the Gitaly project will no longer create our own docker images, but we can still use the CNG images.
2020-12-30Demo: configuration of direct connection to PostgresPavlo Strokov
With introduction of database notifications listener we need to configure direct connection between praefect and postgres as PgBouncer can't serve LISTEN operation with transaction scoped configuration. The change adds praefect instances IPs to the list of allowed IPs of the Postgres database. So each praefect is allowed to connect directly to it. Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3354
2020-10-30Update to Ruby bundler v2.1.4Stan Hu
Now that we have upgraded to Ruby 2.7.2, we can just use bundler v2 and drop the use of bundler v1.17.3.
2020-10-23Merge branch 'zj-remove-gitlab-shell' into 'master'Pavlo Strokov
gitlab-shell: Remove vendored files See merge request gitlab-org/gitaly!2618
2020-10-23gitlab-shell: Remove vendored filesZeger-Jan van de Weg
GitLab-Shell used to implement the Git hooks that performed authorization and authentication. These were ported to Go, and since 61dcdf969231954c06f16a6222a7540460f4b4f0 these are merged, meaning that the code this change removes isn't ever called. As such, we should remove it. Removing these files required updates to the Makefile as some make targets are no longer needed as the gitlab-shell tests aren't executed anymore.
2020-10-23Update Ruby to v2.7.2Stan Hu
Part of https://gitlab.com/groups/gitlab-org/-/epics/2380
2020-10-01demo: Verify Gitaly install with gitaly-hooksZeger-Jan van de Weg
The terraform install uses `bin/check` from the vendorred `gitlab-shell` code to check the hooks are configured correctly. By changing this to `gitaly-hooks` we can prepare for a later situation where `gitlab-shell` is removed and as such not available. Currently the `gitaly-hooks check` command executes `gitlab-shell` under the hood, so the results are identical.
2020-09-14terraform: Scan and add SSH host keysPatrick Steinhardt
After having deployed the cluster, the first connection to deployed machines is currently going to fail because of unknown host keys. Let's improve this situation by scanning deployed hosts and adding their keys to the known_hosts file automatically.
2020-09-14terraform: Adjust README file to match latest changesPatrick Steinhardt
With cluster deployment now being done via Ansible, let's change the README to reflect this change.
2020-09-14terraform: Convert cluster creation to use AnsiblePatrick Steinhardt
Right now, creation of the demo cluster is performed via a set of Ruby scripts. Let's convert them to use Ansible, too, so tasks become idempotent and we have less of a mixture between Ansible and custom Ruby logic.
2020-09-14terraform: Rename generic "playbook.yml"Patrick Steinhardt
We're about to introduce a second playbook for cluster creation, so let's rename the generic "playbook.yml" to "configure.yml".
2020-09-14terraform: Convert terraform wrappers to use shellPatrick Steinhardt
Both `destroy-demo-cluster` and `print-info` are simple wrappers around terraform. Using Ruby for that is overkill. Let's convert both scripts to shell onliners.
2020-09-04Add omnibus gitaly to terraform setupJames Fargher
Setup the omnibus gitaly as a storage "internal"
2020-08-20Makefile: Inline check-mod-tidyPatrick Steinhardt
With PATH now containing a broken version of Git by design, we need to adjust all callers of Git to use the real executable instead of the broken one. The script `check-mod-tidy` is one of those which needs adjusting. Given it's such a tiny script, let's just inline it into our Makefile and have it use the `${GIT}` variable.
2020-08-05terraform: Fix fetching modulesPatrick Steinhardt
Starting with commit 36437f18 (PgBouncer deployment with terraform, 2020-07-29), we've introduced a new dependency on the pgbouncer module for Terraform. To download the module, the commit in question added a call to `terraform get`, which causes us to download the module. But as we're only executing `terraform init` in case no `.terraform` directory exists and `terraform get` will unconditionally create that directory, the result is that `terraform init` will never be executed. Fix the issue by just removing the call to `terraform get`. Initialization will fetch the pgbouncer module anyway, so there's no need to do it explicitly.
2020-08-05terraform: Create Ansible playbook to roll out configurationPatrick Steinhardt
Right now, our Terraform scripts only handle creation of machines in GCP, but not their respective configuration. As it's a task we're doing rather frequently which takes some time, this commit creates an Ansible playbook which automates this task. On creation of the cluster, our scripts now automatically generate a `hosts.ini` file containing all necessary connection information. With this file, the admin may now run `./configure-demo-cluster`, which will invoke Ansible and automatically generate and apply configuration for each of the nodes, restarting services as required. The task can be run repeatedly in order to update configuration on the target nodes.
2020-07-30Merge branch 'ps-pgbouncer-terraform' into 'master'Paul Okstad
PgBouncer deployment with terraform Closes #2975 See merge request gitlab-org/gitaly!2418
2020-07-30PgBouncer deployment with terraformPavlo Strokov
Resulting name of the IP for the SQL connection renamed to praefect_pgbouncer_ip. Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/2975
2020-07-30PgBouncer deployment with terraformPavlo Strokov
Assignment of the dedicated IP address to PgBouncer instance in order to narrow the set of public IP addresses allowed to connect to PostgreSQL instance. Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/2975
2020-07-29PgBouncer deployment with terraformPavlo Strokov
Terraform requires a new module to be installed: pgbouncer In order to install all required modules automatically the `terraform get` command added to main script. The command installs all missed dependencies recursively. Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/2975
2020-07-28Update of PostgeSQL instance versionPavlo Strokov
As the target PostgeSQL version currently is 11 we should use this version for tests and demo.
2020-07-24PgBouncer deployment with terraformPavlo Strokov
In order to verify usage of PgBouncer in front of Postgres database PgBouncer included into terraform deployment. It uses separate machine with internal IP that is accessible to Praefect instances. Cloud SQL authorized networks changed to '0.0.0.0/0' because it is not possible to use PgBouncer IP for it, as PgBouncer requires IP of Cloud SQL instance in setup (circular dependency). The output of 'praefect_postgresql_ip' is a private IP of the PgBouncer instance that should be used instead of a public Cloud SQL instance to proxy SQL requests. Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/2975
2020-07-22Remove Praefect primary from configPaul Okstad