gitlab.com/gitlab-org/gitaly.git
Age | Commit message | Author
2024-01-17 | fix: Fix a collection of typos found by typos-cli | Xing Xin
Fix typos found by typos-cli (https://github.com/crate-ci/typos). Some affected tests are adjusted. A number of other typos are deliberately left untouched, including:
- CHANGELOG.md
- NOTICE
- internal/.../migrations/20201208163237_cleanup_notifications_payload.go
- other intended typos or false positives
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
2023-08-30 | Merge branch 'pks-git-drop-test-repository' into 'master' | Will Chandler
git: Remove the test repository
See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6273
Merged-by: Will Chandler <wchandler@gitlab.com>
Approved-by: karthik nayak <knayak@gitlab.com>
Approved-by: Will Chandler <wchandler@gitlab.com>
Reviewed-by: Patrick Steinhardt <psteinhardt@gitlab.com>
Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
2023-08-29 | support/benchmarking: Stop using gitlab-shell configuration | Patrick Steinhardt
The benchmarking deployments for Gitaly still use the gitlab-shell configuration. This setting is not required anymore though, and we already configure the GitLab secret explicitly. Remove the section.
2023-08-28 | tests: Drop test repository | Patrick Steinhardt
We have converted all of our tests to generate their test data at runtime. Furthermore, all of our benchmarks use a dedicated benchmarking repository. This means that our test repository is completely unused now. Remove the Makefile target to clone and set up the test repository.
2023-04-24 | git: Unconditionally ignore gitconfig files | Patrick Steinhardt
A while ago we introduced the `ignore_gitconfig` configuration. If set, we override GIT_CONFIG_SYSTEM, GIT_CONFIG_GLOBAL, and XDG_CONFIG_HOME so that Git won't pick up gitconfig files found in any of these scopes. The goal is that we only ever use the Git configuration that is found either in Gitaly's `config.toml` or in the repository-local gitconfig. This toggle has already been enabled unconditionally in all distributions and was scheduled for removal in v16.0. So let's remove that toggle and unconditionally ignore any global- or system-level gitconfig files. Changelog: removed
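The override described above can be sketched as a helper that prepares the environment for a spawned Git process. This is an illustration only, not Gitaly's implementation: the function name `isolated_git_env` is hypothetical, and pointing all three variables at the null device is an assumption about suitable values.

```python
import os


def isolated_git_env(base_env=None):
    """Return a copy of the environment in which Git skips system- and
    global-level gitconfig files (hypothetical helper, values assumed)."""
    env = dict(os.environ if base_env is None else base_env)
    # GIT_CONFIG_SYSTEM and GIT_CONFIG_GLOBAL replace the default system and
    # global config locations; the null device reads as an empty file.
    env["GIT_CONFIG_SYSTEM"] = os.devnull
    env["GIT_CONFIG_GLOBAL"] = os.devnull
    # Assumption: a path that is not a directory keeps Git from finding
    # $XDG_CONFIG_HOME/git/config.
    env["XDG_CONFIG_HOME"] = os.devnull
    return env
```

The returned mapping could be passed as `env=` to `subprocess.run(["git", ...])`, leaving only repository-local configuration and explicitly injected settings in effect.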
2023-04-14 | ci: Build praefect config with envsubst | Toon Claes
Instead of using Ruby, build the praefect config file with `envsubst`.
2023-04-14 | tools: Rewrite test-boot in Go | Toon Claes
Replace the Ruby script _support/test-boot with a small tool written in Go. Issue: https://gitlab.com/gitlab-org/gitaly/-/issues/4636
2023-04-06 | benchmarking: Remove setting up gitaly-ruby | Toon Claes
2023-04-03 | ci: Remove testing with bad proxies | Patrick Steinhardt
One of our CI jobs is testing that Gitaly works correctly when invalid proxies have been configured. This test only checks the `rubyserver` package though, which is about to be removed. The tests are thus not required anymore. Remove the CI jobs and the infrastructure supporting them.
2023-03-02 | Updates benchmarking scripts for testing different disk types | John Jarvis
2023-02-02 | benchmarking: Clone Gitaly using target revision | Will Chandler
Currently we clone Gitaly at HEAD and then peel the requested revision and check that out. This works fine if you're using HEAD as your rev, but a branch name will fail as we haven't created a local branch with that name yet. To resolve this, perform the initial clone using the requested revision before peeling.
2023-02-02 | benchmarking: Fix gitlab-shell build | Will Chandler
With 51ea0f5 (Add support for the gssapi-with-mic auth method, 2023-01-23), gitlab-shell now requires native gssapi libraries to build. Update the ansible task to install the required packages.
2023-02-02 | benchmarking: Remove test-network from terraform | Will Chandler
Network `test-network` prevents multiple benchmarking instances from being created and is unused. Remove it.
2023-02-02 | benchmarking: Update README.md with new options | Will Chandler
Add description of benchmarking output to the README.
2023-02-02 | benchmarking: Use scripts in bench loop | Will Chandler
Now that we have scripts to run benchmarks and profile Gitaly, we can update the `benchmark` role to invoke them. By default we clear the kernel page cache and run the profiling script, but if needed these can be disabled with `./run-benchmarks --extra-vars "profile=false clear_page_cache=false"`. `bench_duration` defaults to be slightly longer than `profile_duration` to ensure that `ghz` is sending traffic for the full time we're profiling. `ghz_wait_duration` controls how long to wait before the `Run ghz` task is considered to have failed. When writing HTML output `ghz` may take 30+ seconds to finish, so a sizeable wait period helps prevent spurious failures without adding delays if it exits sooner. Currently we are using JSON output, which does not add this delay.
2023-02-02 | benchmarking: Add profiling script | Will Chandler
Understanding where Gitaly and Git are spending their time, as well as general system health, is critical to useful benchmarking. Add a script to the Gitaly node to run `perf` and a number of `libbpf-tools` utilities while the node is under load. Running this introduces a performance overhead of ~10%, mostly from `perf`, which is run twice simultaneously: once to profile only Gitaly using `--call-graph=fp`, which works well with Golang, and again for the system as a whole using `--call-graph=dwarf`, which is more accurate for Git and other C programs. The DWARF output is ~10x larger than the frame pointer output, causing flamegraphs built from it to take proportionately longer to generate, typically longer than the duration profiled.
The `libbpf-tools` utilities used are a bit of a grab bag, but quite lightweight to run. These are BPF CO-RE utilities that run much more lightly than `bcc`, which can be a resource hog. They focus primarily on determining the amount of delay block I/O imposes, which may be useful in determining how much of a penalty slower storage imposes on Gitaly. Currently the only RPC being tested is `FindCommit`, which being read-only hits the kernel page cache 100% of the time after the first request.
- biolatency: Histogram of the latency of block I/O operations for each attached disk. https://github.com/iovisor/bcc/blob/master/tools/biolatency_example.txt
- biotop: List of processes performing the most block I/O. https://github.com/iovisor/bcc/blob/master/tools/biotop_example.txt
- execsnoop: List of all processes forked by Gitaly and their arguments. https://github.com/iovisor/bcc/blob/master/tools/execsnoop_example.txt
- cpudist: Histogram of how long programs were scheduled on-CPU by the kernel or, with the `--offcpu` flag, how long they slept. https://github.com/iovisor/bcc/blob/master/tools/cpudist_example.txt
- cachestat: Statistics regarding kernel page cache hit rate. https://github.com/iovisor/bcc/blob/master/tools/cachestat_example.txt
Note that the links above are to the `bcc` documentation for each tool used. The arguments the `bcc` version takes may vary a bit from what `libbpf-tools` allows, but they perform the same task. Further work is needed for this to be fully usable, most notably tracking CPU and memory utilization. This is difficult with polling tools like Prometheus's `node-exporter`, as most of the system load is typically from short-lived Git processes that may spawn and exit between polling intervals.
2023-02-02 | benchmarking: Add benchmarking script for client | Will Chandler
Using `ghz` requires a large number of parameters which would be quite painful to write in YAML. Create a wrapper script for the client host that does some basic parameter verification and then invokes `ghz`. Currently the --concurrency[0] and --rps[1] values used are arbitrary. In the future we should look into configuring these appropriately per RPC. For example, 100 `OptimizeRepository` requests per second is not a useful scenario. [0] https://ghz.sh/docs/options#-c---concurrency [1] https://ghz.sh/docs/options#-r---rps
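A wrapper of the kind described above might validate its parameters and then assemble the `ghz` argument list. The sketch below is not the script from this commit: the function name, the default values, and the example file name are hypothetical. Only `--concurrency` and `--rps` come from the message; `--call`, `--data-file`, and `--insecure` are other `ghz` options assumed to be appropriate here.

```python
def build_ghz_command(call, data_file, host, concurrency=10, rps=100):
    """Validate basic parameters and return the ghz argv as a list.

    Hypothetical helper; the defaults are placeholders, not tuned values.
    """
    if not call or not data_file or not host:
        raise ValueError("call, data_file and host are all required")
    for name, value in (("concurrency", concurrency), ("rps", rps)):
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return [
        "ghz",
        "--insecure",                     # assumed: plaintext gRPC to the Gitaly node
        "--call", call,                   # fully qualified RPC name
        "--data-file", data_file,         # JSON request parameters
        "--concurrency", str(concurrency),
        "--rps", str(rps),
        host,
    ]
```

For example, `build_ghz_command("gitaly.CommitService/FindCommit", "find_commit.json", "127.0.0.1:8075")` returns a list that can be handed to `subprocess.run`. Per-RPC tuning would then amount to passing different `concurrency` and `rps` values for each call.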
2023-02-02 | benchmarking: Add initial FindCommit query | Will Chandler
To run requests via `ghz`, we need to provide a JSON-formatted file with the parameters being passed to the RPC. As an initial example, let's add a `FindCommit` request for `git.git`. Revision `bWFzdGVy` is `master` in base64.
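The base64 detail matters because protobuf `bytes` fields are base64-encoded in JSON. A small sketch of how such a request file could be generated follows; the function name and the `storage_name` and `relative_path` values in the usage note are placeholders, not values taken from this commit.

```python
import base64
import json


def find_commit_request(storage_name, relative_path, revision):
    """Build a JSON-serialisable FindCommit request body (illustrative)."""
    return {
        "repository": {
            "storage_name": storage_name,    # placeholder storage name
            "relative_path": relative_path,  # placeholder repository path
        },
        # The revision is a bytes field, so its JSON form is base64 text.
        "revision": base64.b64encode(revision.encode("utf-8")).decode("ascii"),
    }
```

Calling `json.dumps(find_commit_request("default", "git.git", "master"), indent=2)` yields a document whose `revision` is `bWFzdGVy`, matching the value quoted in the message.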
2023-02-02 | benchmarking: Add benchmarking loop | Will Chandler
We need to loop over each RPC bench and its associated repos, but Ansible's syntax to dynamically reuse tasks is a bit annoying and requires that we split out each section that will be repeated into a separate file and use `include_tasks`. To do this, we invoke `rpc_loop.yml` from `main.yml` for each RPC, and then `bench.yml` for each repo we're testing with the RPC.
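The two-level `include_tasks` structure is equivalent to a nested loop. Written in Python rather than Ansible, and with a hypothetical data layout (a list of RPC entries, each carrying its own list of repos), the iteration order looks like this:

```python
def bench_plan(rpcs):
    """Return (rpc, repo) pairs in the order the benchmarks would run.

    The outer loop plays the role of rpc_loop.yml (once per RPC) and the
    inner loop the role of bench.yml (once per repo for that RPC).
    """
    plan = []
    for rpc in rpcs:
        for repo in rpc["repos"]:
            plan.append((rpc["name"], repo))
    return plan
```

With `[{"name": "FindCommit", "repos": ["git.git", "linux.git"]}]` as input, the plan contains one entry for each of the two repositories, both under `FindCommit`.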
2023-01-23 | Fix Rubocop errors in test-boot | Stan Hu
`.rubocop_todo.yml` is loaded from the ruby directory via `make rubocop`. It's easier just to fix these 3 errors than to try to fix the script.
2023-01-23 | Add Rubocop TODO in _support path | Stan Hu
2023-01-17 | benchmarking: Add a README | Will Chandler
Document the basic steps for using the benchmarking scripts. Changelog: added
2023-01-17 | benchmarking: Build profiling tools | Will Chandler
Output from `ghz` provides raw latency numbers, but understanding where Gitaly is spending its time requires more detail. To provide this, we will profile it with Linux `perf` and `libbpf-tools`. The latter are C versions of the older `bcc` BPF monitoring tools and are extremely lightweight to run in terms of both memory and CPU. `bcc` uses a Python wrapper which can add significant load, particularly if multiple tools are run at once. `perf` output will be converted into flamegraphs.
2023-01-17 | benchmarking: Add systemd service for Gitaly | Will Chandler
Directly running Gitaly via Ansible as an asynchronous task would be painful, as such tasks require a hard deadline and can be a bit flaky. In addition, setting resource limits to the correct values would be a pain. To avoid this, create a systemd service for Gitaly so we can use the `ansible.builtin.systemd` module to control it. Logs can still be easily retrieved using `journalctl`, and we can directly set resource limits to match those used in Omnibus GitLab[0]. [0] https://gitlab.com/gitlab-org/omnibus-gitlab/-/blob/73892599/config/templates/runit/runsvdir-start.erb#L20-37
2023-01-17 | benchmarking: Add gitaly role | Will Chandler
Build Gitaly itself, along with Gitaly-Ruby and gitlab-shell. The versions to use for Go, Ruby, and Gitaly are all parsed on the client node, so that role must be run first. To support installing arbitrary versions of Ruby, we use ruby-build to compile it from source. To ensure we are using the most relevant Git version, we build Gitaly with the bundled Git option enabled. Currently gitlab-shell is built so Gitaly can validate its presence, but is not used.
2023-01-17 | benchmarking: Add client role | Will Chandler
The client node will need to send gRPC traffic to Gitaly. We will use ghz[0] to do this, as it has extensive and well-documented[1] options, and supports streaming RPCs, unlike `k6s` used by the GPT. To send RPCs we need the protobuf definition files, so the Gitaly repo must be cloned on the client as well for reference. To ensure we are using the same commit on all hosts, we peel the requested revision after cloning and persist this as a fact that can be referred to from other hosts. The Go and Ruby versions used by the designated commit are also parsed from `.tool-versions` and saved for use on the Gitaly node. We will also need to copy the `ghz` output over to the Gitaly node for collection with the other benchmark outputs. In preparation for that, create a new SSH key and save its pubkey as another fact so we can trust it on the Gitaly node. [0] https://github.com/bojand/ghz [1] https://ghz.sh
2023-01-17 | benchmarking: Add destroy role | Will Chandler
Add an additional role to tear down a benchmarking instance when it is no longer required. This also removes the destroyed instances' host keys from known hosts as GCP will frequently re-use the same IP address on new nodes. This causes host key verification errors if we don't clean up known hosts.
2023-01-17 | benchmarking: Add role to create benchmark hosts | Will Chandler
Basic benchmarking will require two hosts, a Gitaly instance and a client instance to send traffic from. Gitaly Cluster is beyond the scope of this initial effort. Create a terraform job that creates both hosts, with port 8075 open on the Gitaly node for traffic. We use a `t2d` instance for Gitaly as these provide 4 physical cores, as opposed to 2 hyperthreaded cores. In theory this could reduce performance jitter, though I have not measured this to be sure. A disk image containing the test repositories is attached to the Gitaly node on creation. These repositories are:
- git.git: A smaller repository with a fair amount of history.
- gitlab.git: Uses an object pool and has ~6,000,000 refs.
- linux.git: A large and well-groomed repository.
- homebrew-core.git: Has very large trees.
- chromium.git: Extremely large (40 GiB), with ~4,000,000 refs.
This task borrows its structure from the old Gitaly Cluster demo script in `_support/terraform`.
2023-01-17 | benchmarking: Add Ansible config | Will Chandler
Ansible can run slowly when performing a large number of operations, and determining which tasks are slow is difficult with the default output. Mitigate these issues by enabling pipelining[0], which speeds things up dramatically and is compatible with the Ubuntu 22.04 hosts we're using, and the `profile_tasks` callback[1], which prints the start time of each task during execution and a summary of task times on completion. [0] https://docs.ansible.com/ansible/latest/reference_appendices/config.html#ansible-pipelining [1] https://docs.ansible.com/ansible/latest/collections/ansible/posix/profile_tasks_callback.html
2022-12-14 | tests: Remove unneeded seed repositories | Patrick Steinhardt
Remove infrastructure to clone the "gitlab-test-mirror.git" and "gitlab-git-test.git" seed repositories. They are not used anymore.
2022-11-11 | Makefile: Use Gitaly's tagged Git versions instead of ad-hoc patching | Patrick Steinhardt
Right now, the way we apply Git patches is by adding them to the Gitaly project and using git-apply(1) to apply them ad-hoc. This is starting to show its limits though:
- It is hard for us to provide a simple pointer to the Git sources that we distribute to the customer.
- It is hard to execute tests for the patched Git version in an automated fashion.
- It is hard to work on top of the already-patched Git distribution to, for example, apply more patches.
These limitations are getting more noticeable now that we have split up the Gitaly team into two teams, where we potentially want to backport patches more aggressively. With the split we have now made the repository at [1] the canonical Git repository for all our efforts. This opens up a much better way to use custom Git versions: instead of hosting the patches in the Gitaly repository, we start to tag Gitaly-specific releases in that repository. These releases then carry all the additional patches on top. This makes it trivial to use normal workflows:
- We can point customers to the Gitaly-specific tags which hold our patches on top.
- We can automatically trigger CI pipelines on top of patched Gitaly-specific releases by just pushing branches or tags.
- You can just clone the repository and check out the specific tags.
Drop the infrastructure to patch releases in-place in favor of this new architecture. [1]: https://gitlab.com/gitlab-org/git.git
2022-10-28 | ruby: Move scripts that generate Proto sources into tools directory | Patrick Steinhardt
We've got multiple scripts that are required to generate Ruby code from our Protobuf definitions in the `_support` directory. This has multiple smells:
- It's out of line with all the other tools, which nowadays are located in the `tools` directory.
- It's hard to discover and find out which parts logically form a unit.
- We are reusing the Gemfile of the Ruby sidecar to pin the `grpc-tools` dependency to a specific version.
Move the tooling into its own `tools/protogem` directory that's got its own Gemfile to fix these points. This also allows us to auto-update dependencies via the Renovate bot like we do for our other tools.
2022-10-21 | Makefile: Remove bundled git v2.35.0 | Karthik Nayak
In commit cc04215eb we removed the flag to enable git v2.37.0, making it the default git version. Now we can remove the older git version v2.35.0. Remove it from the Makefile, which means it will no longer be bundled with Gitaly. Also remove the patches added for git v2.35.0, which are no longer required.
2022-08-10 | test-boot: Switch Git path to use binary wrappers | Patrick Steinhardt
We're about to stop installing Git into our current default location. Instead, tests are supposed to use the binary wrappers provided by the Git project so that we don't have to install it in the first place. Adapt the test-boot script to use them.
2022-08-01 | cli: Update `gitaly-hooks check` references [jt-move-gitaly-hooks-check] | Justin Tobler
The `check` subcommand has been relocated from `gitaly-hooks` to the main `gitaly` binary. References to the subcommand were updated to reflect this change.
2022-07-13 | Makefile: Update Git to v2.37.1 [pks-git-v2.37.1] | Patrick Steinhardt
Update our bundled Git version to v2.37.1. This updates both our major version, to include the latest changes from v2.37, and our minor version, to include fixes for CVE-2022-29187, which is another variant of opening repositories owned by a different user leading to privilege escalation. To the best of my knowledge, Gitaly is not impacted by this specific vulnerability. It does not perform repository discovery by walking up the filesystem hierarchy and thus wouldn't pick up repositories in any of the parent directories of the storage root. And if an adversary is in a position to change the owner of repositories contained in Gitaly's storage root, they would already have other ways to attack the host. Also note that we're upgrading the bundled Git version v2.36.1 in-place. This can be done because its feature flag is not yet default-enabled and hasn't been rolled out anywhere due to a set of incompatibilities. Changelog: changed
2022-07-07 | praefect-schema: Update the Praefect schema | Patrick Steinhardt
We have changed the Postgres client version due to our update to a more recent GitLab Build Image, which caused some minor changes in the Praefect schema. Update the schema to match.
2022-05-06 | tools: Move `noticegen` into top-level `tools/` directory | Patrick Steinhardt
Move the `noticegen` tool into the top-level `tools/` directory so that all of our custom build tools are in one place. This also makes its sources discoverable for our formatter.
2022-05-06 | tools: Move `module-updater` into top-level `tools/` directory | Patrick Steinhardt
Move the `module-updater` tool into the top-level `tools/` directory so that all of our custom build tools are in one place. This also makes its sources discoverable for our formatter.
2022-05-06 | tools: Move Protoc plugins into top-level `tools/` directory | Patrick Steinhardt
The Protoc plugins we use are hidden away deep in the `proto/` directory, which makes it very hard to discover them when one doesn't already know about their existence. Let's move them into a new top-level `tools/` directory.
2022-05-06 | protoc-gen-gitaly-lint: Absorb internal `linter` package | Patrick Steinhardt
There is no real reason why the `protoc-gen-gitaly-lint` package requires another internal package to provide the actual logic. Furthermore, we want to move this plugin into a top-level `tools` directory to make it easier to discover. Absorb the `linter` package to make it easier to move the code around.
2022-05-02 | Makefile: Install bundled Git v2.36.0.gl1 | Patrick Steinhardt
Install bundled Git v2.36.0.gl1 alongside v2.35.1.gl1. Note that we carry forward a set of patches from the old version which haven't made it into the final release yet. Changelog: added
2022-04-26 | Merge branch 'pks-drop-bundled-git-v2.33.1' into 'master' | Toon Claes
Makefile: Drop bundled Git v2.33.1.gl3
See merge request gitlab-org/gitaly!4495
2022-04-25 | Merge branch 'smh-verification-trigger-only-generation' into 'master' | Sami Hiltunen
Ignore verification columns for read-only cache updates
Closes #4159
See merge request gitlab-org/gitaly!4468
2022-04-25 | Makefile: Drop bundled Git v2.33.1.gl3 [pks-drop-bundled-git-v2.33.1] | Patrick Steinhardt
We have finished the migration to bundled Git v2.35.1.gl1 in v14.10. Due to concerns with zero-downtime upgrades we couldn't yet remove the old version though. But now that we have waited for a release we can finally remove the old version. Remove the infrastructure to build and install bundled Git v2.33.1.gl3. Changelog: removed
2022-04-20 | Release expired verification leases periodically | Sami Hiltunen
The background verifier sets a lease time on a replica when it picks it up for verification. If the worker dies for some reason, the lease will remain in place and no other worker will pick up the replica for verification again until the lease is cleared. The lease itself states the maximum time the worker would be working on the replica. After that time has passed, it is safe for another worker to pick up the replica for verification again. This commit adds a background goroutine that periodically releases expired leases so other workers can take up the work if the original worker failed and did not release the lease. The 'verificaton_leases' index is added so the query can efficiently find the replicas with leases acquired to find the stale ones.
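The release step can be modelled in memory as follows. This is a simplified sketch, not the SQL or Go code from the commit: replicas are plain dictionaries, the column name `verification_leased_until` is taken from the schema commit further down this log, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def release_expired_leases(replicas, now):
    """Clear every lease whose deadline lies before `now`.

    Returns the number of leases released. Replicas without a lease, or
    with a lease that is still running, are left untouched.
    """
    released = 0
    for replica in replicas:
        leased_until = replica.get("verification_leased_until")
        if leased_until is not None and leased_until < now:
            replica["verification_leased_until"] = None
            released += 1
    return released
```

A periodic caller would invoke this with the current time on each tick, so a replica whose worker died becomes eligible again once the lease deadline has passed.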
2022-04-13 | Ignore verification columns for read-only cache updates [smh-verification-trigger-only-generation] | Sami Hiltunen
Read-only cache receives invalidations on record updates via triggers in Postgres. Currently the notifications are sent for any modification to the records. The verification related columns are not relevant to the operation of the cache so this commit ignores the changes to the columns in the triggers. Changelog: changed
2022-04-13 | Add migrations for background verification schema | Sami Hiltunen
This commit adds the necessary schema changes for the metadata background verification. Each replica receives two new columns:
1. 'verified_at', which contains the timestamp of the last successful verification of the replica. This effectively allows for identifying replicas that are in need of reverification.
2. 'verification_leased_until', which contains a timestamp until which a worker has acquired a lease to reverify the repository. This prevents multiple workers from picking the same repository for reverification at the same time.
The 'verification_queue' index is added to index replicas which have not been acquired by any worker. This allows for efficiently querying replicas that are in need of reverification later. Changelog: other
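How the two columns might cooperate when a worker picks its next replica can be sketched as below. The selection policy shown (never-verified replicas first, then the oldest `verified_at`) and the `interval` and `lease_duration` parameters are assumptions made for illustration; the commit only defines the columns and the index.

```python
from datetime import datetime, timedelta, timezone


def acquire_next_replica(replicas, now, interval, lease_duration):
    """Pick an unleased replica that is due for reverification and lease it.

    Hypothetical in-memory model; returns the chosen replica or None.
    """
    due = [
        r for r in replicas
        if r["verification_leased_until"] is None
        and (r["verified_at"] is None or r["verified_at"] <= now - interval)
    ]
    if not due:
        return None
    # Never-verified replicas sort first, then the least recently verified.
    due.sort(key=lambda r: (r["verified_at"] is not None, r["verified_at"] or now))
    chosen = due[0]
    # Taking the lease keeps other workers from picking the same replica.
    chosen["verification_leased_until"] = now + lease_duration
    return chosen
```

Because a leased replica no longer satisfies the first condition, a second worker calling the function at the same moment would receive a different replica or `None`.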
2022-03-31 | Makefile: Group Git patches by version | Patrick Steinhardt
The current set of Git patches got quite big, and consequently it's hard to see which patches belong to what version. Reorder them into a per-version subdirectory so that the grouping is clear. Furthermore, this allows us to find all patches via wildcards instead of having to manually list them in our Makefile.
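The benefit of the per-version layout is that a single wildcard finds every patch for a version. The Makefile itself is not shown here, so the following is only a Python analogue; the directory layout `<patch_root>/<version>/*.patch` is an assumption.

```python
from pathlib import Path


def patches_for_version(patch_root, version):
    """Return all patch files for one Git version, sorted by file name.

    Assumes patches live in <patch_root>/<version>/ and end in .patch;
    sorting keeps numbered patches (0001-..., 0002-...) in apply order.
    """
    return sorted(Path(patch_root, version).glob("*.patch"))
```

Adding a new patch then only requires dropping a file into the right subdirectory, with no list to maintain by hand.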
2022-03-29 | Lint test-boot when running rubocop | James Fargher