# Gitaly Benchmarking Tool

## What is this?

An Ansible script for running RPC-level benchmarks against Gitaly.

**Note**: You must be a member of the `gitaly-benchmark-0150d6cf` GCP group.

## Steps for use

### 1. Set up your environment

1. Ensure that [`gcloud`](https://cloud.google.com/sdk/docs/install) is installed and available on your path.
1. Ensure that `python` and `terraform` are installed, or use [`asdf`](https://asdf-vm.com/guide/getting-started.html) to install them (recommended).
1. Create a new Python virtualenv: `python3 -m venv env`
1. Activate the virtualenv: `source env/bin/activate`
1. Install Ansible: `python3 -m pip install -r requirements.txt`
1. **Optional**: Copy `config.yml.example` to `config.yml` to customize the machine type used for benchmarking.
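
Before moving on, it can help to confirm that the tools named above are actually on your path. The following is a minimal sketch, not part of this repository; `check_prereqs` is a hypothetical helper name introduced only for illustration.

```shell
# Hypothetical helper (not part of this repository): report which of the
# given tools are missing from PATH. Prints one line per missing tool and
# returns non-zero if at least one is missing.
check_prereqs() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Check the tools named in the setup steps.
check_prereqs gcloud python3 terraform && echo "all prerequisites found"
```

If `asdf` manages `python` and `terraform`, run the check from this directory so the shims resolve to the expected versions.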

### 2. Create instance

```shell
./create-benchmark-instance
```

This creates a Gitaly node and a small client node that sends requests to
Gitaly over gRPC. The script prompts for the Gitaly revision to build, the
instance name, and the public SSH key to use for connections.

Use the `gitaly_bench` user to SSH into the instance if desired:

```shell
ssh gitaly_bench@<INSTANCE_ADDRESS>
```

### 3. Configure instance

```shell
./configure-benchmark-instance
```

This builds and installs Gitaly from source at the desired reference and
installs profiling tools like `perf` and `libbpf-tools`. A disk image containing
the test repositories is mounted at `/mnt/git-repositories` on the Gitaly node.

### 4. Run benchmarks

```shell
./run-benchmarks
```

This runs the benchmarks specified in `group_vars/all.yml`. By default, Gitaly is
profiled with `perf` and `libbpf-tools` for flamegraphs and other metrics, which
may add ~10% overhead. Set the `profile` variable to `false` to disable profiling:

```shell
./run-benchmarks --extra-vars "profile=false"
```

On completion, a tarball of the benchmark output is written to
`results/benchmark-<GITALY_REVISION>-<BENCH_TIMESTAMP>.tar.gz`. It holds one
directory for each repository tested against each RPC, and each directory contains:

- `ghz.json` - Output in JSON format from [ghz](https://ghz.sh) for the run.
- `gitaly.log` - The main Gitaly log file. Gitaly-Ruby logs are not included.

To retrieve the 99th percentile duration in milliseconds from `ghz.json` use:

```shell
jq '.latencyDistribution[] | select(.percentage==99) | .latency / 1000000' ghz.json
```
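
To compare runs at a glance, the same `jq` filter can be applied to every `ghz.json` in an extracted tarball. This is a rough sketch: `p99_summary` is a hypothetical helper, the `extracted` directory name is a placeholder, and no particular directory layout inside the tarball is assumed (`find` locates the files).

```shell
# Hypothetical helper: print "<p99 in ms><TAB><path>" for every ghz.json
# found below the directory given as the first argument.
p99_summary() {
  find "$1" -type f -name ghz.json | sort | while IFS= read -r report; do
    p99_ms=$(jq '.latencyDistribution[] | select(.percentage==99) | .latency / 1000000' "$report")
    printf '%s\t%s\n' "$p99_ms" "$report"
  done
}

# Usage (paths are placeholders):
#   mkdir -p extracted
#   tar -xzf results/benchmark-<GITALY_REVISION>-<BENCH_TIMESTAMP>.tar.gz -C extracted
#   p99_summary extracted
```

Each output line pairs a latency with the file it came from, so the result can be piped through `sort -n` to rank the runs.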

When profiling is enabled, the following are also present:

- `all-perf.svg` - Flamegraph built from a system-wide `perf` capture. The
  capture uses `--call-graph=dwarf`, which gives accurate stack traces for Git,
  but the stack traces for Gitaly itself will be invalid.
- `biolatency.txt` - Histogram of block I/O latency, separated by disk.
  `/mnt/git-repositories` will be disk `/dev/sdb`.
- `biotop.txt` - List of the processes performing the most block I/O.
- `cpu-dist-off.txt` - Histogram of how long programs spent off-CPU, that is,
  unscheduled by the kernel.
- `cpu-dist-on.txt` - Histogram of how long programs spent running on-CPU.
- `gitaly-execs.txt` - List of all processes forked by Gitaly and their command
  line arguments.
- `gitaly-perf.svg` - Flamegraph built from running `perf` against Gitaly only.
  This uses `--call-graph=fp` for accurate stack traces of Go code.
- `page-cachestat.txt` - Kernel page cache hit rate.
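
As a quick sanity check after a profiled run, a result directory can be compared against the file names listed above. This is only an illustrative sketch; `missing_profile_artifacts` is a hypothetical helper, and the directory passed to it is whichever per-repository, per-RPC directory you extracted.

```shell
# Hypothetical helper: print the name of each profiling artifact listed above
# that is absent from the result directory given as the first argument.
missing_profile_artifacts() {
  local dir=$1 name
  for name in all-perf.svg biolatency.txt biotop.txt cpu-dist-off.txt \
              cpu-dist-on.txt gitaly-execs.txt gitaly-perf.svg page-cachestat.txt; do
    [ -f "$dir/$name" ] || echo "$name"
  done
}

# Usage: missing_profile_artifacts <RESULT_DIRECTORY>
```

No output means every profiling artifact is present. A run with `profile=false` is expected to report all of them as missing.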

### 5. Destroy instance

```shell
./destroy-benchmark-instance
```

All nodes will be destroyed. Because GCP frequently reuses public IP addresses,
the addresses of the destroyed instances are automatically removed from your
`~/.ssh/known_hosts` file to prevent host key mismatches on future runs.
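
If a stale entry is ever left behind, for example because an instance was deleted by other means, it can be removed by hand with `ssh-keygen -R`. The sketch below is an assumption about how you might do this manually, not a description of what the destroy script does internally; `forget_host` is a hypothetical helper name.

```shell
# Hypothetical helper: remove every known_hosts entry for one host.
# The second argument is optional and defaults to ~/.ssh/known_hosts.
forget_host() {
  ssh-keygen -R "$1" -f "${2:-$HOME/.ssh/known_hosts}"
}

# Usage: forget_host <INSTANCE_ADDRESS>
```

`ssh-keygen -R` keeps a backup of the previous file with an `.old` suffix next to the original.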