Welcome to mirror list, hosted at ThFree Co, Russian Federation.

trace2_integration.md « doc - gitlab.com/gitlab-org/gitaly.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
blob: 28757d671deb6a9b1e1caf6f1af058bd1deffdbe (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
# Trace2 integration

Git includes a tool called [Trace2](https://git-scm.com/docs/api-trace2) that:

- Allows Git processes to display important underlying events and related
  measurements.
- Records child process activities.

However, enabling Trace2 may cause increased overhead for Git processes and could
potentially pose risks when transmitting events between Gitaly and Git processes.
Therefore, you should not enable Trace2 for all Git processes.

Gitaly offers the capability of Trace2 activation on a per subset basis for Git
commands through the use of a "hook" mechanism. This mechanism offers the
advantage of providing more precise monitoring, lowering the risk of potential
hazards previously associated with global Trace2 activation, and reducing
unnecessary overhead.

For a list of available Trace2 hooks, see the
[`trace2hooks`](https://gitlab.com/gitlab-org/gitaly/-/tree/master/internal/git/trace2hooks)
directory in the Gitaly repository. Use this list to carefully select which Git
commands you want to monitor more granularly to increase the command's
visibility without incurring an excessive resource burden.

## How Trace2 is integrated to Gitaly

```mermaid
flowchart TD
  git.CommandFactory --WithTrace2Hooks option--> trace2.Manager
  git.CommandFactory -- Spawns --> command.New
  trace2.Manager -- Injects ENVs --> command.New
  trace2.Manager -- Manages --> trace2hooks.TracingExporter
  trace2.Manager -- Manages --> trace2hooks.PackObjectsMetrics
  command.New -- Spawns --> GitProcess[Git process]
  GitProcess -- Writes to --> Tempfile["/tmp/gitaly-trace-2352"]
  Tempfile -- Collected and parsed --> trace2.Trace
  trace2hooks.TracingExporter -- Handles --> trace2.Trace
  trace2hooks.PackObjectsMetrics -- Handles --> trace2.Trace
```

The overview flow of trace2 integration is demonstrated in the above chart.

Trace2 is enabled by passing some environment variables to spawning Git
processes. For a list of environment variables, see `GIT_TRACE2*` environment
variables on the [documentation page](https://git-scm.com/docs/api-trace2).
We don't set all of them.

Three environment variables enable the Trace2 instrumentation process:

- `GIT_TRACE2_EVENT`. Enables Git export instrumentation data in JSON format.
  Supports dumping the data to `stderr`, to a file, or to a UNIX socket. For
  more information,
  see [Enabling a Target](https://git-scm.com/docs/api-trace2#_enabling_a_target).
- `GIT_TRACE2_PARENT_SID`. Replaces the default auto-generated prefix of
  generated events. We set it to correlation ID from the application context.
- `GIT_TRACE2_BRIEF`. Reduces the amount of output information, such
  as `file`, `line`, and even absolute `time`.

The events are structured as a flat list of events in chronological order. Each
event describes a certain sub operation, including some relative data and
metadata. Some events, such as `"region_enter"` or `"cmd_start,"`, indicate a new
section in which the consecutive events belong. Correspondingly, exiting events,
such as `"region_leave"` or `"atexit,"`, exit the current section.

The following snippet is the resulting events of a `git-pack-objects` command.

```json
{"event":"version","thread":"main","time":"2023-02-22T12:05:04.840009Z", "evt":"3","exe":"2.39.1"}
{"event":"start","thread":"main","t_abs":0.002338,"argv":["git","pack-objects","toon","--compression=0"]}
{"event":"def_repo","thread":"main","repo":1,"worktree":"/gitaly"}
{"event":"cmd_name","thread":"main","name":"pack-objects","hierarchy":"pack-objects"}
{"event":"region_enter","thread":"main","repo":1,"nesting":1,"category":"pack-objects","label":"enumerate-objects"}
{"event":"region_enter","thread":"main","repo":1,"nesting":2,"category":"progress","label":"Enumerating objects"}
{"event":"region_leave","thread":"main","repo":1,"t_rel":0.001083,"nesting":2,"category":"progress","label":"Enumerating objects"}
{"event":"region_leave","thread":"main","line":4490,"repo":1,"t_rel":0.001435,"nesting":1,"category":"pack-objects","label":"enumerate-objects"}
{"event":"region_enter","thread":"main","repo":1,"nesting":1,"category":"pack-objects","label":"prepare-pack"}
{"event":"region_enter","thread":"main","repo":1,"nesting":2,"category":"progress","label":"Counting objects"}
{"event":"region_leave","thread":"main","repo":1,"t_rel":0.000075,"nesting":2,"category":"progress","label":"Counting objects"}
{"event":"region_leave","thread":"main","repo":1,"t_rel":0.000090,"nesting":1,"category":"pack-objects","label":"prepare-pack"}
{"event":"region_enter","thread":"main","repo":1,"nesting":1,"category":"pack-objects","label":"write-pack-file"}
{"event":"region_enter","thread":"main",,"repo":1,"nesting":2,"category":"progress","label":"Writing objects"}
{"event":"region_leave","thread":"main","repo":1,"t_rel":0.003972,"nesting":2,"category":"progress","label":"Writing objects"}
{"event":"data","thread":"main","repo":1,"t_abs":0.008451,"t_rel":0.003996,"nesting":2,"category":"pack-objects","key":"write_pack_file/wrote","value":"1"}
{"event":"region_leave","thread":"main","repo":1,"t_rel":0.004003,"nesting":1,"category":"pack-objects","label":"write-pack-file"}
{"event":"data","thread":"main","repo":1,"t_abs":0.008482,"t_rel":0.008482,"nesting":1,"category":"fsync","key":"fsync/writeout-only","value":"2"}
{"event":"exit","thread":"main","t_abs":0.008489,"code":0}
{"event":"atexit","thread":"main","t_abs":0.008495,"code":0}
```

These events are collected and then parsed into a tree representation in Gitaly.
Each node of the tree represents an event. When the parser
encounters `region_enter` or `child_start` events, it creates a sub-tree containing
the nested nodes.

This tree is stored as a `trace2.Trace` struct. It is passed
into the `Handle(context.Context, *trace2.Trace)` function of `trace2.Hook`.
During development, we can determine its actual structure by either putting a
debugger or printing the structure to console.

Each node contains the following information:

- `Name`: The name of the node, which is a combination of the category and event
  name. For example,`pack-objects:prepare-pack`. Events with a data type have a
  special prefix, such as`data:pack-objects:write_pack_file/wrote`.
- `StartTime` and `FinishTime`: Timestamps indicating when a node starts and
  finishes.
- `Metadata`: Contains interesting metadata extracted from the corresponding
  event. If the node is a data node, the value of the data is accessible
  through `node.Metadata["Metadata"]`.

The above list of events is converted to the following tree:

```mermaid
flowchart LR
root-->version
root-->start
root-->def_repo
root-->pack-objects:enumerate-objects
pack-objects:enumerate-objects-->progress:Enumeratingobjects["progress:Enumerating objects"]
root-->pack-objects:prepare-pack
pack-objects:prepare-pack-->progress:Countingobjects["progress:Counting objects"]
root-->pack-objects:write-pack-file
pack-objects:write-pack-file-->progress:WritingObjects[progress:Writing objects]
pack-objects:write-pack-file-->data:pack-objects:write_pack_file[data:pack-objects:write_pack_file/wrote]
root-->data:fsync:fsync/writeout-only
```

<details>
<summary>This is an example Go structure of a `git-status` command</summary>

```plaintext
(*trace2.Trace)(0x1400031e2d0)({
 Name: (string) (len=4) "root",
 StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
 FinishTime: (time.Time) 2023-02-21 08:10:10.687877 +0000 UTC,
 Metadata: (map[string]string) (len=1) {
  (string) (len=4) "code": (string) (len=1) "0"
 },
 Children: ([]*trace2.Trace) (len=13 cap=16) {
  (*trace2.Trace)(0x1400031e360)({
   Name: (string) (len=7) "version",
   StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   Metadata: (map[string]string) <nil>,
  }),
  (*trace2.Trace)(0x1400031e3f0)({
   Name: (string) (len=5) "start",
   StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   Metadata: (map[string]string) (len=1) {
    (string) (len=4) "argv": (string) (len=10) "git status"
   },
  }),
  (*trace2.Trace)(0x1400031e480)({
   Name: (string) (len=8) "def_repo",
   StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   Metadata: (map[string]string) <nil>,
  }),
  (*trace2.Trace)(0x1400031e510)({
   Name: (string) (len=19) "index:do_read_index",
   StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   Metadata: (map[string]string) (len=1) {
    (string) (len=3) "msg": (string) (len=10) ".git/index"
   },
   Children: ([]*trace2.Trace) (len=3 cap=4) {
    (*trace2.Trace)(0x1400031e5a0)({
     Name: (string) (len=15) "cache_tree:read",
     StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
    }),
    (*trace2.Trace)(0x1400031e630)({
     Name: (string) (len=23) "data:index:read/version",
     StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     Metadata: (map[string]string) (len=1) {
      (string) (len=4) "data": (string) (len=1) "2"
     },
    }),
    (*trace2.Trace)(0x1400031e750)({
     Name: (string) (len=24) "data:index:read/cache_nr",
     StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     Metadata: (map[string]string) (len=1) {
      (string) (len=4) "data": (string) (len=4) "1585"
     },
    })
   },
   Depth: (int) 1
  }),
  (*trace2.Trace)(0x1400031e870)({
   Name: (string) (len=22) "progress:Refresh index",
   StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   Children: ([]*trace2.Trace) (len=3 cap=4) {
    (*trace2.Trace)(0x1400031e900)({
     Name: (string) (len=13) "index:preload",
     StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
    }),
    (*trace2.Trace)(0x1400031e990)({
     Name: (string) (len=13) "index:refresh",
     StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
    }),
    (*trace2.Trace)(0x1400031ea20)({
    ,
     Name: (string) (len=27) "data:progress:total_objects",
     StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     Metadata: (map[string]string) (len=1) {
      (string) (len=4) "data": (string) (len=4) "1585"
     },
    })
   },
  }),
  (*trace2.Trace)(0x1400031eb40)({
   Name: (string) (len=16) "status:worktrees",
   StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   Metadata: (map[string]string) <nil>,
   Children: ([]*trace2.Trace) (len=2 cap=2) {
    (*trace2.Trace)(0x1400031ebd0)({
     Name: (string) (len=10) "diff:setup",
     StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
    }),
    (*trace2.Trace)(0x1400031ec60)({
     Name: (string) (len=24) "diff:write back to queue",
     StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
    })
   },
  }),
  (*trace2.Trace)(0x1400031ecf0)({
   Name: (string) (len=12) "status:index",
   StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   Metadata: (map[string]string) <nil>,
   Children: ([]*trace2.Trace) (len=3 cap=4) {
    (*trace2.Trace)(0x1400031ed80)({
     Name: (string) (len=25) "unpack_trees:unpack_trees",
     StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
    }),
    (*trace2.Trace)(0x1400031ee10)({
     Name: (string) (len=10) "diff:setup",
     StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     Metadata: (map[string]string) <nil>,
    }),
    (*trace2.Trace)(0x1400031eea0)({
     Name: (string) (len=24) "diff:write back to queue",
     StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
    })
   },
  }),
  (*trace2.Trace)(0x1400031ef30)({
   Name: (string) (len=16) "status:untracked",
   StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   Children: ([]*trace2.Trace) (len=1 cap=1) {
    (*trace2.Trace)(0x1400031efc0)({
     Name: (string) (len=18) "dir:read_directory",
     StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
     FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
    })
   },
  }),
  (*trace2.Trace)(0x1400031f050)({
   Name: (string) (len=25) "data:status:count/changed",
   StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   Metadata: (map[string]string) (len=1) {
    (string) (len=4) "data": (string) (len=1) "0"
   },
  }),
  (*trace2.Trace)(0x1400031f170)({
   Name: (string) (len=27) "data:status:count/untracked",
   StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   Metadata: (map[string]string) (len=1) {
    (string) (len=4) "data": (string) (len=1) "1"
   },
  }),
  (*trace2.Trace)(0x1400031f290)({
   Name: (string) (len=25) "data:status:count/ignored",
   StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   Metadata: (map[string]string) (len=1) {
    (string) (len=4) "data": (string) (len=1) "0"
   },
  }),
  (*trace2.Trace)(0x1400031f3b0)({
   Name: (string) (len=12) "status:print",
   StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   Metadata: (map[string]string) <nil>,
  }),
  (*trace2.Trace)(0x1400031f440)({
   Name: (string) (len=35) "data_json:traverse_trees:statistics",
   StartTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   FinishTime: (time.Time) 2023-02-21 08:10:10.668546 +0000 UTC,
   Metadata: (map[string]string) (len=1) {
    (string) (len=4) "data": (string) (len=55) "{\"traverse_trees_count\":1,\"traverse_trees_max_depth\":1}"
   },
  })
 },
}
```

</details>

More information about this can be found in the next section.

## How to use the data

As mentioned above, we have implemented a form of "hooks" into the tree. Each
hook implements
the [`trace2.Hook`](https://gitlab.com/gitlab-org/gitaly/-/blob/d15b9c84faee3eb178e7c7d9360832f26d4107a2/internal/git/trace2/hook.go#L6-6)
interface, which enables the hook to walk the tree, filter the corresponding
node, and extract interesting information. Here are the draft implementations
that are referenced:

- [Tracing: Have a deeper look into Git process with trace2](https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5441/diffs)
- [Expose internal pack objects metrics with trace2](https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5442/diffs)

Let's now walk through a simple scenario. Assuming that we need to identify why
a simple `git-status` command is slow, the following hook captures two things:

- Time spent reading the index.
- Index information extraction, such as `version` and `cache_nr`.

```go
type gitStatusHook struct{}

func (h *gitStatusHook) Name() string {
  return "git_status_hook"
}

func (h *gitStatusHook) Handle(rootCtx context.Context, trace *trace2.Trace) error {
  trace.Walk(rootCtx, func(ctx context.Context, trace *trace2.Trace) context.Context {
    switch trace.Name {
    case "index:do_read_index":
      fmt.Printf("Duration of reading index: %v", trace.FinishTime.Sub(trace.StartTime))
    case "data:index:read/cache_nr":
      fmt.Printf("Changed: %v", trace.Metadata["data"])
    }
    return ctx
  })
  return nil
}
```

We must also add the hook to the
[default list of Trace2 hooks](https://gitlab.com/gitlab-org/gitaly/-/blob/d15b9c84faee3eb178e7c7d9360832f26d4107a2/internal/git/command_factory.go#L90-99)
to make it effective.

## Custom Trace2 instrumentation

Although Trace comes bundled with numerous events and data, they may not be
sufficient for certain purposes. Fortunately, Trace2 provides a safe framework
that allows for the injection of custom events.
The [Git Trace2 header file](https://github.com/git/git/blob/9857273be005833c71e2d16ba48e193113e12276/trace2.h#L43)
thoroughly describes this framework. You can search for Trace2 usage in
the [Git source code](https://github.com/search?q=repo%3Agit%2Fgit+trace2_data&type=code).

To inject custom events, you must:

1. Add `trace2_*()` function calls to the
   [GitLab fork of Git](https://gitlab.com/gitlab-org/git.git).
1. Follow the release process.

The list of Trace2 functions can be found in the
[`git/trace2.h`](https://github.com/git/git/blob/9857273be005833c71e2d16ba48e193113e12276/trace2.h#L43)
header file. The comments above each function declaration contain additional
information about how to use them.

For example, the following code snippet injects a custom region event and emits
two string data to Git worktree setup.

```c
void setup_work_tree(void)
{
  trace2_region_enter("gitlab-work-tree", "setup", the_repository);

  // Do something
  trace2_data_string("gitlab-work-tree", the_repository, "realpath-path", tmp_original_cwd);
  trace2_data_string("gitlab-work-tree", the_repository, "realpath-failure", strerror(errno));

  // Do something else
  trace2_region_leave("gitlab-work-tree", "setup", the_repository);
}
```

After it's done, the events are automatically collected by following the
convention illustrated in the above section. It's trivial to extract such
information in Gitaly:

```go
switch trace.Name {
    case "gitlab-work-tree:setup":
        fmt.Print(trace.FinishTime.Sub(trace.StartTime).Seconds())
    case "data:gitlab-work-tree:realpath-path":
        fmt.Print(trace.Metadata["data"])
    case "data:gitlab-work-tree:realpath-failure":
        m t.Print(trace.Metadata["data"])
    }
}
```

Because Trace2 is lightweight and well-isolated, it is relatively safe to add
custom events. More importantly, adding custom events does not affect the
core functionality of Git.

Nonetheless, when adding custom events, avoid adding events in a loop or outputting
sensitive data such as file content.