---
stage: Data Stores
group: Cloud Connector
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
---

# Reducing memory use

The GitLab Rails application code suffers from memory leaks. For web requests
this problem is made manageable using a [supervision thread](../operations/puma.md#reducing-memory-use)
that automatically restarts workers if they exceed a given resident set size (RSS) threshold
for a certain amount of time.
The same approach is used for the Sidekiq processes that GitLab runs
to process background jobs.

By default, GitLab monitors Sidekiq RSS against the configured limits only for Linux package or Docker
installations. This is because GitLab relies on runit to restart Sidekiq after a memory-induced shutdown,
and self-compiled and Helm chart installations don't use runit or an equivalent tool.

With the default settings, Sidekiq restarts no
more often than once every 15 minutes, with the restart causing about one
minute of delay for incoming background jobs.

Some background jobs rely on long-running external processes. To ensure these
are cleanly terminated when Sidekiq is restarted, each Sidekiq process should be
run as a process group leader (for example, using `chpst -P`). If using a Linux package installation or the
`bin/background_jobs` script with `runit` installed, this is handled for you.

## Configuring the limits

Sidekiq memory limits are controlled using environment variables.

- `SIDEKIQ_MEMORY_KILLER_MAX_RSS` (KB): defines the Sidekiq process soft limit for allowed RSS.
  If the Sidekiq process RSS (expressed in kilobytes) exceeds `SIDEKIQ_MEMORY_KILLER_MAX_RSS`
  for longer than `SIDEKIQ_MEMORY_KILLER_GRACE_TIME`, a graceful restart is triggered.
  If `SIDEKIQ_MEMORY_KILLER_MAX_RSS` is not set, or its value is set to 0, the soft limit is not monitored.
  `SIDEKIQ_MEMORY_KILLER_MAX_RSS` defaults to `2000000` (approximately 2 GB).

- `SIDEKIQ_MEMORY_KILLER_GRACE_TIME`: defines the grace period in seconds during which the Sidekiq process
  is allowed to run above the RSS soft limit. If the Sidekiq process RSS drops back below the soft limit
  within `SIDEKIQ_MEMORY_KILLER_GRACE_TIME`, the restart is aborted. Defaults to 900 seconds (15 minutes).

- `SIDEKIQ_MEMORY_KILLER_HARD_LIMIT_RSS` (KB): defines the Sidekiq process hard limit for allowed RSS.
  If the Sidekiq process RSS (expressed in kilobytes) exceeds `SIDEKIQ_MEMORY_KILLER_HARD_LIMIT_RSS`,
  an immediate graceful restart of Sidekiq is triggered. If this value is not set, or set to 0,
  the hard limit is not monitored.

- `SIDEKIQ_MEMORY_KILLER_CHECK_INTERVAL`: defines how often to check the process RSS. Defaults to 3 seconds.

- `SIDEKIQ_MEMORY_KILLER_SHUTDOWN_WAIT`: defines the maximum time allowed for all Sidekiq jobs to finish.
  No new jobs are accepted during that time. Defaults to 30 seconds.

  If the process restart is not performed by Sidekiq, the Sidekiq process is forcefully terminated after
  [Sidekiq shutdown timeout](https://github.com/mperham/sidekiq/wiki/Signals#term) (defaults to 25 seconds) +2 seconds.
  If jobs do not finish during that time, all currently running jobs are interrupted with a `SIGTERM` signal
  sent to the Sidekiq process.

- `GITLAB_MEMORY_WATCHDOG_ENABLED`: enabled by default. Set `GITLAB_MEMORY_WATCHDOG_ENABLED` to `false` to prevent the watchdog from running.
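
For example, on a Linux package installation these variables can be set in `/etc/gitlab/gitlab.rb`.
The following is a minimal sketch, assuming environment variables are passed to the Sidekiq process
through `gitlab_rails['env']`; the values shown are the defaults, except for the hard limit, which is
illustrative only:

```ruby
# /etc/gitlab/gitlab.rb -- sketch only; adjust the values to your workload.
gitlab_rails['env'] = {
  # Soft limit in KB (about 2 GB): restart gracefully if RSS stays above
  # this value for longer than the grace time.
  "SIDEKIQ_MEMORY_KILLER_MAX_RSS" => "2000000",
  # Grace time in seconds before the soft limit triggers a restart.
  "SIDEKIQ_MEMORY_KILLER_GRACE_TIME" => "900",
  # Hard limit in KB (about 3 GB): restart immediately when exceeded.
  "SIDEKIQ_MEMORY_KILLER_HARD_LIMIT_RSS" => "3000000",
  # How often to sample the process RSS, in seconds.
  "SIDEKIQ_MEMORY_KILLER_CHECK_INTERVAL" => "3",
  # Maximum time in seconds to wait for running jobs to finish on shutdown.
  "SIDEKIQ_MEMORY_KILLER_SHUTDOWN_WAIT" => "30"
}
```

After editing the file, apply the change with `sudo gitlab-ctl reconfigure` and restart Sidekiq
(for example, `sudo gitlab-ctl restart sidekiq`) so the process picks up the new environment.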

### Monitor worker restarts

GitLab emits log events if workers are restarted due to high memory usage.

The following is an example of one of these log events in `/var/log/gitlab/gitlab-rails/sidekiq_client.log`:

```json
{
  "severity": "WARN",
  "time": "2023-02-04T09:45:16.173Z",
  "correlation_id": null,
  "pid": 2725,
  "worker_id": "sidekiq_1",
  "memwd_handler_class": "Gitlab::Memory::Watchdog::SidekiqHandler",
  "memwd_sleep_time_s": 3,
  "memwd_rss_bytes": 1079683247,
  "memwd_max_rss_bytes": 629145600,
  "memwd_max_strikes": 5,
  "memwd_cur_strikes": 6,
  "message": "rss memory limit exceeded",
  "running_jobs": [
    {
      "jid": "83efb701c59547ee42ff7068",
      "worker_class": "Ci::DeleteObjectsWorker"
    },
    {
      "jid": "c3a74503dc2637f8f9445dd3",
      "worker_class": "Ci::ArchiveTraceWorker"
    }
  ]
}
```

Where:

- `memwd_rss_bytes` is the actual amount of memory consumed.
- `memwd_max_rss_bytes` is the RSS limit set through `per_worker_max_memory_mb`.
- `running_jobs` lists the jobs that were running at the time the process
  exceeded the RSS limit and started a graceful restart.
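
In the log entry above, `memwd_max_rss_bytes` is `629145600` bytes, which corresponds to a
`per_worker_max_memory_mb` of 600. A short Ruby sketch of that conversion (the helper names are
illustrative, not part of GitLab's code):

```ruby
# Illustrative helpers: convert the per-worker limit (in MB) to the byte
# value reported as memwd_max_rss_bytes, and back.
def mb_to_bytes(megabytes)
  megabytes * 1024 * 1024
end

def bytes_to_mb(bytes)
  bytes / (1024 * 1024)
end

puts mb_to_bytes(600)         # => 629145600
puts bytes_to_mb(629_145_600) # => 600
```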