Diffstat (limited to 'doc/administration/sidekiq')
-rw-r--r-- doc/administration/sidekiq/extra_sidekiq_processes.md | 362
-rw-r--r-- doc/administration/sidekiq/extra_sidekiq_routing.md   | 180
-rw-r--r-- doc/administration/sidekiq/img/sidekiq_flamegraph.png | bin 0 -> 54473 bytes
-rw-r--r-- doc/administration/sidekiq/index.md                   | 403
-rw-r--r-- doc/administration/sidekiq/sidekiq_health_check.md    |  58
-rw-r--r-- doc/administration/sidekiq/sidekiq_job_migration.md   |  40
-rw-r--r-- doc/administration/sidekiq/sidekiq_memory_killer.md   |  82
-rw-r--r-- doc/administration/sidekiq/sidekiq_troubleshooting.md | 381
8 files changed, 1506 insertions, 0 deletions
diff --git a/doc/administration/sidekiq/extra_sidekiq_processes.md b/doc/administration/sidekiq/extra_sidekiq_processes.md
new file mode 100644
index 00000000000..1cd3771c94d
--- /dev/null
+++ b/doc/administration/sidekiq/extra_sidekiq_processes.md
@@ -0,0 +1,362 @@
+---
+stage: Systems
+group: Distribution
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Run multiple Sidekiq processes **(FREE SELF)**
+
+GitLab allows you to start multiple Sidekiq processes. These processes can
+be used to consume a dedicated set of queues, ensuring that certain queues
+always have dedicated workers, no matter the number of jobs to be processed.
+
+NOTE:
+The information in this page applies only to Omnibus GitLab.
+
+## Available Sidekiq queues
+
+For a list of the existing Sidekiq queues, check the following files:
+
+- [Queues for both GitLab Community and Enterprise Editions](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/workers/all_queues.yml)
+- [Queues for GitLab Enterprise Editions only](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/all_queues.yml)
+
+Each entry in the above files represents a queue on which Sidekiq processes
+can be started.
+
+## Start multiple processes
+
+> - [Introduced](https://gitlab.com/gitlab-org/omnibus-gitlab/-/merge_requests/4006) in GitLab 12.10, starting multiple processes with Sidekiq cluster.
+> - [Sidekiq cluster moved](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/181) to GitLab Free in 12.10.
+> - [Sidekiq cluster became default](https://gitlab.com/gitlab-org/omnibus-gitlab/-/merge_requests/4140) in GitLab 13.0.
+
+When starting multiple processes, the number of processes should
+equal (and **not** exceed) the number of CPU cores you want to
+dedicate to Sidekiq. Each Sidekiq process can use only 1 CPU
+core, subject to the available workload and concurrency settings.
+
+To start multiple processes:
+
+1. Using the `sidekiq['queue_groups']` array setting, specify how many processes to
+   create using `sidekiq-cluster` and which queues they should handle.
+ Each item in the array equates to one additional Sidekiq
+ process, and values in each item determine the queues it works on.
+
+ For example, the following setting creates three Sidekiq processes, one to run on
+ `elastic_commit_indexer`, one to run on `mailers`, and one process running on all queues:
+
+ ```ruby
+ sidekiq['queue_groups'] = [
+ "elastic_commit_indexer",
+ "mailers",
+ "*"
+ ]
+ ```
+
+ To have an additional Sidekiq process handle multiple queues, add multiple
+ queue names to its item delimited by commas. For example:
+
+ ```ruby
+ sidekiq['queue_groups'] = [
+ "elastic_commit_indexer, elastic_association_indexer",
+ "mailers",
+ "*"
+ ]
+ ```
+
+ [In GitLab 12.9](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/26594) and
+ later, the special queue name `*` means all queues. This starts two
+ processes, each handling all queues:
+
+ ```ruby
+ sidekiq['queue_groups'] = [
+ "*",
+ "*"
+ ]
+ ```
+
+ `*` cannot be combined with concrete queue names - `*, mailers`
+ just handles the `mailers` queue.
+
+ When `sidekiq-cluster` is only running on a single node, make sure that at least
+ one process is running on all queues using `*`. This ensures a process
+ automatically picks up jobs in queues created in the future,
+ including queues that have dedicated processes.
+
+ If `sidekiq-cluster` is running on more than one node, you can also use
+ [`--negate`](#negate-settings) and list all the queues that are already being
+ processed.
+
+1. Save the file and reconfigure GitLab for the changes to take effect:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+To view the Sidekiq processes in GitLab:
+
+1. On the top bar, select **Menu > Admin**.
+1. On the left sidebar, select **Monitoring > Background Jobs**.
+
+## Negate settings
+
+You can have the Sidekiq process work on every queue **except** the ones
+you list. In this example, we exclude all import-related jobs from a Sidekiq node:
+
+1. Edit `/etc/gitlab/gitlab.rb` and add:
+
+ ```ruby
+ sidekiq['negate'] = true
+ sidekiq['queue_selector'] = true
+ sidekiq['queue_groups'] = [
+ "feature_category=importers"
+ ]
+ ```
+
+1. Save the file and reconfigure GitLab for the changes to take effect:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+## Queue selector
+
+> - [Introduced](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/45) in GitLab 12.8.
+> - [Sidekiq cluster, including queue selector, moved](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/181) to GitLab Free in 12.10.
+> - [Renamed from `experimental_queue_selector` to `queue_selector`](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/147) in GitLab 13.6.
+
+In addition to selecting queues by name, as above, the `queue_selector` option
+allows queue groups to be selected in a more general way using a
+[worker matching query](extra_sidekiq_routing.md#worker-matching-query). After `queue_selector`
+is set, all `queue_groups` must follow the aforementioned syntax.
+
+In `/etc/gitlab/gitlab.rb`:
+
+```ruby
+sidekiq['enable'] = true
+sidekiq['queue_selector'] = true
+sidekiq['queue_groups'] = [
+ # Run all non-CPU-bound queues that are high urgency
+ 'resource_boundary!=cpu&urgency=high',
+ # Run all continuous integration and pages queues that are not high urgency
+ 'feature_category=continuous_integration,pages&urgency!=high',
+ # Run all queues
+ '*'
+]
+```
+
+## Ignore all import queues
+
+When [importing from GitHub](../../user/project/import/github.md) or
+other sources, Sidekiq might use all of its resources to perform those
+operations. To set up two separate `sidekiq-cluster` processes, where
+one only processes imports and the other processes all other queues:
+
+1. Edit `/etc/gitlab/gitlab.rb` and add:
+
+ ```ruby
+ sidekiq['enable'] = true
+ sidekiq['queue_selector'] = true
+ sidekiq['queue_groups'] = [
+ "feature_category=importers",
+ "feature_category!=importers"
+ ]
+ ```
+
+1. Save the file and reconfigure GitLab for the changes to take effect:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+## Number of threads
+
+By default each process defined under `sidekiq` starts with a
+number of threads that equals the number of queues, plus one spare thread.
+For example, a process that handles the `process_commit` and `post_receive`
+queues uses three threads in total.
+
+These threads run inside a single Ruby process, and each process
+can only use a single CPU core. The usefulness of threading depends
+on the work having some external dependencies to wait on, like database queries or
+HTTP requests. Most Sidekiq deployments benefit from this threading, and when
+running fewer queues in a process, increasing the thread count might be
+even more desirable to make the most effective use of CPU resources.
+
+### Manage thread counts explicitly
+
+The correct maximum thread count (also called concurrency) depends on the workload.
+Typical values range from `1` for highly CPU-bound tasks to `15` or higher for mixed
+low-priority work. A reasonable starting range is `15` to `25` for a non-specialized
+deployment.
+
+You can find example values used by GitLab.com by searching for `concurrency:` in
+[the Helm charts](https://gitlab.com/gitlab-com/gl-infra/k8s-workloads/gitlab-com/-/blob/master/releases/gitlab/values/gprd.yaml.gotmpl).
+The values vary according to the work each specific deployment of Sidekiq does.
+Any other specialized deployments with processes dedicated to specific queues should
+have the concurrency tuned according to:
+
+- The CPU usage of each type of process.
+- The throughput achieved.
+
+Each thread requires a Redis connection, so adding threads may increase Redis
+latency and potentially cause client timeouts. See the
+[Sidekiq documentation about Redis](https://github.com/mperham/sidekiq/wiki/Using-Redis) for more
+details.
+
+#### When running Sidekiq cluster (default)
+
+Running Sidekiq cluster is the default in GitLab 13.0 and later.
+
+1. Edit `/etc/gitlab/gitlab.rb` and add:
+
+ ```ruby
+ sidekiq['min_concurrency'] = 15
+ sidekiq['max_concurrency'] = 25
+ ```
+
+1. Save the file and reconfigure GitLab for the changes to take effect:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+`min_concurrency` and `max_concurrency` are independent; one can be set without
+the other. Setting `min_concurrency` to `0` disables the limit.
+
+For each queue group, let `N` be one more than the number of queues. The
+concurrency is set to:
+
+1. `N`, if it's between `min_concurrency` and `max_concurrency`.
+1. `max_concurrency`, if `N` exceeds this value.
+1. `min_concurrency`, if `N` is less than this value.
+
+If `min_concurrency` is equal to `max_concurrency`, then this value is used
+regardless of the number of queues.
+
+When `min_concurrency` is greater than `max_concurrency`, it is treated as
+being equal to `max_concurrency`.
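+
+For illustration only, the clamping rules above can be modeled in a few lines
+of Ruby (a sketch of the behavior, not the actual `sidekiq-cluster`
+implementation):
+
+```ruby
+# Effective concurrency for a queue group with `queue_count` queues.
+# A limit of 0 disables that bound (documented for min_concurrency;
+# assumed here for max_concurrency as well).
+def effective_concurrency(queue_count, min, max)
+  n = queue_count + 1                  # one thread per queue, plus one spare
+  min = max if max > 0 && min > max    # min above max is treated as max
+  return max if max > 0 && min == max  # equal limits apply unconditionally
+  n = max if max > 0 && n > max        # cap at max_concurrency
+  n = min if min > 0 && n < min        # raise to min_concurrency
+  n
+end
+
+effective_concurrency(2, 15, 25)  # => 15 (N = 3 is below the minimum)
+effective_concurrency(50, 15, 25) # => 25 (N = 51 exceeds the maximum)
+effective_concurrency(2, 0, 0)    # => 3  (no limits configured)
+```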
+
+#### When running a single Sidekiq process
+
+Running a single Sidekiq process is the default in GitLab 12.10 and earlier.
+
+WARNING:
+Running Sidekiq directly was removed in GitLab
+[14.0](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/240).
+
+1. Edit `/etc/gitlab/gitlab.rb` and add:
+
+ ```ruby
+ sidekiq['cluster'] = false
+ sidekiq['concurrency'] = 25
+ ```
+
+1. Save the file and reconfigure GitLab for the changes to take effect:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+This sets the concurrency (number of threads) for the Sidekiq process.
+
+## Modify the check interval
+
+To modify `sidekiq-cluster`'s health check interval for the additional Sidekiq processes:
+
+1. Edit `/etc/gitlab/gitlab.rb` and add (the value can be any integer number of seconds):
+
+ ```ruby
+ sidekiq['interval'] = 5
+ ```
+
+1. Save the file and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
+
+## Troubleshoot using the CLI
+
+WARNING:
+It's recommended to use `/etc/gitlab/gitlab.rb` to configure the Sidekiq processes.
+If you experience a problem, you should contact GitLab support. Use the command
+line at your own risk.
+
+For debugging purposes, you can start extra Sidekiq processes by using the command
+`/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster`. This command
+takes arguments using the following syntax:
+
+```shell
+/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster [QUEUE,QUEUE,...] [QUEUE, ...]
+```
+
+Each separate argument denotes a group of queues that have to be processed by a
+Sidekiq process. Multiple queues can be processed by the same process by
+separating them with a comma instead of a space.
+
+Instead of a queue, a queue namespace can also be provided, to have the process
+automatically listen on all queues in that namespace without needing to
+explicitly list all the queue names. For more information about queue namespaces,
+see the relevant section in the
+[Sidekiq development documentation](../../development/sidekiq/index.md#queue-namespaces).
+
+For example, say you want to start 2 extra processes: one to process the
+`process_commit` queue, and one to process the `post_receive` queue. This can be
+done as follows:
+
+```shell
+/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster process_commit post_receive
+```
+
+If you instead want to start one process processing both queues, you'd use the
+following syntax:
+
+```shell
+/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster process_commit,post_receive
+```
+
+If you want to have one Sidekiq process dealing with the `process_commit` and
+`post_receive` queues, and one process to process the `gitlab_shell` queue,
+you'd use the following:
+
+```shell
+/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster process_commit,post_receive gitlab_shell
+```
+
+### Monitor the `sidekiq-cluster` command
+
+The `sidekiq-cluster` command does not terminate once it has started the desired
+number of Sidekiq processes. Instead, the process continues running and
+forwards any signals to the child processes. This allows you to stop all
+Sidekiq processes as you send a signal to the `sidekiq-cluster` process,
+instead of having to send it to the individual processes.
+
+If the `sidekiq-cluster` process crashes or receives a `SIGKILL`, the child
+processes terminate themselves after a few seconds. This ensures you don't
+end up with zombie Sidekiq processes.
+
+This allows you to monitor the processes by hooking up
+`sidekiq-cluster` to your supervisor of choice (for example, runit).
+
+If a child process dies, the `sidekiq-cluster` command signals all remaining
+processes to terminate, then terminates itself. This removes the need for
+`sidekiq-cluster` to re-implement complex process monitoring/restarting code.
+Instead, you should make sure your supervisor restarts the `sidekiq-cluster`
+process whenever necessary.
+
+### PID files
+
+The `sidekiq-cluster` command can store its PID in a file. By default no PID
+file is written, but this can be changed by passing the `--pidfile` option to
+`sidekiq-cluster`. For example:
+
+```shell
+/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster --pidfile /var/run/gitlab/sidekiq_cluster.pid process_commit
+```
+
+Keep in mind that the PID file contains the PID of the `sidekiq-cluster`
+command and not the PIDs of the started Sidekiq processes.
+
+### Environment
+
+The Rails environment can be set by passing the `--environment` flag to the
+`sidekiq-cluster` command, or by setting `RAILS_ENV` to a non-empty value. The
+default value can be found in `/opt/gitlab/etc/gitlab-rails/env/RAILS_ENV`.
diff --git a/doc/administration/sidekiq/extra_sidekiq_routing.md b/doc/administration/sidekiq/extra_sidekiq_routing.md
new file mode 100644
index 00000000000..ebbf4059290
--- /dev/null
+++ b/doc/administration/sidekiq/extra_sidekiq_routing.md
@@ -0,0 +1,180 @@
+---
+stage: Systems
+group: Distribution
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Queue routing rules **(FREE SELF)**
+
+When the number of Sidekiq jobs increases to a certain scale, the system faces
+some scalability issues. One of them is that queues tend to get
+longer. High-urgency jobs have to wait longer until other, less urgent jobs
+finish. This head-of-line blocking situation may eventually affect the
+responsiveness of the system, especially critical actions. In another scenario,
+the performance of some jobs is degraded by other long-running or CPU-intensive jobs
+(computing or rendering ones) on the same machine.
+
+To counter the aforementioned issues, one effective solution is to split
+Sidekiq jobs into different queues and assign machines handling each queue
+exclusively. For example, all CPU-intensive jobs could be routed to the
+`cpu-bound` queue and handled by a fleet of CPU optimized instances. The queue
+topology differs between companies depending on the workloads and usage
+patterns. Therefore, GitLab supports a flexible mechanism for the
+administrator to route the jobs based on their characteristics.
+
+As an alternative to [Queue selector](extra_sidekiq_processes.md#queue-selector), which
+configures Sidekiq cluster to listen to a specific set of workers or queues,
+GitLab also supports routing a job from a worker to the desired queue when it
+is scheduled. Sidekiq clients try to match a job against a configured list of
+routing rules. Rules are evaluated from first to last, and as soon as we find a
+match for a given worker we stop processing for that worker (first match wins).
+If the worker doesn't match any rule, it falls back to the queue name generated
+from the worker name.
+
+By default, if the routing rules are not configured (or denoted with an empty
+array), all the jobs are routed to the queue generated from the worker name.
+
+## Example configuration
+
+In `/etc/gitlab/gitlab.rb`:
+
+```ruby
+sidekiq['routing_rules'] = [
+ # Do not re-route workers that require their own queue
+ ['tags=needs_own_queue', nil],
+ # Route all non-CPU-bound workers that are high urgency to `high-urgency` queue
+ ['resource_boundary!=cpu&urgency=high', 'high-urgency'],
+ # Route all database, gitaly and global search workers that are throttled to `throttled` queue
+ ['feature_category=database,gitaly,global_search&urgency=throttled', 'throttled'],
+  # Route all workers having contact with the outside world to a `network-intensive` queue
+ ['has_external_dependencies=true|feature_category=hooks|tags=network', 'network-intensive'],
+ # Route all import workers to the queues generated by the worker name, for
+ # example, JiraImportWorker to `jira_import`, SVNWorker to `svn_worker`
+ ['feature_category=import', nil],
+ # Wildcard matching, route the rest to `default` queue
+ ['*', 'default']
+]
+```
+
+The routing rules list is an ordered array of tuples of query and
+corresponding queue:
+
+- The query follows the [worker matching query](#worker-matching-query) syntax.
+- The `<queue_name>` must be a valid Sidekiq queue name. If the queue name
+ is `nil`, or an empty string, the worker is routed to the queue generated
+ by the name of the worker instead.
+
+The query supports wildcard matching `*`, which matches all workers. As a
+result, the wildcard query must stay at the end of the list or the rules after it
+are ignored.
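+
+As an illustration, the first-match-wins evaluation described above behaves
+like this minimal Ruby sketch (an assumed model for clarity, not GitLab's
+actual implementation; `query_matches?` is the matcher sketched under
+[Worker matching query](#worker-matching-query) below):
+
+```ruby
+# Pick the queue for a worker: the first rule whose query matches wins.
+# A nil or empty queue name falls back to the worker-derived queue.
+def route(worker_name, attributes, routing_rules)
+  routing_rules.each do |query, queue|
+    next unless query_matches?(attributes, query)
+    return queue.to_s.empty? ? queue_from_worker_name(worker_name) : queue
+  end
+  queue_from_worker_name(worker_name) # no rule matched: use the default queue
+end
+
+# Simplified underscore conversion: 'NewIssueWorker' => 'new_issue'
+def queue_from_worker_name(name)
+  name.sub(/Worker\z/, '').gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
+end
+```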
+
+NOTE:
+Mixing queue routing rules and queue selectors requires care to
+ensure all jobs that are scheduled are picked up by appropriate Sidekiq
+workers.
+
+## Worker matching query
+
+GitLab provides a query syntax to match a worker based on its
+attributes. This query syntax is employed by both
+[Queue routing rules](#queue-routing-rules) and
+[Queue selector](extra_sidekiq_processes.md#queue-selector). A query includes two
+components:
+
+- Attributes that can be selected.
+- Operators used to construct a query.
+
+### Available attributes
+
+> [Introduced](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/261) in GitLab 13.1 (`tags`).
+
+Queue matching query works on the worker attributes, described in
+[Sidekiq style guide](../../development/sidekiq/index.md). We support querying
+based on a subset of worker attributes:
+
+- `feature_category` - the
+ [GitLab feature category](https://about.gitlab.com/direction/maturity/#category-maturity) the
+ queue belongs to. For example, the `merge` queue belongs to the
+ `source_code_management` category.
+- `has_external_dependencies` - whether or not the queue connects to external
+ services. For example, all importers have this set to `true`.
+- `urgency` - how important it is that this queue's jobs run
+ quickly. Can be `high`, `low`, or `throttled`. For example, the
+ `authorized_projects` queue is used to refresh user permissions, and
+ is `high` urgency.
+- `worker_name` - the worker name. Use this attribute to select a specific worker.
+- `name` - the queue name generated from the worker name. Use this attribute to select a specific queue. Because this is generated from
+ the worker name, it does not change based on the result of other routing
+ rules.
+- `resource_boundary` - if the queue is bound by `cpu`, `memory`, or
+ `unknown`. For example, the `ProjectExportWorker` is memory bound as it has
+ to load data in memory before saving it for export.
+- `tags` - short-lived annotations for queues. These are expected to frequently
+ change from release to release, and may be removed entirely.
+
+`has_external_dependencies` is a boolean attribute: only the exact
+string `true` is considered true, and everything else is considered
+false.
+
+`tags` is a set, which means that `=` checks for intersecting sets, and
+`!=` checks for disjoint sets. For example, `tags=a,b` selects queues
+that have tags `a`, `b`, or both. `tags!=a,b` selects queues that have
+neither of those tags.
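+
+In Ruby terms, the set semantics look roughly like this (illustrative sketch):
+
+```ruby
+worker_tags = ['a', 'c']
+(worker_tags & ['a', 'b']).any?    # tags=a,b  => true  (the sets intersect)
+(worker_tags & ['a', 'b']).empty?  # tags!=a,b => false (the sets are not disjoint)
+```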
+
+The attributes of each worker are hard-coded in the source code. For
+convenience, we generate a
+[list of all available attributes in GitLab Community Edition](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/workers/all_queues.yml)
+and a
+[list of all available attributes in GitLab Enterprise Edition](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/all_queues.yml).
+
+### Available operators
+
+`queue_selector` supports the following operators, listed from highest
+to lowest precedence:
+
+- `|` - the logical `OR` operator. For example, `query_a|query_b` (where `query_a`
+ and `query_b` are queries made up of the other operators here) includes
+ queues that match either query.
+- `&` - the logical `AND` operator. For example, `query_a&query_b` (where
+ `query_a` and `query_b` are queries made up of the other operators here) will
+ only include queues that match both queries.
+- `!=` - the `NOT IN` operator. For example, `feature_category!=issue_tracking`
+ excludes all queues from the `issue_tracking` feature category.
+- `=` - the `IN` operator. For example, `resource_boundary=cpu` includes all
+ queues that are CPU bound.
+- `,` - the concatenate set operator. For example,
+ `feature_category=continuous_integration,pages` includes all queues from
+  either the `continuous_integration` category or the `pages` category. The
+  same example is possible using the OR operator, but `,` is more concise and
+  has lower precedence.
+
+The operator precedence for this syntax is fixed: it's not possible to make `AND`
+have higher precedence than `OR`.
+
+[In GitLab 12.9](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/26594) and
+later, as with the standard queue group syntax above, a single `*` as the
+entire queue group selects all queues.
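+
+The evaluation order above can be pictured with a short Ruby sketch (a
+simplified model of the `query_matches?` helper assumed by the routing sketch
+earlier; attributes are treated as single strings, so the set-valued `tags`
+semantics are omitted):
+
+```ruby
+# Evaluate a query such as 'resource_boundary!=cpu&urgency=high' against a
+# worker's attributes: '|' splits alternatives first, then '&' joins terms,
+# then '=' / '!=' compare against the comma-separated value set.
+def query_matches?(attributes, query)
+  return true if query.strip == '*'
+  query.split('|').any? do |conjunction|
+    conjunction.split('&').all? { |term| term_matches?(attributes, term) }
+  end
+end
+
+def term_matches?(attributes, term)
+  op = term.include?('!=') ? '!=' : '='
+  key, values = term.split(op, 2)
+  included = values.split(',').include?(attributes[key].to_s)
+  op == '=' ? included : !included
+end
+
+attrs = { 'urgency' => 'high', 'resource_boundary' => 'unknown' }
+query_matches?(attrs, 'resource_boundary!=cpu&urgency=high') # => true
+```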
+
+### Migration
+
+After the Sidekiq routing rules are changed, administrators must take care
+with the migration to avoid losing jobs entirely, especially in a system with
+long queues of jobs. The migration can be done by following the migration steps
+mentioned in [Sidekiq job migration](sidekiq_job_migration.md).
+
+### Workers that cannot be migrated
+
+Some workers cannot share a queue with other workers - typically because
+they check the size of their own queue - and so must be excluded from
+this process. We recommend excluding these from any further worker
+routing by adding a rule to keep them in their own queue, for example:
+
+```ruby
+sidekiq['routing_rules'] = [
+ ['tags=needs_own_queue', nil],
+ # ...
+]
+```
+
+These queues must also be included in at least one
+[Sidekiq queue group](extra_sidekiq_processes.md#start-multiple-processes).
diff --git a/doc/administration/sidekiq/img/sidekiq_flamegraph.png b/doc/administration/sidekiq/img/sidekiq_flamegraph.png
new file mode 100644
index 00000000000..89d6e8da3ce
--- /dev/null
+++ b/doc/administration/sidekiq/img/sidekiq_flamegraph.png
Binary files differ
diff --git a/doc/administration/sidekiq/index.md b/doc/administration/sidekiq/index.md
new file mode 100644
index 00000000000..f4e67fcb6dd
--- /dev/null
+++ b/doc/administration/sidekiq/index.md
@@ -0,0 +1,403 @@
+---
+stage: Systems
+group: Distribution
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Configure an external Sidekiq instance **(FREE SELF)**
+
+You can configure an external Sidekiq instance by using the Sidekiq that's bundled in the GitLab package. Sidekiq requires a connection to the Redis,
+PostgreSQL, and Gitaly instances.
+
+## Configure TCP access for PostgreSQL, Gitaly, and Redis
+
+By default, GitLab uses UNIX sockets and is not set up to communicate via TCP. To change this:
+
+1. Edit the `/etc/gitlab/gitlab.rb` file on your GitLab instance, and add the following:
+
+ ```ruby
+
+ ## PostgreSQL
+
+ # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
+ postgresql['sql_user_password'] = 'POSTGRESQL_PASSWORD_HASH'
+ postgresql['listen_address'] = '0.0.0.0'
+ postgresql['port'] = 5432
+
+ # Add the Sidekiq nodes to PostgreSQL's trusted addresses.
+ # In the following example, 10.10.1.30/32 is the private IP
+ # of the Sidekiq server.
+ postgresql['md5_auth_cidr_addresses'] = %w(127.0.0.1/32 10.10.1.30/32)
+ postgresql['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.10.1.30/32)
+
+ ## Gitaly
+
+ # Make Gitaly accept connections on all network interfaces
+ gitaly['listen_addr'] = "0.0.0.0:8075"
+ ## Set up the Gitaly token as a form of authentication since you are accessing Gitaly over the network
+ ## https://docs.gitlab.com/ee/administration/gitaly/configure_gitaly.html#about-the-gitaly-token
+ gitaly['auth_token'] = 'abc123secret'
+ praefect['auth_token'] = 'abc123secret'
+ gitlab_rails['gitaly_token'] = 'abc123secret'
+
+ ## Redis configuration
+
+ redis['bind'] = '0.0.0.0'
+ redis['port'] = 6379
+ # Password to Authenticate Redis
+ redis['password'] = 'redis-password-goes-here'
+ gitlab_rails['redis_password'] = 'redis-password-goes-here'
+
+ gitlab_rails['auto_migrate'] = false
+ ```
+
+1. Run `reconfigure`:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+1. Restart the `PostgreSQL` server:
+
+ ```shell
+ sudo gitlab-ctl restart postgresql
+ ```
+
+1. After the restart, set `auto_migrate` to `true` or comment it out to use the default setting:
+
+ ```ruby
+ gitlab_rails['auto_migrate'] = true
+ ```
+
+1. Run `reconfigure` again:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+## Set up Sidekiq instance
+
+1. SSH into the Sidekiq server.
+
+1. Confirm that you can access the PostgreSQL, Gitaly, and Redis ports:
+
+ ```shell
+ telnet <GitLab host> 5432 # PostgreSQL
+ telnet <GitLab host> 8075 # Gitaly
+ telnet <GitLab host> 6379 # Redis
+ ```
+
+1. [Download and install](https://about.gitlab.com/install/) the Omnibus GitLab package
+ using steps 1 and 2. **Do not complete any other steps.**
+
+1. Copy the `/etc/gitlab/gitlab.rb` file from the GitLab instance and add the following settings. Make sure
+ to replace them with your values:
+
+<!--
+Updates to example must be made at:
+- https://gitlab.com/gitlab-org/gitlab/blob/master/doc/administration/sidekiq.md
+- all reference architecture pages
+-->
+
+ ```ruby
+ ########################################
+ ##### Services Disabled ###
+ ########################################
+ #
+ # When running GitLab on just one server, you have a single `gitlab.rb`
+ # to enable all services you want to run.
+ # When running GitLab on N servers, you have N `gitlab.rb` files.
+ # Enable only the services you want to run on each
+ # specific server, while disabling all others.
+ #
+ gitaly['enable'] = false
+ postgresql['enable'] = false
+ redis['enable'] = false
+ nginx['enable'] = false
+ puma['enable'] = false
+ gitlab_workhorse['enable'] = false
+ prometheus['enable'] = false
+ alertmanager['enable'] = false
+ grafana['enable'] = false
+ gitlab_exporter['enable'] = false
+ gitlab_kas['enable'] = false
+
+ ##
+ ## To maintain uniformity of links across nodes, the
+ ## `external_url` on the Sidekiq server should point to the external URL that users
+ ## use to access GitLab. This can be either:
+ ##
+ ## - The `external_url` set on your application server.
+   ## - The URL of an external load balancer, which routes traffic to the GitLab application server.
+ ##
+ external_url 'https://gitlab.example.com'
+
+ # Configure the gitlab-shell API callback URL. Without this, `git push` will
+ # fail. This can be your 'front door' GitLab URL or an internal load
+ # balancer.
+ gitlab_rails['internal_api_url'] = 'GITLAB_URL'
+ gitlab_shell['secret_token'] = 'SHELL_TOKEN'
+
+ ########################################
+ #### Redis ###
+ ########################################
+
+ ## Must be the same in every sentinel node.
+   redis['master_name'] = 'gitlab-redis' # Required if you have set up a Redis cluster
+ ## The same password for Redis authentication you set up for the master node.
+ redis['master_password'] = '<redis_master_password>'
+
+   ### If Redis is running on the main GitLab instance and you have opened the TCP port as above, add the following
+ gitlab_rails['redis_host'] = '<gitlab_host>'
+ gitlab_rails['redis_port'] = 6379
+
+ #######################################
+ ### Gitaly ###
+ #######################################
+
+ ## Replace <gitaly_token> with the one you set up, see
+ ## https://docs.gitlab.com/ee/administration/gitaly/configure_gitaly.html#about-the-gitaly-token
+ git_data_dirs({
+ "default" => {
+ "gitaly_address" => "tcp://<gitlab_host>:8075",
+ "gitaly_token" => "<gitaly_token>"
+ }
+ })
+
+ #######################################
+ ### Postgres ###
+ #######################################
+
+ # Replace <database_host> and <database_password>
+ gitlab_rails['db_host'] = '<database_host>'
+ gitlab_rails['db_port'] = '5432'
+ gitlab_rails['db_password'] = '<database_password>'
+ ## Prevent database migrations from running on upgrade automatically
+ gitlab_rails['auto_migrate'] = false
+
+ #######################################
+ ### Sidekiq configuration ###
+ #######################################
+ sidekiq['enable'] = true
+ sidekiq['listen_address'] = "0.0.0.0"
+
+ ## Set number of Sidekiq queue processes to the same number as available CPUs
+ sidekiq['queue_groups'] = ['*'] * 4
+
+   ## Set number of Sidekiq threads per queue process to the recommended number of 10
+ sidekiq['max_concurrency'] = 10
+ ```
+
+1. Copy the `/etc/gitlab/gitlab-secrets.json` file from the GitLab instance and replace the file in the Sidekiq instance.
+
+1. Reconfigure GitLab:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+1. Restart the Sidekiq instance after completing the process and finishing the database migrations.
+
+## Configure multiple Sidekiq nodes with shared storage
+
+If you run multiple Sidekiq nodes with a shared file storage, such as NFS, you must
+specify the UIDs and GIDs to ensure they match between servers. Specifying the UIDs
+and GIDs prevents permissions issues in the file system. This advice is similar to the
+[advice for Geo setups](../geo/replication/multiple_servers.md#step-4-configure-the-frontend-application-nodes-on-the-geo-secondary-site).
+
+To set up multiple Sidekiq nodes:
+
+1. Edit `/etc/gitlab/gitlab.rb`:
+
+ ```ruby
+ user['uid'] = 9000
+ user['gid'] = 9000
+ web_server['uid'] = 9001
+ web_server['gid'] = 9001
+ registry['uid'] = 9002
+ registry['gid'] = 9002
+ ```
+
+1. Reconfigure GitLab:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+## Configure the Container Registry when using an external Sidekiq
+
+If you're using the Container Registry and it's running on a different
+node than Sidekiq, follow the steps below.
+
+1. Edit `/etc/gitlab/gitlab.rb`, and configure the registry URL:
+
+ ```ruby
+ registry_external_url 'https://registry.example.com'
+ gitlab_rails['registry_api_url'] = "https://registry.example.com"
+ ```
+
+1. Reconfigure GitLab:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+1. On the instance where the Container Registry is hosted, copy the `registry.key`
+ file to the Sidekiq node.
+
+## Configure the Sidekiq metrics server
+
+If you want to collect Sidekiq metrics, enable the Sidekiq metrics server.
+To make metrics available from `localhost:8082/metrics`:
+
+1. Edit `/etc/gitlab/gitlab.rb`:
+
+ ```ruby
+ sidekiq['metrics_enabled'] = true
+ sidekiq['listen_address'] = "localhost"
+ sidekiq['listen_port'] = "8082"
+
+ # Optionally log all the metrics server logs to log/sidekiq_exporter.log
+ sidekiq['exporter_log_enabled'] = true
+ ```
+
+1. Reconfigure GitLab:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+### Enable HTTPS
+
+> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/364771) in GitLab 15.2.
+
+To serve metrics via HTTPS instead of HTTP, enable TLS in the exporter settings:
+
+1. Edit `/etc/gitlab/gitlab.rb` to add (or find and uncomment) the following lines:
+
+ ```ruby
+ sidekiq['exporter_tls_enabled'] = true
+ sidekiq['exporter_tls_cert_path'] = "/path/to/certificate.pem"
+ sidekiq['exporter_tls_key_path'] = "/path/to/private-key.pem"
+ ```
+
+1. Save the file and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure)
+ for the changes to take effect.
+
+When TLS is enabled, the same `port` and `address` are used as described above.
+The metrics server cannot serve both HTTP and HTTPS at the same time.
+
+## Configure health checks
+
+If you use health check probes to observe Sidekiq, enable the Sidekiq health check server.
+To make health checks available from `localhost:8092`:
+
+1. Edit `/etc/gitlab/gitlab.rb`:
+
+ ```ruby
+ sidekiq['health_checks_enabled'] = true
+ sidekiq['health_checks_listen_address'] = "localhost"
+ sidekiq['health_checks_listen_port'] = "8092"
+ ```
+
+1. Reconfigure GitLab:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+For more information about health checks, see the [Sidekiq health check page](sidekiq_health_check.md).
+
+## Configure LDAP and user or group synchronization
+
+If you use LDAP for user and group management, you must add the LDAP configuration to your Sidekiq node as well as the LDAP
+synchronization worker. If the LDAP configuration and LDAP synchronization worker are not applied to your Sidekiq node,
+users and groups are not automatically synchronized.
+
+For more information about configuring LDAP for GitLab, see:
+
+- [GitLab LDAP configuration documentation](../auth/ldap/index.md#configure-ldap)
+- [LDAP synchronization documentation](../auth/ldap/ldap_synchronization.md#adjust-ldap-user-sync-schedule)
+
+To enable LDAP with the synchronization worker for Sidekiq:
+
+1. Edit `/etc/gitlab/gitlab.rb`:
+
+ ```ruby
+ gitlab_rails['ldap_enabled'] = true
+ gitlab_rails['prevent_ldap_sign_in'] = false
+ gitlab_rails['ldap_servers'] = {
+ 'main' => {
+ 'label' => 'LDAP',
+ 'host' => 'ldap.mydomain.com',
+ 'port' => 389,
+ 'uid' => 'sAMAccountName',
+ 'encryption' => 'simple_tls',
+ 'verify_certificates' => true,
+ 'bind_dn' => '_the_full_dn_of_the_user_you_will_bind_with',
+ 'password' => '_the_password_of_the_bind_user',
+ 'tls_options' => {
+ 'ca_file' => '',
+ 'ssl_version' => '',
+ 'ciphers' => '',
+ 'cert' => '',
+ 'key' => ''
+ },
+ 'timeout' => 10,
+ 'active_directory' => true,
+ 'allow_username_or_email_login' => false,
+ 'block_auto_created_users' => false,
+ 'base' => 'dc=example,dc=com',
+ 'user_filter' => '',
+ 'attributes' => {
+ 'username' => ['uid', 'userid', 'sAMAccountName'],
+ 'email' => ['mail', 'email', 'userPrincipalName'],
+ 'name' => 'cn',
+ 'first_name' => 'givenName',
+ 'last_name' => 'sn'
+ },
+ 'lowercase_usernames' => false,
+
+ # Enterprise Edition only
+ # https://docs.gitlab.com/ee/administration/auth/ldap/ldap_synchronization.html
+ 'group_base' => '',
+ 'admin_group' => '',
+ 'external_groups' => [],
+ 'sync_ssh_keys' => false
+ }
+ }
+ gitlab_rails['ldap_sync_worker_cron'] = "0 */12 * * *"
+ ```
+
+1. Reconfigure GitLab:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+## Disable Rugged
+
+Calls into Rugged, Ruby bindings for `libgit2`, [lock the Sidekiq process's GVL](https://silverhammermba.github.io/emberb/c/#c-in-ruby-threads),
+blocking all jobs on that worker from proceeding. If Rugged calls performed by Sidekiq are slow, this can cause significant delays in
+background task processing.
+
+By default, Rugged is used when Git repository data is stored on local storage or on an NFS mount.
+[Using Rugged is recommended when using NFS](../nfs.md#improving-nfs-performance-with-gitlab), but if
+you are using local storage, disabling Rugged can improve Sidekiq performance:
+
+```shell
+sudo gitlab-rake gitlab:features:disable_rugged
+```
+
+## Related topics
+
+- [Extra Sidekiq processes](extra_sidekiq_processes.md)
+- [Extra Sidekiq routing](extra_sidekiq_routing.md)
+- [Sidekiq health checks](sidekiq_health_check.md)
+- [Using the GitLab-Sidekiq chart](https://docs.gitlab.com/charts/charts/gitlab/sidekiq/)
+
+## Troubleshooting
+
+See our [administrator guide to troubleshooting Sidekiq](sidekiq_troubleshooting.md).
diff --git a/doc/administration/sidekiq/sidekiq_health_check.md b/doc/administration/sidekiq/sidekiq_health_check.md
new file mode 100644
index 00000000000..3477320a2ac
--- /dev/null
+++ b/doc/administration/sidekiq/sidekiq_health_check.md
@@ -0,0 +1,58 @@
+---
+stage: Systems
+group: Distribution
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Sidekiq Health Check **(FREE SELF)**
+
+GitLab provides liveness and readiness probes to indicate service health and
+reachability to the Sidekiq cluster. These endpoints
+[can be provided to schedulers like Kubernetes](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/)
+to hold traffic until the system is ready or restart the container as needed.
+
+The health check server can be set up when [configuring Sidekiq](index.md).
+
+## Readiness
+
+The readiness probe checks whether the Sidekiq workers are ready to process jobs.
+
+```plaintext
+GET /readiness
+```
+
+If the server is bound to `localhost:8092`, the process cluster can be probed for readiness as follows:
+
+```shell
+curl "http://localhost:8092/readiness"
+```
+
+On success, the endpoint returns a `200` HTTP status code, and a response like the following:
+
+```json
+{
+ "status": "ok"
+}
+```
+
+## Liveness
+
+The liveness probe checks whether the Sidekiq cluster is running.
+
+```plaintext
+GET /liveness
+```
+
+If the server is bound to `localhost:8092`, the process cluster can be probed for liveness as follows:
+
+```shell
+curl "http://localhost:8092/liveness"
+```
+
+On success, the endpoint returns a `200` HTTP status code, and a response like the following:
+
+```json
+{
+ "status": "ok"
+}
+```
diff --git a/doc/administration/sidekiq/sidekiq_job_migration.md b/doc/administration/sidekiq/sidekiq_job_migration.md
new file mode 100644
index 00000000000..a3bc8b2959a
--- /dev/null
+++ b/doc/administration/sidekiq/sidekiq_job_migration.md
@@ -0,0 +1,40 @@
+---
+stage: none
+group: unassigned
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Sidekiq job migration **(FREE SELF)**
+
+WARNING:
+This operation should be very uncommon. We do not recommend it for the vast majority of GitLab instances.
+
+Sidekiq routing rules allow administrators to re-route certain background jobs from their regular queue to an alternative queue. By default, GitLab uses one queue per background job type. GitLab has over 400 background job types, and so correspondingly it has over 400 queues.
+
+Most administrators do not need to change this setting. In some cases with particularly large background job processing workloads, Redis performance may suffer due to the number of queues that GitLab listens to.
+
+If the Sidekiq routing rules are changed, administrators need to take care with the migration to avoid losing jobs entirely. The basic migration steps are:
+
+1. Listen to both the old and new queues.
+1. Update the routing rules.
+1. Wait until there are no publishers dispatching jobs to the old queues.
+1. Run the [Rake tasks for future jobs](#future-jobs).
+1. Wait for the old queues to be empty.
+1. Stop listening to the old queues.
+
+## Future jobs
+
+Step 4 involves rewriting some Sidekiq job data for jobs that are already stored in Redis, but due to run in the future. There are two sets of future jobs: scheduled jobs and jobs to be retried. We provide a separate Rake task to migrate each set:
+
+- `gitlab:sidekiq:migrate_jobs:retry` for jobs to be retried.
+- `gitlab:sidekiq:migrate_jobs:schedule` for scheduled jobs.
+
+Most of the time, running both at the same time is the correct choice. There are two separate tasks to allow for more fine-grained control where needed. To run both at once:
+
+```shell
+# omnibus-gitlab
+sudo gitlab-rake gitlab:sidekiq:migrate_jobs:retry gitlab:sidekiq:migrate_jobs:schedule
+
+# source installations
+bundle exec rake gitlab:sidekiq:migrate_jobs:retry gitlab:sidekiq:migrate_jobs:schedule RAILS_ENV=production
+```
diff --git a/doc/administration/sidekiq/sidekiq_memory_killer.md b/doc/administration/sidekiq/sidekiq_memory_killer.md
new file mode 100644
index 00000000000..12381808523
--- /dev/null
+++ b/doc/administration/sidekiq/sidekiq_memory_killer.md
@@ -0,0 +1,82 @@
+---
+stage: Data Stores
+group: Memory
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Sidekiq MemoryKiller **(FREE SELF)**
+
+The GitLab Rails application code suffers from memory leaks. For web requests
+this problem is made manageable using
+[`puma-worker-killer`](https://github.com/schneems/puma_worker_killer) which
+restarts Puma worker processes if they exceed a memory limit. The Sidekiq
+MemoryKiller applies the same approach to the Sidekiq processes used by GitLab
+to process background jobs.
+
+Unlike puma-worker-killer, which is enabled by default for all
+installations of GitLab 13.0 and later, the Sidekiq MemoryKiller is enabled by default
+_only_ for Omnibus packages. The reason for this is that the MemoryKiller
+relies on runit to restart Sidekiq after a memory-induced shutdown and GitLab
+installations from source do not all use runit or an equivalent.
+
+With the default settings, the MemoryKiller causes a Sidekiq restart no
+more often than once every 15 minutes, with the restart causing about one
+minute of delay for incoming background jobs.
+
+Some background jobs rely on long-running external processes. To ensure these
+are cleanly terminated when Sidekiq is restarted, each Sidekiq process should be
+run as a process group leader (for example, using `chpst -P`). If using Omnibus or the
+`bin/background_jobs` script with `runit` installed, this is handled for you.
+
+## Configuring the MemoryKiller
+
+The MemoryKiller is controlled using environment variables.
+
+- `SIDEKIQ_DAEMON_MEMORY_KILLER`: defaults to 1. When set to 0, the MemoryKiller
+ works in _legacy_ mode. Otherwise, the MemoryKiller works in _daemon_ mode.
+
+ In _legacy_ mode, the MemoryKiller checks the Sidekiq process RSS
+ ([Resident Set Size](https://github.com/mperham/sidekiq/wiki/Memory#rss))
+ after each job.
+
+ In _daemon_ mode, the MemoryKiller checks the Sidekiq process RSS every 3 seconds
+ (defined by `SIDEKIQ_MEMORY_KILLER_CHECK_INTERVAL`).
+
+- `SIDEKIQ_MEMORY_KILLER_MAX_RSS` (KB): if this variable is set, and its value is greater
+ than 0, the MemoryKiller is enabled. Otherwise the MemoryKiller is disabled.
+
+ `SIDEKIQ_MEMORY_KILLER_MAX_RSS` defines the Sidekiq process allowed RSS.
+
+ In _legacy_ mode, if the Sidekiq process exceeds the allowed RSS then an irreversible
+ delayed graceful restart is triggered. The restart of Sidekiq happens
+ after `SIDEKIQ_MEMORY_KILLER_GRACE_TIME` seconds.
+
+ In _daemon_ mode, if the Sidekiq process exceeds the allowed RSS for longer than
+ `SIDEKIQ_MEMORY_KILLER_GRACE_TIME` the graceful restart is triggered. If the
+  Sidekiq process goes below the allowed RSS within `SIDEKIQ_MEMORY_KILLER_GRACE_TIME`,
+ the restart is aborted.
+
+ The default value for Omnibus packages is set
+ [in the Omnibus GitLab repository](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/files/gitlab-cookbooks/gitlab/attributes/default.rb).
+
+- `SIDEKIQ_MEMORY_KILLER_HARD_LIMIT_RSS` (KB): is used by _daemon_ mode. If the Sidekiq
+ process RSS (expressed in kilobytes) exceeds `SIDEKIQ_MEMORY_KILLER_HARD_LIMIT_RSS`,
+ an immediate graceful restart of Sidekiq is triggered.
+
+- `SIDEKIQ_MEMORY_KILLER_CHECK_INTERVAL`: used in _daemon_ mode to define how
+  often to check the process RSS; defaults to 3 seconds.
+
+- `SIDEKIQ_MEMORY_KILLER_GRACE_TIME`: defaults to 900 seconds (15 minutes).
+ The usage of this variable is described as part of `SIDEKIQ_MEMORY_KILLER_MAX_RSS`.
+
+- `SIDEKIQ_MEMORY_KILLER_SHUTDOWN_WAIT`: defaults to 30 seconds. This defines the
+ maximum time allowed for all Sidekiq jobs to finish. No new jobs are accepted
+ during that time, and the process exits as soon as all jobs finish.
+
+ If jobs do not finish during that time, the MemoryKiller interrupts all currently
+ running jobs by sending `SIGTERM` to the Sidekiq process.
+
+  If the hard shutdown/restart is not performed by Sidekiq itself,
+ the Sidekiq process is forcefully terminated after
+ `Sidekiq.options[:timeout] + 2` seconds. An external supervision mechanism
+ (for example, runit) must restart Sidekiq afterwards.
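+
+For example, on Omnibus installations these variables can be set through
+`gitlab_rails['env']` in `/etc/gitlab/gitlab.rb`, followed by
+`sudo gitlab-ctl reconfigure`. The values below are illustrative placeholders,
+not recommendations:
+
+```ruby
+gitlab_rails['env'] = {
+  # Restart when RSS stays above ~2 GB for the grace period (daemon mode).
+  "SIDEKIQ_MEMORY_KILLER_MAX_RSS" => "2000000",
+  # Restart immediately when RSS exceeds ~3 GB.
+  "SIDEKIQ_MEMORY_KILLER_HARD_LIMIT_RSS" => "3000000",
+  # Check the RSS every 3 seconds (the default).
+  "SIDEKIQ_MEMORY_KILLER_CHECK_INTERVAL" => "3"
+}
+```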
diff --git a/doc/administration/sidekiq/sidekiq_troubleshooting.md b/doc/administration/sidekiq/sidekiq_troubleshooting.md
new file mode 100644
index 00000000000..9bc8e473409
--- /dev/null
+++ b/doc/administration/sidekiq/sidekiq_troubleshooting.md
@@ -0,0 +1,381 @@
+---
+stage: Systems
+group: Distribution
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Troubleshooting Sidekiq **(FREE SELF)**
+
+Sidekiq is the background job processor GitLab uses to asynchronously run
+tasks. When things go wrong it can be difficult to troubleshoot. These
+situations also tend to be high-pressure because a production system job queue
+may be filling up. Users will notice when this happens because new branches
+may not show up and merge requests may not be updated. The following are some
+troubleshooting steps to help you diagnose the bottleneck.
+
+GitLab administrators/users should consider working through these
+debug steps with GitLab Support so the backtraces can be analyzed by our team.
+It may reveal a bug or necessary improvement in GitLab.
+
+In any of the backtraces, be wary of cases where every
+thread appears to be waiting in the database, Redis, or waiting to acquire
+a mutex. This **may** mean there's contention in the database, for example,
+but look for one thread that is different from the rest. This other thread
+may be using all available CPU, or holding the Ruby Global Interpreter Lock,
+preventing other threads from continuing.
+
+## Log arguments to Sidekiq jobs
+
+[In GitLab 13.6 and later](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/44853)
+some arguments passed to Sidekiq jobs are logged by default.
+To avoid logging sensitive information (for instance, password reset tokens),
+GitLab logs numeric arguments for all workers, with overrides for some specific
+workers where their arguments are not sensitive.
+
+Example log output:
+
+```json
+{"severity":"INFO","time":"2020-06-08T14:37:37.892Z","class":"AdminEmailsWorker","args":["[FILTERED]","[FILTERED]","[FILTERED]"],"retry":3,"queue":"admin_emails","backtrace":true,"jid":"9e35e2674ac7b12d123e13cc","created_at":"2020-06-08T14:37:37.373Z","meta.user":"root","meta.caller_id":"Admin::EmailsController#create","correlation_id":"37D3lArJmT1","uber-trace-id":"2d942cc98cc1b561:6dc94409cfdd4d77:9fbe19bdee865293:1","enqueued_at":"2020-06-08T14:37:37.410Z","pid":65011,"message":"AdminEmailsWorker JID-9e35e2674ac7b12d123e13cc: done: 0.48085 sec","job_status":"done","scheduling_latency_s":0.001012,"redis_calls":9,"redis_duration_s":0.004608,"redis_read_bytes":696,"redis_write_bytes":6141,"duration_s":0.48085,"cpu_s":0.308849,"completed_at":"2020-06-08T14:37:37.892Z","db_duration_s":0.010742}
+{"severity":"INFO","time":"2020-06-08T14:37:37.894Z","class":"ActiveJob::QueueAdapters::SidekiqAdapter::JobWrapper","wrapped":"ActionMailer::MailDeliveryJob","queue":"mailers","args":["[FILTERED]"],"retry":3,"backtrace":true,"jid":"e47a4f6793d475378432e3c8","created_at":"2020-06-08T14:37:37.884Z","meta.user":"root","meta.caller_id":"AdminEmailsWorker","correlation_id":"37D3lArJmT1","uber-trace-id":"2d942cc98cc1b561:29344de0f966446d:5c3b0e0e1bef987b:1","enqueued_at":"2020-06-08T14:37:37.885Z","pid":65011,"message":"ActiveJob::QueueAdapters::SidekiqAdapter::JobWrapper JID-e47a4f6793d475378432e3c8: start","job_status":"start","scheduling_latency_s":0.009473}
+{"severity":"INFO","time":"2020-06-08T14:39:50.648Z","class":"NewIssueWorker","args":["455","1"],"retry":3,"queue":"new_issue","backtrace":true,"jid":"a24af71f96fd129ec47f5d1e","created_at":"2020-06-08T14:39:50.643Z","meta.user":"root","meta.project":"h5bp/html5-boilerplate","meta.root_namespace":"h5bp","meta.caller_id":"Projects::IssuesController#create","correlation_id":"f9UCZHqhuP7","uber-trace-id":"28f65730f99f55a3:a5d2b62dec38dffc:48ddd092707fa1b7:1","enqueued_at":"2020-06-08T14:39:50.646Z","pid":65011,"message":"NewIssueWorker JID-a24af71f96fd129ec47f5d1e: start","job_status":"start","scheduling_latency_s":0.001144}
+```
+
+When using [Sidekiq JSON logging](../logs/index.md#sidekiqlog),
+arguments logs are limited to a maximum size of 10 kilobytes of text;
+any arguments after this limit are discarded and replaced with a
+single argument containing the string `"..."`.
+
+You can set `SIDEKIQ_LOG_ARGUMENTS` [environment variable](https://docs.gitlab.com/omnibus/settings/environment-variables.html)
+to `0` (false) to disable argument logging.
+
+Example:
+
+```ruby
+gitlab_rails['env'] = {"SIDEKIQ_LOG_ARGUMENTS" => "0"}
+```
+
+In GitLab 13.5 and earlier, set `SIDEKIQ_LOG_ARGUMENTS` to `1` to start logging arguments passed to Sidekiq.
+
+## Thread dump
+
+Send the Sidekiq process ID the `TTIN` signal to output thread
+backtraces in the log file.
+
+```shell
+kill -TTIN <sidekiq_pid>
+```
+
+Check in `/var/log/gitlab/sidekiq/current` or `$GITLAB_HOME/log/sidekiq.log` for
+the backtrace output. The backtraces are lengthy and generally start with
+several `WARN` level messages. Here's an example of a single thread's backtrace:
+
+```plaintext
+2016-04-13T06:21:20.022Z 31517 TID-orn4urby0 WARN: ActiveRecord::RecordNotFound: Couldn't find Note with 'id'=3375386
+2016-04-13T06:21:20.022Z 31517 TID-orn4urby0 WARN: /opt/gitlab/embedded/service/gem/ruby/2.1.0/gems/activerecord-4.2.5.2/lib/active_record/core.rb:155:in `find'
+/opt/gitlab/embedded/service/gitlab-rails/app/workers/new_note_worker.rb:7:in `perform'
+/opt/gitlab/embedded/service/gem/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/processor.rb:150:in `execute_job'
+/opt/gitlab/embedded/service/gem/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/processor.rb:132:in `block (2 levels) in process'
+/opt/gitlab/embedded/service/gem/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/middleware/chain.rb:127:in `block in invoke'
+/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/sidekiq_middleware/memory_killer.rb:17:in `call'
+/opt/gitlab/embedded/service/gem/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
+/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/sidekiq_middleware/arguments_logger.rb:6:in `call'
+...
+```
+
+In some cases Sidekiq may be hung and unable to respond to the `TTIN` signal.
+Move on to other troubleshooting methods if this happens.
+
+## Ruby profiling with `rbspy`
+
+[rbspy](https://rbspy.github.io) is an easy-to-use, low-overhead Ruby profiler that can be used to create
+flamegraph-style diagrams of CPU usage by Ruby processes.
+
+No changes to GitLab are required to use it and it has no dependencies. To install it:
+
+1. Download the binary from the [`rbspy` releases page](https://github.com/rbspy/rbspy/releases).
+1. Make the binary executable.
+
+To profile a Sidekiq worker for one minute, run:
+
+```shell
+sudo ./rbspy record --pid <sidekiq_pid> --duration 60 --file /tmp/sidekiq_profile.svg
+```
+
+![Example rbspy flamegraph](img/sidekiq_flamegraph.png)
+
+In this example of a flamegraph generated by `rbspy`, almost all of the Sidekiq process's time is spent in `rev_parse`, a native C
+function in Rugged. In the stack, we can see `rev_parse` is being called by the `ExpirePipelineCacheWorker`.
+
+## Process profiling with `perf`
+
+Linux has a process profiling tool called `perf` that is helpful when a certain
+process is eating up a lot of CPU. If you see high CPU usage and Sidekiq isn't
+responding to the `TTIN` signal, this is a good next step.
+
+If `perf` is not installed on your system, install it with `apt-get` or `yum`:
+
+```shell
+# Debian
+sudo apt-get install linux-tools
+
+# Ubuntu (may require these additional Kernel packages)
+sudo apt-get install linux-tools-common linux-tools-generic linux-tools-`uname -r`
+
+# Red Hat/CentOS
+sudo yum install perf
+```
+
+Run `perf` against the Sidekiq PID:
+
+```shell
+sudo perf record -p <sidekiq_pid>
+```
+
+Let this run for 30-60 seconds and then press Ctrl-C. Then view the `perf` report:
+
+```shell
+$ sudo perf report
+
+# Sample output
+Samples: 348K of event 'cycles', Event count (approx.): 280908431073
+ 97.69% ruby nokogiri.so [.] xmlXPathNodeSetMergeAndClear
+ 0.18% ruby libruby.so.2.1.0 [.] objspace_malloc_increase
+ 0.12% ruby libc-2.12.so [.] _int_malloc
+ 0.10% ruby libc-2.12.so [.] _int_free
+```
+
+Above you see sample output from a `perf` report. It shows that 97% of the CPU is
+being spent inside Nokogiri and `xmlXPathNodeSetMergeAndClear`. For something
+this obvious you should then go investigate what job in GitLab would use
+Nokogiri and XPath. Combine with `TTIN` or `gdb` output to show the
+corresponding Ruby code where this is happening.
+
+## The GNU Project Debugger (`gdb`)
+
+`gdb` can be another effective tool for debugging Sidekiq. It gives you a
+more interactive way to look at each thread and see what's causing problems.
+
+Attaching to a process with `gdb` suspends the normal operation
+of the process (Sidekiq does not process jobs while `gdb` is attached).
+
+Start by attaching to the Sidekiq PID:
+
+```shell
+gdb -p <sidekiq_pid>
+```
+
+Then gather information on all the threads:
+
+```plaintext
+info threads
+
+# Example output
+30 Thread 0x7fe5fbd63700 (LWP 26060) 0x0000003f7cadf113 in poll () from /lib64/libc.so.6
+29 Thread 0x7fe5f2b3b700 (LWP 26533) 0x0000003f7ce0b68c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
+28 Thread 0x7fe5f2a3a700 (LWP 26534) 0x0000003f7ce0ba5e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
+27 Thread 0x7fe5f2939700 (LWP 26535) 0x0000003f7ce0b68c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
+26 Thread 0x7fe5f2838700 (LWP 26537) 0x0000003f7ce0b68c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
+25 Thread 0x7fe5f2737700 (LWP 26538) 0x0000003f7ce0b68c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
+24 Thread 0x7fe5f2535700 (LWP 26540) 0x0000003f7ce0b68c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
+23 Thread 0x7fe5f2434700 (LWP 26541) 0x0000003f7ce0b68c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
+22 Thread 0x7fe5f2232700 (LWP 26543) 0x0000003f7ce0b68c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
+21 Thread 0x7fe5f2131700 (LWP 26544) 0x00007fe5f7b570f0 in xmlXPathNodeSetMergeAndClear ()
+from /opt/gitlab/embedded/service/gem/ruby/2.1.0/gems/nokogiri-1.6.7.2/lib/nokogiri/nokogiri.so
+...
+```
+
+If you see a suspicious thread, like the Nokogiri one above, you may want
+to get more information:
+
+```plaintext
+thread 21
+bt
+
+# Example output
+#0 0x00007ff0d6afe111 in xmlXPathNodeSetMergeAndClear () from /opt/gitlab/embedded/service/gem/ruby/2.1.0/gems/nokogiri-1.6.7.2/lib/nokogiri/nokogiri.so
+#1 0x00007ff0d6b0b836 in xmlXPathNodeCollectAndTest () from /opt/gitlab/embedded/service/gem/ruby/2.1.0/gems/nokogiri-1.6.7.2/lib/nokogiri/nokogiri.so
+#2 0x00007ff0d6b09037 in xmlXPathCompOpEval () from /opt/gitlab/embedded/service/gem/ruby/2.1.0/gems/nokogiri-1.6.7.2/lib/nokogiri/nokogiri.so
+#3 0x00007ff0d6b09017 in xmlXPathCompOpEval () from /opt/gitlab/embedded/service/gem/ruby/2.1.0/gems/nokogiri-1.6.7.2/lib/nokogiri/nokogiri.so
+#4 0x00007ff0d6b092e0 in xmlXPathCompOpEval () from /opt/gitlab/embedded/service/gem/ruby/2.1.0/gems/nokogiri-1.6.7.2/lib/nokogiri/nokogiri.so
+#5 0x00007ff0d6b0bc37 in xmlXPathRunEval () from /opt/gitlab/embedded/service/gem/ruby/2.1.0/gems/nokogiri-1.6.7.2/lib/nokogiri/nokogiri.so
+#6 0x00007ff0d6b0be5f in xmlXPathEvalExpression () from /opt/gitlab/embedded/service/gem/ruby/2.1.0/gems/nokogiri-1.6.7.2/lib/nokogiri/nokogiri.so
+#7 0x00007ff0d6a97dc3 in evaluate (argc=2, argv=0x1022d058, self=<value optimized out>) at xml_xpath_context.c:221
+#8 0x00007ff0daeab0ea in vm_call_cfunc_with_frame (th=0x1022a4f0, reg_cfp=0x1032b810, ci=<value optimized out>) at vm_insnhelper.c:1510
+```
+
+To output a backtrace from all threads at once:
+
+```plaintext
+set pagination off
+thread apply all bt
+```
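+
+To save the output to a file for later analysis or sharing, you can enable `gdb` logging first (a sketch; the exact `set logging` syntax varies slightly between `gdb` versions, and the file path here is just an example):
+
+```plaintext
+set logging file /tmp/sidekiq_backtraces.txt
+set logging on
+thread apply all bt
+set logging off
+```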
+
+Once you're done debugging with `gdb`, be sure to detach from the process and
+exit:
+
+```plaintext
+detach
+exit
+```
+
+## Sidekiq kill signals
+
+TTIN was described above as the signal to print backtraces for logging. However,
+Sidekiq responds to other signals as well. For example, TSTP and TERM can be used
+to gracefully shut Sidekiq down. See
+[the Sidekiq Signals docs](https://github.com/mperham/sidekiq/wiki/Signals#ttin).
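+
+For example, assuming `<sidekiq_pid>` is the PID of the Sidekiq process, you can send these signals with `kill`:
+
+```shell
+# Print backtraces of all threads to the Sidekiq log
+kill -TTIN <sidekiq_pid>
+
+# Quiet Sidekiq: finish jobs already in progress, but fetch no new ones
+kill -TSTP <sidekiq_pid>
+
+# Shut down gracefully within the shutdown timeout
+kill -TERM <sidekiq_pid>
+```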
+
+## Check for blocking queries
+
+Sometimes the speed at which Sidekiq processes jobs can be so fast that it can
+cause database contention. Check for blocking queries when backtraces above
+show that many threads are stuck in the database adapter.
+
+The PostgreSQL wiki has details on the query you can run to see blocking
+queries. The query is different based on PostgreSQL version. See
+[Lock Monitoring](https://wiki.postgresql.org/wiki/Lock_Monitoring) for
+the query details.
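+
+For example, on PostgreSQL 9.6 and later, a minimal query to list sessions blocked by another session's locks might look like the following. Run it in a database console such as `sudo gitlab-psql`; see the wiki page above for more thorough, version-specific queries:
+
+```sql
+-- List backends that are blocked, and which PIDs are blocking them
+SELECT pid, pg_blocking_pids(pid) AS blocked_by, state, query
+FROM pg_stat_activity
+WHERE cardinality(pg_blocking_pids(pid)) > 0;
+```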
+
+## Managing Sidekiq queues
+
+It is possible to use the [Sidekiq API](https://github.com/mperham/sidekiq/wiki/API)
+to perform a number of troubleshooting steps on Sidekiq.
+
+These are administrative commands and should only be used if the current
+administration interface is not suitable due to the scale of the installation.
+
+All these commands should be run using `gitlab-rails console`.
+
+### View the queue size
+
+```ruby
+Sidekiq::Queue.new("pipeline_processing:build_queue").size
+```
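+
+To print the sizes of all queues at once, a quick sketch using the same API:
+
+```ruby
+# List every known queue and its current size
+Sidekiq::Queue.all.each do |queue|
+  puts "#{queue.name}: #{queue.size}"
+end
+```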
+
+### Enumerate all enqueued jobs
+
+```ruby
+queue = Sidekiq::Queue.new("chaos:chaos_sleep")
+queue.each do |job|
+ # job.klass # => 'MyWorker'
+ # job.args # => [1, 2, 3]
+ # job.jid # => jid
+ # job.queue # => chaos:chaos_sleep
+ # job["retry"] # => 3
+ # job.item # => {
+ # "class"=>"Chaos::SleepWorker",
+ # "args"=>[1000],
+ # "retry"=>3,
+ # "queue"=>"chaos:chaos_sleep",
+ # "backtrace"=>true,
+ # "queue_namespace"=>"chaos",
+ # "jid"=>"39bc482b823cceaf07213523",
+ # "created_at"=>1566317076.266069,
+ # "correlation_id"=>"c323b832-a857-4858-b695-672de6f0e1af",
+ # "enqueued_at"=>1566317076.26761},
+ # }
+
+ # job.delete if job.jid == 'abcdef1234567890'
+end
+```
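+
+For example, to count the enqueued jobs in a queue per worker class (a sketch; the queue name is just an example):
+
+```ruby
+queue = Sidekiq::Queue.new("default")
+
+# Tally queued jobs by worker class
+counts = Hash.new(0)
+queue.each { |job| counts[job.klass] += 1 }
+pp counts
+```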
+
+### Enumerate currently running jobs
+
+```ruby
+workers = Sidekiq::Workers.new
+workers.each do |process_id, thread_id, work|
+ # process_id is a unique identifier per Sidekiq process
+ # thread_id is a unique identifier per thread
+ # work is a Hash which looks like:
+ # {"queue"=>"chaos:chaos_sleep",
+ # "payload"=>
+ # { "class"=>"Chaos::SleepWorker",
+ # "args"=>[1000],
+ # "retry"=>3,
+ # "queue"=>"chaos:chaos_sleep",
+ # "backtrace"=>true,
+ # "queue_namespace"=>"chaos",
+ # "jid"=>"b2a31e3eac7b1a99ff235869",
+ # "created_at"=>1566316974.9215662,
+ # "correlation_id"=>"e484fb26-7576-45f9-bf21-b99389e1c53c",
+ # "enqueued_at"=>1566316974.9229589},
+ # "run_at"=>1566316974}],
+end
+```
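+
+For example, to list jobs that have been running for more than five minutes (a sketch; `run_at` is a Unix timestamp, as shown above):
+
+```ruby
+workers = Sidekiq::Workers.new
+workers.each do |process_id, thread_id, work|
+  running_for = Time.now.to_i - work["run_at"]
+  next unless running_for > 5 * 60
+
+  payload = work["payload"]
+  puts "#{payload['class']} (jid: #{payload['jid']}) running for #{running_for}s"
+end
+```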
+
+### Remove Sidekiq jobs for given parameters (destructive)
+
+The general method to kill jobs conditionally is the following command, which
+removes jobs that are queued but not yet started. Running jobs cannot be killed
+this way.
+
+```ruby
+queue = Sidekiq::Queue.new('<queue-name>')
+queue.each { |job| job.delete if <condition> }
+```
+
+For canceling running jobs, see [the section below](#canceling-running-jobs-destructive).
+
+In the method above, `<queue-name>` is the name of the queue that contains the jobs you want to delete, and `<condition>` decides which jobs get deleted.
+
+Commonly, `<condition>` references the job arguments, which depend on the type of job in question. To find the arguments for a specific queue, look at the `perform` method of the related worker file, commonly found at `/app/workers/<queue-name>_worker.rb`.
+
+For example, `repository_import` has `project_id` as the job argument, while `update_merge_requests` has `project_id, user_id, oldrev, newrev, ref`.
+
+Arguments must be referenced by their position in this list using `job.args[<n>]`, because `job.args` is a list of all arguments provided to the Sidekiq job; the index is zero-based.
+
+Here are some examples:
+
+```ruby
+queue = Sidekiq::Queue.new('update_merge_requests')
+# In this example, we want to remove any update_merge_requests jobs
+# for the project with ID 125 and ref `refs/heads/my_branch`
+queue.each { |job| job.delete if job.args[0] == 125 && job.args[4] == 'refs/heads/my_branch' }
+```
+
+```ruby
+# Cancelling jobs like: `RepositoryImportWorker.new.perform_async(100)`
+id_list = [100]
+
+queue = Sidekiq::Queue.new('repository_import')
+queue.each do |job|
+ job.delete if id_list.include?(job.args[0])
+end
+```
+
+### Remove specific job ID (destructive)
+
+```ruby
+queue = Sidekiq::Queue.new('repository_import')
+queue.each do |job|
+ job.delete if job.jid == 'my-job-id'
+end
+```
+
+## Canceling running jobs (destructive)
+
+> Introduced in GitLab 12.3.
+
+This is a highly risky operation and should be used only as a last resort.
+Doing so might result in data corruption, because the job
+is interrupted mid-execution and there is no guarantee
+that it properly rolls back its transactions.
+
+```ruby
+Gitlab::SidekiqDaemon::Monitor.cancel_job('job-id')
+```
+
+> This requires Sidekiq to be run with the `SIDEKIQ_MONITOR_WORKER=1`
+> environment variable.
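+
+To find the ID of a running job, you can combine this with the `Sidekiq::Workers` enumeration shown earlier. For example, a sketch that cancels every running job of one worker class (`Chaos::SleepWorker` is just an example class name):
+
+```ruby
+workers = Sidekiq::Workers.new
+workers.each do |process_id, thread_id, work|
+  payload = work["payload"]
+  next unless payload["class"] == "Chaos::SleepWorker"
+
+  # Interrupt the running job by its job ID
+  Gitlab::SidekiqDaemon::Monitor.cancel_job(payload["jid"])
+end
+```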
+
+To perform the interrupt, `Thread.raise` is used, which
+has a number of drawbacks, as mentioned in [Why Ruby's Timeout is dangerous (and Thread.raise is terrifying)](https://jvns.ca/blog/2015/11/27/why-rubys-timeout-is-dangerous-and-thread-dot-raise-is-terrifying/):
+
+> This is where the implications get interesting, and terrifying. This means that an exception can get raised:
+>
+> - during a network request (ok, as long as the surrounding code is prepared to catch Timeout::Error)
+> - during the cleanup for the network request
+> - during a rescue block
+> - while creating an object to save to the database afterwards
+> - in any of your code, regardless of whether it could have possibly raised an exception before
+>
+> Nobody writes code to defend against an exception being raised on literally any line. That's not even possible. So Thread.raise is basically like a sneak attack on your code that could result in almost anything. It would probably be okay if it were pure-functional code that did not modify any state. But this is Ruby, so that's unlikely :)