Diffstat (limited to 'doc/administration/operations/extra_sidekiq_routing.md')
-rw-r--r-- | doc/administration/operations/extra_sidekiq_routing.md | 196 |
1 files changed, 7 insertions, 189 deletions
diff --git a/doc/administration/operations/extra_sidekiq_routing.md b/doc/administration/operations/extra_sidekiq_routing.md
index a6ad3e62bb7..072b6f63537 100644
--- a/doc/administration/operations/extra_sidekiq_routing.md
+++ b/doc/administration/operations/extra_sidekiq_routing.md
@@ -1,193 +1,11 @@
 ---
-stage: Systems
-group: Distribution
-info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+redirect_to: '../sidekiq/extra_sidekiq_routing.md'
+remove_date: '2022-11-11'
 ---
 
-# Queue routing rules **(FREE SELF)**
+This document was moved to [another location](../sidekiq/extra_sidekiq_routing.md).
 
-When the number of Sidekiq jobs increases to a certain scale, the system faces
-some scalability issues. One of them is that the length of the queue tends to get
-longer. High-urgency jobs have to wait longer until other less urgent jobs
-finish. This head-of-line blocking situation may eventually affect the
-responsiveness of the system, especially critical actions. In another scenario,
-the performance of some jobs is degraded due to other long running or CPU-intensive jobs
-(computing or rendering ones) in the same machine.
-
-To counter the aforementioned issues, one effective solution is to split
-Sidekiq jobs into different queues and assign machines handling each queue
-exclusively. For example, all CPU-intensive jobs could be routed to the
-`cpu-bound` queue and handled by a fleet of CPU optimized instances. The queue
-topology differs between companies depending on the workloads and usage
-patterns. Therefore, GitLab supports a flexible mechanism for the
-administrator to route the jobs based on their characteristics.
-
-As an alternative to [Queue selector](extra_sidekiq_processes.md#queue-selector), which
-configures Sidekiq cluster to listen to a specific set of workers or queues,
-GitLab also supports routing a job from a worker to the desired queue when it
-is scheduled. Sidekiq clients try to match a job against a configured list of
-routing rules. Rules are evaluated from first to last, and as soon as we find a
-match for a given worker we stop processing for that worker (first match wins).
-If the worker doesn't match any rule, it falls back to the queue name generated
-from the worker name.
-
-By default, if the routing rules are not configured (or denoted with an empty
-array), all the jobs are routed to the queue generated from the worker name.
-
-## Example configuration
-
-In `/etc/gitlab/gitlab.rb`:
-
-```ruby
-sidekiq['routing_rules'] = [
-  # Do not re-route workers that require their own queue
-  ['tags=needs_own_queue', nil],
-  # Route all non-CPU-bound workers that are high urgency to `high-urgency` queue
-  ['resource_boundary!=cpu&urgency=high', 'high-urgency'],
-  # Route all database, gitaly and global search workers that are throttled to `throttled` queue
-  ['feature_category=database,gitaly,global_search&urgency=throttled', 'throttled'],
-  # Route all workers having contact with outside work to a `network-intenstive` queue
-  ['has_external_dependencies=true|feature_category=hooks|tags=network', 'network-intensive'],
-  # Route all import workers to the queues generated by the worker name, for
-  # example, JiraImportWorker to `jira_import`, SVNWorker to `svn_worker`
-  ['feature_category=import', nil],
-  # Wildcard matching, route the rest to `default` queue
-  ['*', 'default']
-]
-```
-
-The routing rules list is an order-matter array of tuples of query and
-corresponding queue:
-
-- The query is following a [worker matching query](#worker-matching-query) syntax.
-- The `<queue_name>` must be a valid Sidekiq queue name. If the queue name
-  is `nil`, or an empty string, the worker is routed to the queue generated
-  by the name of the worker instead.
-
-The query supports wildcard matching `*`, which matches all workers. As a
-result, the wildcard query must stay at the end of the list or the rules after it
-are ignored.
-
-NOTE:
-Mixing queue routing rules and queue selectors requires care to
-ensure all jobs that are scheduled and picked up by appropriate Sidekiq
-workers.
-
-## Worker matching query
-
-GitLab provides a query syntax to match a worker based on its
-attributes. This query syntax is employed by both [Queue routing
-rules](#queue-routing-rules) and [Queue
-selector](extra_sidekiq_processes.md#queue-selector). A query includes two
-components:
-
-- Attributes that can be selected.
-- Operators used to construct a query.
-
-### Available attributes
-
-> [Introduced](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/261) in GitLab 13.1 (`tags`).
-
-Queue matching query works upon the worker attributes, described in
-[Sidekiq style guide](../../development/sidekiq/index.md). We support querying
-based on a subset of worker attributes:
-
-- `feature_category` - the [GitLab feature
-  category](https://about.gitlab.com/direction/maturity/#category-maturity) the
-  queue belongs to. For example, the `merge` queue belongs to the
-  `source_code_management` category.
-- `has_external_dependencies` - whether or not the queue connects to external
-  services. For example, all importers have this set to `true`.
-- `urgency` - how important it is that this queue's jobs run
-  quickly. Can be `high`, `low`, or `throttled`. For example, the
-  `authorized_projects` queue is used to refresh user permissions, and
-  is `high` urgency.
-- `worker_name` - the worker name. Use this attribute to select a specific worker.
-- `name` - the queue name generated from the worker name. Use this attribute to select a specific queue. Because this is generated from
-  the worker name, it does not change based on the result of other routing
-  rules.
-- `resource_boundary` - if the queue is bound by `cpu`, `memory`, or
-  `unknown`. For example, the `ProjectExportWorker` is memory bound as it has
-  to load data in memory before saving it for export.
-- `tags` - short-lived annotations for queues. These are expected to frequently
-  change from release to release, and may be removed entirely.
-
-`has_external_dependencies` is a boolean attribute: only the exact
-string `true` is considered true, and everything else is considered
-false.
-
-`tags` is a set, which means that `=` checks for intersecting sets, and
-`!=` checks for disjoint sets. For example, `tags=a,b` selects queues
-that have tags `a`, `b`, or both. `tags!=a,b` selects queues that have
-neither of those tags.
-
-The attributes of each worker are hard-coded in the source code. For
-convenience, we generate a [list of all available attributes in
-GitLab Community Edition](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/workers/all_queues.yml)
-and a [list of all available attributes in
-GitLab Enterprise Edition](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/all_queues.yml).
-
-### Available operators
-
-`queue_selector` supports the following operators, listed from highest
-to lowest precedence:
-
-- `|` - the logical `OR` operator. For example, `query_a|query_b` (where `query_a`
-  and `query_b` are queries made up of the other operators here) includes
-  queues that match either query.
-- `&` - the logical `AND` operator. For example, `query_a&query_b` (where
-  `query_a` and `query_b` are queries made up of the other operators here) will
-  only include queues that match both queries.
-- `!=` - the `NOT IN` operator. For example, `feature_category!=issue_tracking`
-  excludes all queues from the `issue_tracking` feature category.
-- `=` - the `IN` operator. For example, `resource_boundary=cpu` includes all
-  queues that are CPU bound.
-- `,` - the concatenate set operator. For example,
-  `feature_category=continuous_integration,pages` includes all queues from
-  either the `continuous_integration` category or the `pages` category. This
-  example is also possible using the OR operator, but allows greater brevity, as
-  well as being lower precedence.
-
-The operator precedence for this syntax is fixed: it's not possible to make `AND`
-have higher precedence than `OR`.
-
-[In GitLab 12.9](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/26594) and
-later, as with the standard queue group syntax above, a single `*` as the
-entire queue group selects all queues.
-
-### Migration
-
-After the Sidekiq routing rules are changed, administrators must take care
-with the migration to avoid losing jobs entirely, especially in a system with
-long queues of jobs. The migration can be done by following the migration steps
-mentioned in [Sidekiq job
-migration](../../raketasks/sidekiq_job_migration.md)
-
-### Workers that cannot be migrated
-
-Some workers cannot share a queue with other workers - typically because
-they check the size of their own queue - and so must be excluded from
-this process. We recommend excluding these from any further worker
-routing by adding a rule to keep them in their own queue, for example:
-
-```ruby
-sidekiq['routing_rules'] = [
-  ['tags=needs_own_queue', nil],
-  # ...
-]
-```
-
-These queues must also be included in at least one [Sidekiq
-queue group](extra_sidekiq_processes.md#start-multiple-processes).
-
-The following table shows the workers that should have their own queue:
-
-| Worker name | Queue name | GitLab issue |
-| --- | --- | --- |
-| `EmailReceiverWorker` | `email_receiver` | [`gitlab-com/gl-infra/scalability#1263`](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/1263) |
-| `ServiceDeskEmailReceiverWorker` | `service_desk_email_receiver` | [`gitlab-com/gl-infra/scalability#1263`](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/1263) |
-| `ProjectImportScheduleWorker` | `project_import_schedule` | [`gitlab-org/gitlab#340630`](https://gitlab.com/gitlab-org/gitlab/-/issues/340630) |
-| `HashedStorage::MigratorWorker` | `hashed_storage:hashed_storage_migrator` | [`gitlab-org/gitlab#340629`](https://gitlab.com/gitlab-org/gitlab/-/issues/340629) |
-| `HashedStorage::ProjectMigrateWorker` | `hashed_storage:hashed_storage_project_migrate` | [`gitlab-org/gitlab#340629`](https://gitlab.com/gitlab-org/gitlab/-/issues/340629) |
-| `HashedStorage::ProjectRollbackWorker` | `hashed_storage:hashed_storage_project_rollback` | [`gitlab-org/gitlab#340629`](https://gitlab.com/gitlab-org/gitlab/-/issues/340629) |
-| `HashedStorage::RollbackerWorker` | `hashed_storage:hashed_storage_rollbacker` | [`gitlab-org/gitlab#340629`](https://gitlab.com/gitlab-org/gitlab/-/issues/340629) |
+<!-- This redirect file can be deleted after <2022-11-11>. -->
+<!-- Redirects that point to other docs in the same project expire in three months. -->
+<!-- Redirects that point to docs in a different project or site (link is not relative and starts with `https:`) expire in one year. -->
+<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/redirects.html -->