Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitlab-foss.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorGitLab Bot <gitlab-bot@gitlab.com>2023-02-20 16:49:51 +0300
committerGitLab Bot <gitlab-bot@gitlab.com>2023-02-20 16:49:51 +0300
commit71786ddc8e28fbd3cb3fcc4b3ff15e5962a1c82e (patch)
tree6a2d93ef3fb2d353bb7739e4b57e6541f51cdd71 /doc/development/sidekiq
parenta7253423e3403b8c08f8a161e5937e1488f5f407 (diff)
Add latest changes from gitlab-org/gitlab@15-9-stable-eev15.9.0-rc42
Diffstat (limited to 'doc/development/sidekiq')
-rw-r--r--doc/development/sidekiq/index.md34
-rw-r--r--doc/development/sidekiq/worker_attributes.md26
2 files changed, 39 insertions, 21 deletions
diff --git a/doc/development/sidekiq/index.md b/doc/development/sidekiq/index.md
index f4f98641d39..355f5a3b753 100644
--- a/doc/development/sidekiq/index.md
+++ b/doc/development/sidekiq/index.md
@@ -7,7 +7,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
# Sidekiq guides
We use [Sidekiq](https://github.com/mperham/sidekiq) as our background
-job processor. These guides are for writing jobs that will work well on
+job processor. These guides are for writing jobs that works well on
GitLab.com and be consistent with our existing worker classes. For
information on administering GitLab, see [configuring Sidekiq](../../administration/sidekiq/index.md).
@@ -74,7 +74,7 @@ A lower retry count may be applicable if any of the below apply:
1. The worker is not idempotent and running it multiple times could
leave the system in an inconsistent state. For example, a worker that
posts a system note and then performs an action: if the second step
- fails and the worker retries, the system note will be posted again.
+ fails and the worker retries, the system note is posted again.
1. The worker is a cronjob that runs frequently. For example, if a cron
job runs every hour, then we don't need to retry beyond an hour
because we don't need two of the same job running at once.
@@ -96,6 +96,24 @@ def perform
end
```
+## Failure handling
+
+Failures are typically handled by Sidekiq itself, which takes advantage of the inbuilt retry mechanism mentioned above. You should allow exceptions to be raised so that Sidekiq can reschedule the job.
+
+If you need to perform an action when a job fails after all of its retry attempts, add it to the `sidekiq_retries_exhausted` method.
+
+```ruby
+sidekiq_retries_exhausted do |msg, ex|
+ project = Project.find(msg['args'].first)
+ project.perform_a_rollback # handle the permanent failure
+end
+
+def perform(project_id)
+ project = Project.find(project_id)
+ project.some_action # throws an exception
+end
+```
+
## Sidekiq Queues
Previously, each worker had its own queue, which was automatically set based on the
@@ -113,8 +131,8 @@ gitlab:sidekiq:all_queues_yml:generate` to regenerate
`app/workers/all_queues.yml` or `ee/app/workers/all_queues.yml` so that
it can be picked up by
[`sidekiq-cluster`](../../administration/sidekiq/extra_sidekiq_processes.md)
-in installations that don't use routing rules. To learn more about potential changes,
-read [Use routing rules by default and deprecate queue selectors for self-managed](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/596).
+in installations that don't use routing rules. For more information about potential changes,
+see [epic 596](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/596).
Additionally, run
`bin/rake gitlab:sidekiq:sidekiq_queues_yml:generate` to regenerate
@@ -156,7 +174,7 @@ queues in a namespace (technically: all queues prefixed with the namespace name)
when a namespace is provided instead of a simple queue name in the `--queue`
(`-q`) option, or in the `:queues:` section in `config/sidekiq_queues.yml`.
-Note that adding a worker to an existing namespace should be done with care, as
+Adding a worker to an existing namespace should be done with care, as
the extra jobs take resources away from jobs from workers that were already
there, if the resources available to the Sidekiq process handling the namespace
are not adjusted appropriately.
@@ -195,9 +213,9 @@ can read the number or type of provided arguments.
GitLab stores Sidekiq jobs and their arguments in Redis. To avoid
excessive memory usage, we compress the arguments of Sidekiq jobs
-if their original size is bigger than 100KB.
+if their original size is bigger than 100 KB.
-After compression, if their size still exceeds 5MB, it raises an
+After compression, if their size still exceeds 5 MB, it raises an
[`ExceedLimitError`](https://gitlab.com/gitlab-org/gitlab/-/blob/f3dd89e5e510ea04b43ffdcb58587d8f78a8d77c/lib/gitlab/sidekiq_middleware/size_limiter/exceed_limit_error.rb#L8)
error when scheduling the job.
@@ -227,6 +245,6 @@ tests should be placed in `spec/workers`.
## Interacting with Sidekiq Redis and APIs
-The application should minimise interaction with of any `Sidekiq.redis` and Sidekiq [APIs](https://github.com/mperham/sidekiq/blob/main/lib/sidekiq/api.rb). Such interactions in generic application logic should be abstracted to a [Sidekiq middleware](https://gitlab.com/gitlab-org/gitlab/-/tree/master/lib/gitlab/sidekiq_middleware) for re-use across teams. By decoupling application logic from Sidekiq's datastore, it allows for greater freedom when horizontally scaling the GitLab background processing setup.
+The application should minimise interaction with of any `Sidekiq.redis` and Sidekiq [APIs](https://github.com/mperham/sidekiq/blob/main/lib/sidekiq/api.rb). Such interactions in generic application logic should be abstracted to a [Sidekiq middleware](https://gitlab.com/gitlab-org/gitlab/-/tree/master/lib/gitlab/sidekiq_middleware) for re-use across teams. By decoupling application logic from Sidekiq datastore, it allows for greater freedom when horizontally scaling the GitLab background processing setup.
Some exceptions to this rule would be migration-related logic or administration operations.
diff --git a/doc/development/sidekiq/worker_attributes.md b/doc/development/sidekiq/worker_attributes.md
index 4fcd8e33d5c..a3bfe5f27cc 100644
--- a/doc/development/sidekiq/worker_attributes.md
+++ b/doc/development/sidekiq/worker_attributes.md
@@ -37,7 +37,7 @@ end
### Latency sensitive jobs
If a large number of background jobs get scheduled at once, queueing of jobs may
-occur while jobs wait for a worker node to be become available. This is normal
+occur while jobs wait for a worker node to be become available. This is standard
and gives the system resilience by allowing it to gracefully handle spikes in
traffic. Some jobs, however, are more sensitive to latency than others.
@@ -79,7 +79,7 @@ On GitLab.com, we run Sidekiq in several
each of which represents a particular type of workload.
When changing a queue's urgency, or adding a new queue, we need to take
-into account the expected workload on the new shard. Note that, if we're
+into account the expected workload on the new shard. If we're
changing an existing queue, there is also an effect on the old shard,
but that always reduces work.
@@ -108,7 +108,7 @@ shard_consumption = shard_rps * shard_duration_avg
If we expect an increase of **less than 5%**, then no further action is needed.
-Otherwise, please ping `@gitlab-org/scalability` on the merge request and ask
+Otherwise, ping `@gitlab-org/scalability` on the merge request and ask
for a review.
## Jobs with External Dependencies
@@ -121,7 +121,7 @@ However, some jobs are dependent on external services to complete
successfully. Some examples include:
1. Jobs which call web-hooks configured by a user.
-1. Jobs which deploy an application to a k8s cluster configured by a user.
+1. Jobs which deploy an application to a Kubernetes cluster configured by a user.
These jobs have "external dependencies". This is important for the operation of
the background processing cluster in several ways:
@@ -179,8 +179,8 @@ performance.
Likewise, if a worker uses large amounts of memory, we can run these on a
bespoke low concurrency, high memory fleet.
-Note that memory-bound workers create heavy GC workloads, with pauses of
-10-50ms. This has an impact on the latency requirements for the
+Memory-bound workers create heavy GC workloads, with pauses of
+10-50 ms. This has an impact on the latency requirements for the
worker. For this reason, `memory` bound, `urgency :high` jobs are not
permitted and fail CI. In general, `memory` bound workers are
discouraged, and alternative approaches to processing the work should be
@@ -219,7 +219,7 @@ We use the following approach to determine whether a worker is CPU-bound:
- Divide `cpu_s` by `duration` to get the percentage time spend on-CPU.
- If this ratio exceeds 33%, the worker is considered CPU-bound and should be
annotated as such.
-- Note that these values should not be used over small sample sizes, but
+- These values should not be used over small sample sizes, but
rather over fairly large aggregates.
## Feature category
@@ -254,7 +254,7 @@ When setting this field, consider the following trade-off:
- Prefer read replicas to add relief to the primary, but increase the likelihood of stale reads that have to be retried.
To maintain the same behavior compared to before this field was introduced, set it to `:always`, so
-database operations will only target the primary. Reasons for having to do so include workers
+database operations only target the primary. Reasons for having to do so include workers
that mostly or exclusively perform writes, or workers that read their own writes and who might run
into data consistency issues should a stale record be read back from a replica. **Try to avoid
these scenarios, since `:always` should be considered the exception, not the rule.**
@@ -270,10 +270,10 @@ The difference is in what happens when there is still replication lag after the
switch over to the primary right away, whereas `delayed` workers fail fast and are retried once.
If they still encounter replication lag, they also switch to the primary instead.
**If your worker never performs any writes, it is strongly advised to apply one of these consistency settings,
-since it will never need to rely on the primary database node.**
+since it never needs to rely on the primary database node.**
The table below shows the `data_consistency` attribute and its values, ordered by the degree to which
-they prefer read replicas and will wait for replicas to catch up:
+they prefer read replicas and wait for replicas to catch up:
| **Data Consistency** | **Description** |
|--------------|-----------------------------|
@@ -300,14 +300,14 @@ end
The `feature_flag` property allows you to toggle a job's `data_consistency`,
which permits you to safely toggle load balancing capabilities for a specific job.
-When `feature_flag` is disabled, the job defaults to `:always`, which means that the job will always use the primary database.
+When `feature_flag` is disabled, the job defaults to `:always`, which means that the job always uses the primary database.
The `feature_flag` property does not allow the use of
[feature gates based on actors](../feature_flags/index.md).
This means that the feature flag cannot be toggled only for particular
projects, groups, or users, but instead, you can safely use [percentage of time rollout](../feature_flags/index.md).
-Note that since we check the feature flag on both Sidekiq client and server, rolling out a 10% of the time,
-will likely results in 1% (`0.1` `[from client]*0.1` `[from server]`) of effective jobs using replicas.
+Since we check the feature flag on both Sidekiq client and server, rolling out a 10% of the time,
+likely results in 1% (`0.1` `[from client]*0.1` `[from server]`) of effective jobs using replicas.
Example: