1 files changed, 155 insertions, 0 deletions
diff --git a/doc/development/sidekiq/logging.md b/doc/development/sidekiq/logging.md
new file mode 100644
index 00000000000..015376b0fc6
--- /dev/null
+++ b/doc/development/sidekiq/logging.md
@@ -0,0 +1,155 @@
+---
+stage: none
+group: unassigned
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Sidekiq logging
+
+## Worker context
+
+> [Introduced](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/9) in GitLab 12.8.
+
+To have some more information about workers in the logs, we add
+[metadata to the jobs in the form of an
+`ApplicationContext`](../logging.md#logging-context-metadata-through-rails-or-grape-requests).
+In most cases, when scheduling a job from a request, this context is already
+deducted from the request and added to the scheduled job.
+
+When a job runs, the context that was active when it was scheduled
+is restored. This causes the context to be propagated to any job
+scheduled from within the running job.
+
+All this means that in most cases, to add context to jobs, we don't
+need to do anything.
+
+There are however some instances when there would be no context
+present when the job is scheduled, or the context that is present is
+likely to be incorrect. For these instances, we've added Rubocop rules
+to draw attention and avoid incorrect metadata in our logs.
+
+As with most our cops, there are perfectly valid reasons for disabling
+them. In this case it could be that the context from the request is
+correct. Or maybe you've specified a context already in a way that
+isn't picked up by the cops. In any case, leave a code comment
+pointing to which context to use when disabling the cops.
+
+When you do provide objects to the context, make sure that the
+route for namespaces and projects is pre-loaded. This can be done by using
+the `.with_route` scope defined on all `Routable`s.
+
+### Cron workers
+
+The context is automatically cleared for workers in the cronjob queue
+(`include CronjobQueue`), even when scheduling them from
+requests. We do this to avoid incorrect metadata when other jobs are
+scheduled from the cron worker.
+
+Cron workers themselves run instance wide, so they aren't scoped to
+users, namespaces, projects, or other resources that should be added to
+the context.
+
+However, they often schedule other jobs that _do_ require context.
+
+That is why there needs to be an indication of context somewhere in
+the worker. This can be done by using one of the following methods
+somewhere within the worker:
+
+1. Wrap the code that schedules jobs in the `with_context` helper:
+
+   ```ruby
+     def perform
+       deletion_cutoff = Gitlab::CurrentSettings
+                           .deletion_adjourned_period.days.ago.to_date
+       projects = Project.with_route.with_namespace
+                    .aimed_for_deletion(deletion_cutoff)
+
+       projects.find_each(batch_size: 100).with_index do |project, index|
+         delay = index * INTERVAL
+
+         with_context(project: project) do
+           AdjournedProjectDeletionWorker.perform_in(delay, project.id)
+         end
+       end
+     end
+   ```
+
+1. Use the a batch scheduling method that provides context:
+
+   ```ruby
+     def schedule_projects_in_batch(projects)
+       ProjectImportScheduleWorker.bulk_perform_async_with_contexts(
+         projects,
+         arguments_proc: -> (project) { project.id },
+         context_proc: -> (project) { { project: project } }
+       )
+     end
+   ```
+
+   Or, when scheduling with delays:
+
+   ```ruby
+     diffs.each_batch(of: BATCH_SIZE) do |diffs, index|
+       DeleteDiffFilesWorker
+         .bulk_perform_in_with_contexts(index *  5.minutes,
+                                        diffs,
+                                        arguments_proc: -> (diff) { diff.id },
+                                        context_proc: -> (diff) { { project: diff.merge_request.target_project } })
+     end
+   ```
+
+### Jobs scheduled in bulk
+
+Often, when scheduling jobs in bulk, these jobs should have a separate
+context rather than the overarching context.
+
+If that is the case, `bulk_perform_async` can be replaced by the
+`bulk_perform_async_with_context` helper, and instead of
+`bulk_perform_in` use `bulk_perform_in_with_context`.
+
+For example:
+
+```ruby
+    ProjectImportScheduleWorker.bulk_perform_async_with_contexts(
+      projects,
+      arguments_proc: -> (project) { project.id },
+      context_proc: -> (project) { { project: project } }
+    )
+```
+
+Each object from the enumerable in the first argument is yielded into 2
+blocks:
+
+- The `arguments_proc` which needs to return the list of arguments the
+  job needs to be scheduled with.
+
+- The `context_proc` which needs to return a hash with the context
+  information for the job.
+
+## Arguments logging
+
+As of GitLab 13.6, Sidekiq job arguments are logged by default, unless [`SIDEKIQ_LOG_ARGUMENTS`](../../administration/troubleshooting/sidekiq.md#log-arguments-to-sidekiq-jobs)
+is disabled.
+
+By default, the only arguments logged are numeric arguments, because
+arguments of other types could contain sensitive information. To
+override this, use `loggable_arguments` inside a worker with the indexes
+of the arguments to be logged. (Numeric arguments do not need to be
+specified here.)
+
+For example:
+
+```ruby
+class MyWorker
+  include ApplicationWorker
+
+  loggable_arguments 1, 3
+
+  # object_id will be logged as it's numeric
+  # string_a will be logged due to the loggable_arguments call
+  # string_b will be filtered from logs
+  # string_c will be logged due to the loggable_arguments call
+  def perform(object_id, string_a, string_b, string_c)
+  end
+end
+```