Diffstat (limited to 'doc/development/real_time.md')
-rw-r--r-- doc/development/real_time.md | 97
1 files changed, 97 insertions, 0 deletions
diff --git a/doc/development/real_time.md b/doc/development/real_time.md
new file mode 100644
index 00000000000..df725a36a93
--- /dev/null
+++ b/doc/development/real_time.md
@@ -0,0 +1,97 @@
+---
+stage: Plan
+group: Project Management
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Real-Time Features
+
+This guide contains instructions on how to safely roll out new real-time
+features.
+
+Real-time features are implemented using GraphQL Subscriptions.
+[Developer documentation](api_graphql_styleguide.md#subscriptions) is available.
+
+WebSockets are a relatively new technology at GitLab, and supporting them at
+scale introduces some challenges. For that reason, new features should be rolled
+out using the instructions below.
+
+## Reuse an existing WebSocket connection
+
+Features reusing an existing connection incur minimal risk. A feature flag
+roll-out is still recommended, because it gives self-hosting customers more
+control. However, it is not necessary to roll out in percentages, or to
+estimate new connections for GitLab.com.
+
+## Introduce a new WebSocket connection
+
+Any change that introduces a WebSocket connection to part of the GitLab application
+incurs some scalability risk, both to the nodes responsible for maintaining open
+connections and to downstream services, such as Redis and the primary database.
+
+### Estimate peak connections
+
+The first real-time feature to be fully enabled on GitLab.com was
+[real-time assignees](https://gitlab.com/gitlab-org/gitlab/-/issues/17589). By comparing
+peak throughput to the issue page against peak simultaneous WebSocket connections, it is
+possible to crudely estimate that each request per second (RPS) of peak throughput adds
+approximately 4200 WebSocket connections.
+
+To understand the impact a new feature might have, sum the peak throughput (RPS)
+to the pages it originates from (`n`) and apply the formula:
+
+```ruby
+(n * 4200) / peak_active_connections
+```
+
+Current active connections are visible on
+[this Grafana chart](https://dashboards.gitlab.net/d/websockets-main/websockets-overview?viewPanel=1357460996&orgId=1).
+
+This calculation is crude, and should be revised as new features are
+deployed. It yields a rough estimate of the capacity that must be
+supported, as a proportion of existing capacity.
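+
+As a rough illustration, the estimate can be worked through in Ruby. The helper
+name `new_connection_ratio` and the input values (5 RPS, 100,000 peak active
+connections) are hypothetical and chosen only to show the arithmetic; read the
+real peak from the Grafana chart. The sketch converts to a float because Ruby
+would otherwise perform integer division on integer inputs:
+
+```ruby
+# Crude constant from this guide: WebSocket connections added per 1 RPS.
+CONNECTIONS_PER_RPS = 4200
+
+# Hypothetical helper: estimated new connections as a proportion of the
+# current peak of active connections.
+def new_connection_ratio(peak_rps, peak_active_connections)
+  (peak_rps * CONNECTIONS_PER_RPS).to_f / peak_active_connections
+end
+
+# Assumed inputs, for illustration only.
+ratio = new_connection_ratio(5, 100_000)
+puts format('%.1f%% of current peak connections', ratio * 100)
+```
+
+With these assumed inputs, the feature would add roughly 21,000 connections,
+or about a fifth of the existing peak.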
+
+### Graduated roll-out
+
+New capacity may need to be provisioned to support your changes, depending on
+current saturation and the proportion of new connections required. While
+Kubernetes makes this relatively easy in most cases, there remains a risk to
+downstream services.
+
+To mitigate this, ensure that the code establishing the new WebSocket connection
+is behind a feature flag that defaults to `off`. A careful, percentage-based roll-out
+of the feature flag ensures that effects can be observed on the [WebSocket
+dashboard](https://dashboards.gitlab.net/d/websockets-main/websockets-overview?orgId=1).
+
+1. Create a
+ [feature flag roll-out](https://gitlab.com/gitlab-org/gitlab/-/blob/master/.gitlab/issue_templates/Feature%20Flag%20Roll%20Out.md)
+ issue.
+1. Add the estimated new connections required under the **What are we expecting to happen** section.
+1. Copy in a member of the Plan and Scalability teams to estimate a percentage-based
+ roll-out plan.
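+
+To make "percentage-based" concrete, the following is a minimal sketch of how a
+flag could be enabled for a stable percentage of actors. It is not GitLab's
+feature flag implementation: `enabled_for?`, the flag name, and the CRC32
+bucketing are all assumptions made for illustration.
+
+```ruby
+require 'zlib'
+
+# Hypothetical gate: maps each actor to a stable bucket in 0..99 and enables
+# the flag when the bucket falls below the roll-out percentage.
+def enabled_for?(flag_name, actor_id, percentage)
+  bucket = Zlib.crc32("#{flag_name}:#{actor_id}") % 100
+  bucket < percentage
+end
+
+# Raising the percentage only ever adds actors, so new WebSocket connections
+# grow gradually instead of arriving all at once.
+actors = (1..1000).to_a
+at_10 = actors.select { |id| enabled_for?(:hypothetical_flag, id, 10) }
+at_50 = actors.select { |id| enabled_for?(:hypothetical_flag, id, 50) }
+puts "10%: #{at_10.size} actors, 50%: #{at_50.size} actors"
+```
+
+Because each actor keeps the same bucket, anyone enabled at 10% stays enabled
+at 50%, which keeps dashboard readings comparable between roll-out steps.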
+
+## Backward compatibility
+
+For the duration of the feature flag roll-out and indefinitely thereafter,
+real-time features must be backward-compatible, or at least degrade
+gracefully. Not all customers have Action Cable enabled, and further work
+needs to be done before Action Cable can be enabled by default.
+
+Making real-time a requirement represents a breaking change, so the next
+opportunity to do this is version 15.0.
+
+## Enable Real-Time by default
+
+Mounting the Action Cable library adds a minimal memory footprint. However,
+serving WebSocket requests introduces additional memory requirements. For this
+reason, enabling Action Cable by default requires additional work: at minimum,
+revising the Reference Architectures, and possibly reducing overall memory
+usage, which includes addressing a known issue with Workhorse.
+
+## Real-time infrastructure on GitLab.com
+
+On GitLab.com, WebSocket connections are served from dedicated infrastructure,
+entirely separate from the regular Web fleet and deployed with Kubernetes. This
+separation limits the risk to nodes handling requests, but not to shared services.
+For more information on the WebSockets Kubernetes deployment, see
+[this epic](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/355).