gitlab.com/gitlab-org/gitlab-foss.git
Diffstat (limited to 'doc/development/internal_analytics/index.md')
-rw-r--r-- doc/development/internal_analytics/index.md | 84
1 files changed, 81 insertions, 3 deletions
diff --git a/doc/development/internal_analytics/index.md b/doc/development/internal_analytics/index.md
index d24ecf5a99c..64b9c7af037 100644
--- a/doc/development/internal_analytics/index.md
+++ b/doc/development/internal_analytics/index.md
@@ -6,7 +6,85 @@ info: To determine the technical writer assigned to the Stage/Group associated w
# Internal analytics
-Learn how to instrument your features on GitLab using:
+The internal analytics system provides the ability to track user behavior and system status for a GitLab instance
+to inform customer success services and further product development.
-- [Service Ping](service_ping/index.md)
-- [Snowplow](snowplow/index.md)
+These doc pages provide guides and information on how to leverage internal analytics capabilities of GitLab
+when developing new features or instrumenting existing ones.
+
+## Fundamental concepts
+
+Events and metrics are the foundation of the internal analytics system.
+Understanding the difference between the two concepts is vital to using the system.
+
+### Event
+
+An event is a record of an action that happened within the GitLab instance.
+An example action would be a user interaction like visiting the issue page or hovering the mouse cursor over the top navigation search.
+Other actions can result from background system processing, like a scheduled pipeline succeeding or an API call received from a third-party system.
+Not every action is tracked and thereby turned into a recorded event automatically.
+Instead, if an action helps draw out product insights and helps to make more educated business decisions, we can track an event when the action happens.
+The produced event record, at a minimum, holds the information that the action occurred,
+but it can also contain additional details about the context that accompanied this action.
+An example of context can be information about who performed the action or the state of the system at the time of the action.
+
+### Metric
+
+A single event record is not informative enough and might be caused by a coincidence.
+We need to look for sets of events sharing common traits to have a foundation for analysis.
+This is where metrics come into play. A metric is a calculation performed on pieces of information.
+For example, a single event documenting a paid user visiting the feature's page after a new feature was released tells us nothing about the success of this new feature.
+However, if we count the number of page view events happening in the week before the new feature release
+and then compare it with the number of events for the week following the feature release,
+we can derive insights about the increase in interest due to the release of the new feature.
+
+This process leads to what we call a metric. An event-based metric counts the number of times an event occurred overall or in a specified time frame.
+The same event can be used across different metrics and a metric can count either one or multiple events.
+The count can, but does not have to, be based on a uniqueness criterion, such as only counting distinct users who performed an event.
+
+Metrics do not have to be based on events. Metrics can also be observations about the state of a GitLab instance itself,
+such as the value of a setting or the count of rows in a database table.
+
+## Instrumentation
+
+- To instrument an event-based metric, see the [internal event tracking quick start guide](internal_event_instrumentation/quick_start.md).
+- To instrument a metric that observes the GitLab instance's state, see the [metrics instrumentation guide](metrics/metrics_instrumentation.md).
+
+## Data flow
+
+For GitLab, there is an essential difference in analytics setup between SaaS and self-managed or GitLab Dedicated instances.
+On SaaS, event records are sent directly to a collection system called Snowplow and imported into our data warehouse.
+Self-managed and GitLab Dedicated instances record event counts locally. Every week, a process called Service Ping sends the current
+values for all pre-defined and active metrics to our data warehouse. For GitLab.com, metrics are calculated directly in the data warehouse.
+
+The following chart aims to illustrate this data flow:
+
+```mermaid
+flowchart LR;
+ feature-->track
+ track-->|send event record - only on gitlab.com|snowplow
+ track-->|increase metric counts|redis
+ database-->service_ping
+ redis-->service_ping
+ service_ping-->|json with metric values - weekly export|snowflake
+ snowplow-->|event records - continuous import|snowflake
+ snowflake-->vis
+
+ subgraph glb[GitLab Application]
+ feature[Feature Code]
+ subgraph events[Internal Analytics Code]
+ track[track_event / trackEvent]
+ redis[(Redis)]
+ database[(Database)]
+ service_ping[\Service Ping Process\]
+ end
+ end
+ snowplow[\Snowplow Pipeline\]
+ snowflake[(Data Warehouse)]
+ vis[Dashboards in Sisense/Tableau]
+```
+
+## Data privacy
+
+GitLab only receives event counts or similarly aggregated information from self-managed instances. User identifiers for individual events on the SaaS version of GitLab are [pseudonymized](https://metrics.gitlab.com/identifiers).
+An exact description of what kind of data is collected through the internal analytics system is given in our [handbook](https://about.gitlab.com/handbook/legal/privacy/customer-product-usage-information/).
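
Note on the "Metric" section added above: the before/after comparison and the uniqueness criterion it describes can be illustrated with a small standalone sketch. This is not GitLab's implementation. The `Event` record, the `count_events` helper, the event name `view_feature_page`, and all dates are hypothetical and chosen only to mirror the prose.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """Hypothetical event record: what happened, who did it, and when."""
    name: str
    user_id: int
    occurred_at: datetime


def count_events(events, name, start, end, unique_users=False):
    """Count events called `name` with start <= occurred_at < end.

    With unique_users=True, count distinct users instead of raw events.
    """
    matching = [
        e for e in events
        if e.name == name and start <= e.occurred_at < end
    ]
    if unique_users:
        return len({e.user_id for e in matching})
    return len(matching)


# Assumed release date and sample data, for illustration only.
release = datetime(2024, 1, 8)
week = timedelta(days=7)
events = [
    Event("view_feature_page", 1, datetime(2024, 1, 3)),
    Event("view_feature_page", 1, datetime(2024, 1, 9)),
    Event("view_feature_page", 2, datetime(2024, 1, 10)),
    Event("view_feature_page", 2, datetime(2024, 1, 11)),
]

# Same event, three different metrics.
before = count_events(events, "view_feature_page", release - week, release)
after = count_events(events, "view_feature_page", release, release + week)
after_unique = count_events(
    events, "view_feature_page", release, release + week, unique_users=True
)
```

Here one event feeds three metrics: a total count for the week before the release, a total count for the week after, and a distinct-user count for the week after. This matches the statement that the same event can be used across different metrics.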