From 3cccd102ba543e02725d247893729e5c73b38295 Mon Sep 17 00:00:00 2001
From: GitLab Bot
Date: Wed, 20 Apr 2022 10:00:54 +0000
Subject: Add latest changes from gitlab-org/gitlab@14-10-stable-ee

---
 doc/development/snowplow/implementation.md  | 35 +++++++++++++++++++++--------
 doc/development/snowplow/index.md           | 17 ++++++++++++++
 doc/development/snowplow/schemas.md         |  9 ++++----
 doc/development/snowplow/troubleshooting.md |  4 ++--
 4 files changed, 50 insertions(+), 15 deletions(-)

(limited to 'doc/development/snowplow')

diff --git a/doc/development/snowplow/implementation.md b/doc/development/snowplow/implementation.md
index 6061a1d4cd2..162b77772f9 100644
--- a/doc/development/snowplow/implementation.md
+++ b/doc/development/snowplow/implementation.md
@@ -21,8 +21,25 @@ For the recommended frontend tracking implementation, see [Usage recommendations
 Structured events and page views include the [`gitlab_standard`](schemas.md#gitlab_standard)
 context, using the `window.gl.snowplowStandardContext` object which includes
 [default data](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/views/layouts/_snowplow.html.haml)
-as base. This object can be modified for any subsequent structured event fired,
-although it's not recommended.
+as base:
+
+| Property | Example |
+| -------- | ------- |
+| `context_generated_at` | `"2022-01-01T01:00:00.000Z"` |
+| `environment` | `"production"` |
+| `extra` | `{}` |
+| `namespace_id` | `123` |
+| `plan` | `"gold"` |
+| `project_id` | `456` |
+| `source` | `"gitlab-rails"` |
+| `user_id` | `789`* |
+
+_\* Undergoes a pseudonymization process at the collector level._
+
+These properties [are overriden](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/assets/javascripts/tracking/get_standard_context.js)
+with frontend-specific values, like `source` (`gitlab-javascript`), `google_analytics_id`
+and the custom `extra` object. You can modify this object for any subsequent
+structured event that fires, although this is not recommended.
 
 Tracking implementations must have an `action` and a `category`. You can provide additional
 properties from the [structured event taxonomy](index.md#structured-event-taxonomy), in
@@ -396,13 +413,13 @@ Use the following arguments:
 |------------|---------------------------|---------------|-----------------------------------------------------------------------------------------------------------------------------------|
 | `category` | String | | Area or aspect of the application. For example, `HealthCheckController` or `Lfs::FileTransformer`. |
 | `action` | String | | The action being taken. For example, a controller action such as `create`, or an Active Record callback. |
-| `label` | String | nil | The specific element or object to act on. This can be one of the following: the label of the element, for example, a tab labeled 'Create from template' for `create_from_template`; a unique identifier if no text is available, for example, `groups_dropdown_close` for closing the Groups dropdown in the top bar; or the name or title attribute of a record being created. |
-| `property` | String | nil | Any additional property of the element, or object being acted on. |
-| `value` | Numeric | nil | Describes a numeric value (decimal) directly related to the event. This could be the value of an input. For example, `10` when clicking `internal` visibility. |
-| `context` | Array\[SelfDescribingJSON\] | nil | An array of custom contexts to send with this event. Most events should not have any custom contexts. |
-| `project` | Project | nil | The project associated with the event. |
-| `user` | User | nil | The user associated with the event. |
-| `namespace` | Namespace | nil | The namespace associated with the event. |
+| `label` | String | `nil` | The specific element or object to act on. This can be one of the following: the label of the element, for example, a tab labeled 'Create from template' for `create_from_template`; a unique identifier if no text is available, for example, `groups_dropdown_close` for closing the Groups dropdown in the top bar; or the name or title attribute of a record being created. |
+| `property` | String | `nil` | Any additional property of the element, or object being acted on. |
+| `value` | Numeric | `nil` | Describes a numeric value (decimal) directly related to the event. This could be the value of an input. For example, `10` when clicking `internal` visibility. |
+| `context` | Array\[SelfDescribingJSON\] | `nil` | An array of custom contexts to send with this event. Most events should not have any custom contexts. |
+| `project` | Project | `nil` | The project associated with the event. |
+| `user` | User | `nil` | The user associated with the event. This value undergoes a pseudonymization process at the collector level. |
+| `namespace` | Namespace | `nil` | The namespace associated with the event. |
 | `extra` | Hash | `{}` | Additional keyword arguments are collected into a hash and sent with the event. |
 
 ### Unit testing
diff --git a/doc/development/snowplow/index.md b/doc/development/snowplow/index.md
index 29f4514a21e..9b684757fe1 100644
--- a/doc/development/snowplow/index.md
+++ b/doc/development/snowplow/index.md
@@ -150,6 +150,23 @@ ORDER BY page_view_start DESC
 LIMIT 100
 ```
 
+#### Top 20 users who fired `reply_comment_button` in the last 30 days
+
+```sql
+SELECT
+  count(*) as hits,
+  se_action,
+  se_category,
+  gsc_pseudonymized_user_id
+FROM legacy.snowplow_gitlab_events_30
+WHERE
+  se_label = 'reply_comment_button'
+  AND gsc_pseudonymized_user_id IS NOT NULL
+GROUP BY gsc_pseudonymized_user_id, se_category, se_action
+ORDER BY count(*) DESC
+LIMIT 20
+```
+
 #### Query JSON formatted data
 
 ```sql
diff --git a/doc/development/snowplow/schemas.md b/doc/development/snowplow/schemas.md
index 63864c9329b..4066151600d 100644
--- a/doc/development/snowplow/schemas.md
+++ b/doc/development/snowplow/schemas.md
@@ -10,17 +10,18 @@ This page provides Snowplow schema reference for GitLab events.
 
 ## `gitlab_standard`
 
-We are including the [`gitlab_standard` schema](https://gitlab.com/gitlab-org/iglu/-/blob/master/public/schemas/com.gitlab/gitlab_standard/jsonschema/) with every event. See [Standardize Snowplow Schema](https://gitlab.com/groups/gitlab-org/-/epics/5218) for details.
+We are including the [`gitlab_standard` schema](https://gitlab.com/gitlab-org/iglu/-/blob/master/public/schemas/com.gitlab/gitlab_standard/jsonschema/) for structured events and page views.
 
 The [`StandardContext`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/tracking/standard_context.rb)
-class represents this schema in the application. Some properties are automatically populated for [frontend](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/views/layouts/_snowplow.html.haml)
-events.
+class represents this schema in the application. Some properties are
+[automatically populated for frontend events](implementation.md#snowplow-javascript-frontend-tracking),
+and can be [provided manually for backend events](implementation.md#implement-ruby-backend-tracking).
 
 | Field Name | Required | Default value | Type | Description |
 |----------------|:-------------------:|-----------------------|--|---------------------------------------------------------------------------------------------|
 | `project_id` | **{dotted-circle}** | Current project ID * | integer | |
 | `namespace_id` | **{dotted-circle}** | Current group/namespace ID * | integer | |
-| `user_id` | **{dotted-circle}** | Current user ID * | integer | User database record ID attribute. This file undergoes a pseudonymization process at the collector level. |
+| `user_id` | **{dotted-circle}** | Current user ID * | integer | User database record ID attribute. This value undergoes a pseudonymization process at the collector level. |
 | `context_generated_at` | **{dotted-circle}** | Current timestamp | string (date time format) | Timestamp indicating when context was generated. |
 | `environment` | **{check-circle}** | Current environment | string (max 32 chars) | Name of the source environment, such as `production` or `staging` |
 | `source` | **{check-circle}** | Event source | string (max 32 chars) | Name of the source application, such as `gitlab-rails` or `gitlab-javascript` |
diff --git a/doc/development/snowplow/troubleshooting.md b/doc/development/snowplow/troubleshooting.md
index 75c8b306a67..47d775d89aa 100644
--- a/doc/development/snowplow/troubleshooting.md
+++ b/doc/development/snowplow/troubleshooting.md
@@ -28,7 +28,7 @@ While on CloudWatch dashboard set time range to last 4 weeks, to get better pict
 
 Drop occurring at application layer can be symptom of some issue, but it might be also a result of normal application lifecycle, intended changes done to product intelligence or experiments tracking or even a result of a public holiday in some regions of the world with a larger user-base. To verify if there is an underlying problem to solve, you can check following things:
 
-1. Check `about.gitlab.com` website traffic on [Google Analytics](https://analytics.google.com/) to verify if some public holiday might impact overall use of GitLab system
+1. Check `about.gitlab.com` website traffic on [Google Analytics](https://analytics.google.com/analytics/web/) to verify if some public holiday might impact overall use of GitLab system
     1. You may require to open an access request for Google Analytics access first eg: [access request internal issue](https://gitlab.com/gitlab-com/team-member-epics/access-requests/-/issues/1772)
 1. Plot `select date(dvce_created_tstamp) , event , count(*) from legacy.snowplow_unnested_events_90 where dvce_created_tstamp > '2021-06-15' and dvce_created_tstamp < '2021-07-10' group by 1 , 2 order by 1 , 2` in SiSense to see what type of events was responsible for drop
 1. Plot `select date(dvce_created_tstamp) ,se_category , count(*) from legacy.snowplow_unnested_events_90 where dvce_created_tstamp > '2021-06-15' and dvce_created_tstamp < '2021-07-31' and event = 'struct' group by 1 , 2 order by 1, 2` what events recorded the biggest drops in suspected category
@@ -47,4 +47,4 @@ Already conducted investigations:
 
 ### Troubleshooting data warehouse layer
 
-Reach out to [Data team](https://about.gitlab.com/handbook/business-technology/data-team) to ask about current state of data warehouse. On their handbook page there is a [section with contact details](https://about.gitlab.com/handbook/business-technology/data-team/#how-to-connect-with-us)
+Reach out to [Data team](https://about.gitlab.com/handbook/business-technology/data-team/) to ask about current state of data warehouse. On their handbook page there is a [section with contact details](https://about.gitlab.com/handbook/business-technology/data-team/#how-to-connect-with-us)
-- 
cgit v1.2.3