gitlab.com/gitlab-org/gitlab-foss.git
author: GitLab Bot <gitlab-bot@gitlab.com> 2022-04-20 13:00:54 +0300
committer: GitLab Bot <gitlab-bot@gitlab.com> 2022-04-20 13:00:54 +0300
commit: 3cccd102ba543e02725d247893729e5c73b38295
tree: f36a04ec38517f5deaaacb5acc7d949688d1e187 /doc/development/snowplow
parent: 205943281328046ef7b4528031b90fbda70c75ac
Add latest changes from gitlab-org/gitlab@14-10-stable-ee (v14.10.0-rc42)
Diffstat (limited to 'doc/development/snowplow')
-rw-r--r-- doc/development/snowplow/implementation.md | 35
-rw-r--r-- doc/development/snowplow/index.md | 17
-rw-r--r-- doc/development/snowplow/schemas.md | 9
-rw-r--r-- doc/development/snowplow/troubleshooting.md | 4
4 files changed, 50 insertions, 15 deletions
diff --git a/doc/development/snowplow/implementation.md b/doc/development/snowplow/implementation.md
index 6061a1d4cd2..162b77772f9 100644
--- a/doc/development/snowplow/implementation.md
+++ b/doc/development/snowplow/implementation.md
@@ -21,8 +21,25 @@ For the recommended frontend tracking implementation, see [Usage recommendations
Structured events and page views include the [`gitlab_standard`](schemas.md#gitlab_standard)
context, using the `window.gl.snowplowStandardContext` object which includes
[default data](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/views/layouts/_snowplow.html.haml)
-as base. This object can be modified for any subsequent structured event fired,
-although it's not recommended.
+as base:
+
+| Property | Example |
+| -------- | ------- |
+| `context_generated_at` | `"2022-01-01T01:00:00.000Z"` |
+| `environment` | `"production"` |
+| `extra` | `{}` |
+| `namespace_id` | `123` |
+| `plan` | `"gold"` |
+| `project_id` | `456` |
+| `source` | `"gitlab-rails"` |
+| `user_id` | `789`* |
+
+_\* Undergoes a pseudonymization process at the collector level._
+
+These properties [are overridden](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/assets/javascripts/tracking/get_standard_context.js)
+with frontend-specific values, like `source` (`gitlab-javascript`), `google_analytics_id`
+and the custom `extra` object. You can modify this object for any subsequent
+structured event that fires, although this is not recommended.
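
The override described above can be pictured as a shallow object merge. The following is a minimal sketch, not the actual `get_standard_context.js` implementation: the helper name `buildFrontendContext`, its parameters, and the sample `extra` and Google Analytics values are assumptions made for illustration, and the base values are taken from the example table.

```javascript
// Base context with the example values from the table above.
const baseContext = {
  context_generated_at: '2022-01-01T01:00:00.000Z',
  environment: 'production',
  extra: {},
  namespace_id: 123,
  plan: 'gold',
  project_id: 456,
  source: 'gitlab-rails',
  user_id: 789,
};

// Hypothetical helper: copies the base context and overrides the
// frontend-specific properties without mutating the base object.
function buildFrontendContext(base, { extra = {}, googleAnalyticsId = null } = {}) {
  return {
    ...base,
    source: 'gitlab-javascript',
    google_analytics_id: googleAnalyticsId,
    extra,
  };
}

// Illustrative values only.
const context = buildFrontendContext(baseContext, {
  extra: { experiment: 'example' },
  googleAnalyticsId: 'GA1.2.345',
});

console.log(context.source, context.project_id);
```

Because the helper returns a new object, the shared base context stays unchanged for later events, which is one way to avoid the direct modification that the text advises against.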
Tracking implementations must have an `action` and a `category`. You can provide additional
properties from the [structured event taxonomy](index.md#structured-event-taxonomy), in
@@ -396,13 +413,13 @@ Use the following arguments:
|------------|---------------------------|---------------|-----------------------------------------------------------------------------------------------------------------------------------|
| `category` | String | | Area or aspect of the application. For example, `HealthCheckController` or `Lfs::FileTransformer`. |
| `action` | String | | The action being taken. For example, a controller action such as `create`, or an Active Record callback. |
-| `label` | String | nil | The specific element or object to act on. This can be one of the following: the label of the element, for example, a tab labeled 'Create from template' for `create_from_template`; a unique identifier if no text is available, for example, `groups_dropdown_close` for closing the Groups dropdown in the top bar; or the name or title attribute of a record being created. |
-| `property` | String | nil | Any additional property of the element, or object being acted on. |
-| `value` | Numeric | nil | Describes a numeric value (decimal) directly related to the event. This could be the value of an input. For example, `10` when clicking `internal` visibility. |
-| `context` | Array\[SelfDescribingJSON\] | nil | An array of custom contexts to send with this event. Most events should not have any custom contexts. |
-| `project` | Project | nil | The project associated with the event. |
-| `user` | User | nil | The user associated with the event. |
-| `namespace` | Namespace | nil | The namespace associated with the event. |
+| `label` | String | `nil` | The specific element or object to act on. This can be one of the following: the label of the element, for example, a tab labeled 'Create from template' for `create_from_template`; a unique identifier if no text is available, for example, `groups_dropdown_close` for closing the Groups dropdown in the top bar; or the name or title attribute of a record being created. |
+| `property` | String | `nil` | Any additional property of the element, or object being acted on. |
+| `value` | Numeric | `nil` | Describes a numeric value (decimal) directly related to the event. This could be the value of an input. For example, `10` when clicking `internal` visibility. |
+| `context` | Array\[SelfDescribingJSON\] | `nil` | An array of custom contexts to send with this event. Most events should not have any custom contexts. |
+| `project` | Project | `nil` | The project associated with the event. |
+| `user` | User | `nil` | The user associated with the event. This value undergoes a pseudonymization process at the collector level. |
+| `namespace` | Namespace | `nil` | The namespace associated with the event. |
| `extra` | Hash | `{}` | Additional keyword arguments are collected into a hash and sent with the event. |
### Unit testing
diff --git a/doc/development/snowplow/index.md b/doc/development/snowplow/index.md
index 29f4514a21e..9b684757fe1 100644
--- a/doc/development/snowplow/index.md
+++ b/doc/development/snowplow/index.md
@@ -150,6 +150,23 @@ ORDER BY page_view_start DESC
LIMIT 100
```
+#### Top 20 users who fired `reply_comment_button` in the last 30 days
+
+```sql
+SELECT
+ count(*) as hits,
+ se_action,
+ se_category,
+ gsc_pseudonymized_user_id
+FROM legacy.snowplow_gitlab_events_30
+WHERE
+ se_label = 'reply_comment_button'
+ AND gsc_pseudonymized_user_id IS NOT NULL
+GROUP BY gsc_pseudonymized_user_id, se_category, se_action
+ORDER BY count(*) DESC
+LIMIT 20
+```
+
#### Query JSON formatted data
```sql
diff --git a/doc/development/snowplow/schemas.md b/doc/development/snowplow/schemas.md
index 63864c9329b..4066151600d 100644
--- a/doc/development/snowplow/schemas.md
+++ b/doc/development/snowplow/schemas.md
@@ -10,17 +10,18 @@ This page provides Snowplow schema reference for GitLab events.
## `gitlab_standard`
-We are including the [`gitlab_standard` schema](https://gitlab.com/gitlab-org/iglu/-/blob/master/public/schemas/com.gitlab/gitlab_standard/jsonschema/) with every event. See [Standardize Snowplow Schema](https://gitlab.com/groups/gitlab-org/-/epics/5218) for details.
+We are including the [`gitlab_standard` schema](https://gitlab.com/gitlab-org/iglu/-/blob/master/public/schemas/com.gitlab/gitlab_standard/jsonschema/) for structured events and page views.
The [`StandardContext`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/tracking/standard_context.rb)
-class represents this schema in the application. Some properties are automatically populated for [frontend](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/views/layouts/_snowplow.html.haml)
-events.
+class represents this schema in the application. Some properties are
+[automatically populated for frontend events](implementation.md#snowplow-javascript-frontend-tracking),
+and can be [provided manually for backend events](implementation.md#implement-ruby-backend-tracking).
| Field Name | Required | Default value | Type | Description |
|----------------|:-------------------:|-----------------------|--|---------------------------------------------------------------------------------------------|
| `project_id` | **{dotted-circle}** | Current project ID * | integer | |
| `namespace_id` | **{dotted-circle}** | Current group/namespace ID * | integer | |
-| `user_id` | **{dotted-circle}** | Current user ID * | integer | User database record ID attribute. This file undergoes a pseudonymization process at the collector level. |
+| `user_id` | **{dotted-circle}** | Current user ID * | integer | User database record ID attribute. This value undergoes a pseudonymization process at the collector level. |
| `context_generated_at` | **{dotted-circle}** | Current timestamp | string (date time format) | Timestamp indicating when context was generated. |
| `environment` | **{check-circle}** | Current environment | string (max 32 chars) | Name of the source environment, such as `production` or `staging` |
| `source` | **{check-circle}** | Event source | string (max 32 chars) | Name of the source application, such as `gitlab-rails` or `gitlab-javascript` |
diff --git a/doc/development/snowplow/troubleshooting.md b/doc/development/snowplow/troubleshooting.md
index 75c8b306a67..47d775d89aa 100644
--- a/doc/development/snowplow/troubleshooting.md
+++ b/doc/development/snowplow/troubleshooting.md
@@ -28,7 +28,7 @@ While on CloudWatch dashboard set time range to last 4 weeks, to get better pict
A drop occurring at the application layer can be a symptom of an issue, but it might also be a result of the normal application lifecycle, intended changes to product intelligence or experiment tracking,
or even a public holiday in regions of the world with a larger user base. To verify whether there is an underlying problem to solve, check the following:
-1. Check `about.gitlab.com` website traffic on [Google Analytics](https://analytics.google.com/) to verify if some public holiday might impact overall use of GitLab system
+1. Check `about.gitlab.com` website traffic on [Google Analytics](https://analytics.google.com/analytics/web/) to verify if some public holiday might impact overall use of GitLab system
1. You may require to open an access request for Google Analytics access first eg: [access request internal issue](https://gitlab.com/gitlab-com/team-member-epics/access-requests/-/issues/1772)
1. Plot `select date(dvce_created_tstamp) , event , count(*) from legacy.snowplow_unnested_events_90 where dvce_created_tstamp > '2021-06-15' and dvce_created_tstamp < '2021-07-10' group by 1 , 2 order by 1 , 2` in SiSense to see what type of events was responsible for drop
1. Plot `select date(dvce_created_tstamp) ,se_category , count(*) from legacy.snowplow_unnested_events_90 where dvce_created_tstamp > '2021-06-15' and dvce_created_tstamp < '2021-07-31' and event = 'struct' group by 1 , 2 order by 1, 2` what events recorded the biggest drops in suspected category
@@ -47,4 +47,4 @@ Already conducted investigations:
### Troubleshooting data warehouse layer
-Reach out to [Data team](https://about.gitlab.com/handbook/business-technology/data-team) to ask about current state of data warehouse. On their handbook page there is a [section with contact details](https://about.gitlab.com/handbook/business-technology/data-team/#how-to-connect-with-us)
+Reach out to [Data team](https://about.gitlab.com/handbook/business-technology/data-team/) to ask about current state of data warehouse. On their handbook page there is a [section with contact details](https://about.gitlab.com/handbook/business-technology/data-team/#how-to-connect-with-us)