gitlab.com/gitlab-org/gitlab-foss.git
Diffstat (limited to 'doc/development/product_analytics')
-rw-r--r-- doc/development/product_analytics/event_dictionary.md | 31
-rw-r--r-- doc/development/product_analytics/index.md | 181
-rw-r--r-- doc/development/product_analytics/snowplow.md | 8
-rw-r--r-- doc/development/product_analytics/usage_ping.md | 144
4 files changed, 134 insertions, 230 deletions
diff --git a/doc/development/product_analytics/event_dictionary.md b/doc/development/product_analytics/event_dictionary.md
index b049db21c30..88cb75fdb83 100644
--- a/doc/development/product_analytics/event_dictionary.md
+++ b/doc/development/product_analytics/event_dictionary.md
@@ -1,32 +1,5 @@
---
-stage: Growth
-group: Product Analytics
-info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
+redirect_to: 'https://about.gitlab.com/handbook/product/product-analytics-guide/'
---
-# Event Dictionary
-
-**Note: We've temporarily moved the Event Dictionary to a [Google Sheet](https://docs.google.com/spreadsheets/d/1VzE8R72Px_Y_LlE3Z05LxUlG_dumWe3vl-HeUo70TPw/edit?usp=sharing)**. The previous Markdown table exceeded 600 rows making it difficult to manage. In the future, our intention is to move this back into our docs using a [YAML file](https://gitlab.com/gitlab-org/gitlab-docs/-/issues/823).
-
-The event dictionary is a single source of truth for the metrics and events we collect for product usage data. The Event Dictionary lists all the metrics and events we track, why we're tracking them, and where they are tracked.
-
-This is a living document that is updated any time a new event is planned or implemented. It includes the following information.
-
-- Section, stage, or group
-- Description
-- Implementation status
-- Availability by plan type
-- Code path
-
-We're currently focusing our Event Dictionary on [Usage Ping](usage_ping.md). In the future, we will also include [Snowplow](snowplow.md). We currently have an initiative across the entire product organization to complete the [Event Dictionary for Usage Ping](https://gitlab.com/groups/gitlab-org/-/epics/4174).
-
-## Instructions
-
-1. Open the Event Dictionary and fill in all the **PM to edit** columns highlighted in yellow.
-1. Check that all the metrics and events are assigned to the correct section, stage, or group. If a metric is used across many groups, assign it to the stage. If a metric is used across many stages, assign it to the section. If a metric is incorrectly assigned to another section, stage, or group, let the PM know you have reassigned it. If your group has no assigned metrics and events, check that your metrics and events are not incorrectly assigned to another PM.
-1. Add descriptions of what your metrics and events are tracking. Work with your Engineering team or the Product Analytics team if you need help understanding this.
-1. Add what plans this metric is available on. Work with your Engineering team or the Product Analytics team if you need help understanding this.
-
-## Planned metrics and events
-
-For future metrics and events you plan to track, please add them to the Event Dictionary and note the status as `Planned`, `In Progress`, or `Implemented`. Once you have confirmed the metric has been implemented and have confirmed the metric data is in our data warehouse, change the status to **Data Available**.
+This document was moved to [another location](https://about.gitlab.com/handbook/product/product-analytics-guide/).
diff --git a/doc/development/product_analytics/index.md b/doc/development/product_analytics/index.md
index ab76d6f0561..88cb75fdb83 100644
--- a/doc/development/product_analytics/index.md
+++ b/doc/development/product_analytics/index.md
@@ -1,182 +1,5 @@
---
-stage: Growth
-group: Product Analytics
-info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
+redirect_to: 'https://about.gitlab.com/handbook/product/product-analytics-guide/'
---
-# Product Analytics Guide
-
-At GitLab, we collect product usage data for the purpose of helping us build a better product. Data helps GitLab understand which parts of the product need improvement and which features we should build next. Product usage data also helps our team better understand the reasons why people use GitLab. With this knowledge we are able to make better product decisions.
-
-We encourage users to enable tracking, and we embrace full transparency with our tracking approach so it can be easily understood and trusted.
-
-By enabling tracking, users can:
-
-- Contribute back to the wider community.
-- Help GitLab improve on the product.
-
-## Our tracking tools
-
-We use three methods to gather product usage data:
-
-- [Snowplow](#snowplow)
-- [Usage Ping](#usage-ping)
-- [Database import](#database-import)
-
-### Snowplow
-
-Snowplow is an enterprise-grade marketing and product analytics platform which helps track the way
-users engage with our website and application.
-
-Snowplow consists of two components:
-
-- [Snowplow JS](https://github.com/snowplow/snowplow/wiki/javascript-tracker) tracks client-side
- events.
-- [Snowplow Ruby](https://github.com/snowplow/snowplow/wiki/ruby-tracker) tracks server-side events.
-
-For more details, read the [Snowplow](snowplow.md) guide.
-
-### Usage Ping
-
-Usage Ping is a method for GitLab Inc to collect usage data on a GitLab instance. Usage Ping is primarily composed of row counts for different tables in the instance's database. By comparing these counts month over month (or week over week), we can get a rough sense for how an instance is using the different features within the product. This high-level data is used to help our product, support, and sales teams.
-
-For more details, read the [Usage Ping](usage_ping.md) guide.
-
-### Database import
-
-Database imports are full imports of data into GitLab's data warehouse. For GitLab.com, the PostgreSQL database is loaded into Snowflake data warehouse every 6 hours. For more details, see the [data team handbook](https://about.gitlab.com/handbook/business-ops/data-team/platform/#extract-and-load).
-
-## What data can be tracked
-
-Our different tracking tools allows us to track different types of events. The event types and examples of what data can be tracked are outlined below.
-
-The availability of event types and their tracking tools varies by segment. For example, on Self-Managed Users, we only have reporting using Database records via Usage Ping.
-
-| Event Types | SaaS Instance | SaaS Plan | SaaS Group | SaaS Session | SaaS User | SM Instance | SM Plan | SM Group | SM Session | SM User |
-|----------------------------------------|---------------|-----------|------------|--------------|-----------|-------------|---------|----------|------------|---------|
-| Snowplow (JS Pageview events) | ✅ | 📅 | 📅 | ✅ | 📅 | 📅 | 📅 | 📅 | 📅 | 📅 |
-| Snowplow (JS UI events) | ✅ | 📅 | 📅 | ✅ | 📅 | 📅 | 📅 | 📅 | 📅 | 📅 |
-| Snowplow (Ruby Pageview events) | ✅ | 📅 | 📅 | ✅ | 📅 | 📅 | 📅 | 📅 | 📅 | 📅 |
-| Snowplow (Ruby CRUD / API events) | ✅ | 📅 | 📅 | ✅ | 📅 | 📅 | 📅 | 📅 | 📅 | 📅 |
-| Usage Ping (Redis UI counters) | 🔄 | 🔄 | 🔄 | ✖️ | 🔄 | 🔄 | 🔄 | 🔄 | ✖️ | 🔄 |
-| Usage Ping (Redis Pageview counters) | 🔄 | 🔄 | 🔄 | ✖️ | 🔄 | 🔄 | 🔄 | 🔄 | ✖️ | 🔄 |
-| Usage Ping (Redis CRUD / API counters) | 🔄 | 🔄 | 🔄 | ✖️ | 🔄 | 🔄 | 🔄 | 🔄 | ✖️ | 🔄 |
-| Usage Ping (Database counters) | ✅ | 🔄 | 📅 | ✖️ | ✅ | ✅ | ✅ | ✅ | ✖️ | ✅ |
-| Usage Ping (Instance settings) | ✅ | 🔄 | 📅 | ✖️ | ✅ | ✅ | ✅ | ✅ | ✖️ | ✅ |
-| Usage Ping (Integration settings) | ✅ | 🔄 | 📅 | ✖️ | ✅ | ✅ | ✅ | ✅ | ✖️ | ✅ |
-| Database import (Database records) | ✅ | ✅ | ✅ | ✖️ | ✅ | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ |
-
-[Source file](https://docs.google.com/spreadsheets/d/1e8Afo41Ar8x3JxAXJF3nL83UxVZ3hPIyXdt243VnNuE/edit?usp=sharing)
-
-**Legend**
-
-✅ Available, 🔄 In Progress, 📅 Planned, ✖️ Not Possible
-
-SaaS = GitLab.com. SM = Self-Managed instance
-
-### Pageview events
-
-- Number of sessions that visited the /dashboard/groups page
-
-### UI events
-
-- Number of sessions that clicked on a button or link
-- Number of sessions that closed a modal
-
-UI events are any interface-driven actions from the browser including click data.
-
-### CRUD or API events
-
-- Number of Git pushes
-- Number of GraphQL queries
-- Number of requests to a Rails action or controller
-
-These are backend events that include the creation, read, update, deletion of records, and other events that might be triggered from layers other than those available in the interface.
-
-### Database records
-
-These are raw database records which can be explored using business intelligence tools like Sisense. The full list of available tables can be found in [structure.sql](https://gitlab.com/gitlab-org/gitlab/-/blob/master/db/structure.sql).
-
-### Instance settings
-
-These are settings of your instance such as the instance's Git version and if certain features are enabled such as `container_registry_enabled`.
-
-### Integration settings
-
-These are integrations your GitLab instance interacts with such as an [external storage provider](../../administration/static_objects_external_storage.md) or an [external container registry](../../administration/packages/container_registry.md#use-an-external-container-registry-with-gitlab-as-an-auth-endpoint). These services must be able to send data back into a GitLab instance for data to be tracked.
-
-## Reporting level
-
-Our reporting levels of aggregate or individual reporting varies by segment. For example, on Self-Managed Users, we can report at an aggregate user level using Usage Ping but not on an Individual user level.
-
-| Aggregated Reporting | SaaS Instance | SaaS Plan | SaaS Group | SaaS Session | SaaS User | SM Instance | SM Plan | SM Group | SM Session | SM User |
-|----------------------|---------------|-----------|------------|--------------|-----------|-------------|---------|----------|------------|---------|
-| Snowplow | ✅ | 📅 | 📅 | ✅ | 📅 | ✅ | 📅 | 📅 | ✅ | 📅 |
-| Usage Ping | ✅ | 🔄 | 📅 | 📅 | ✅ | ✅ | ✅ | ✅ | 📅 | ✅ |
-| Database import | ✅ | ✅ | ✅ | ✖️ | ✅ | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ |
-
-| Identifiable Reporting | SaaS Instance | SaaS Plan | SaaS Group | SaaS Session | SaaS User | SM Instance | SM Plan | SM Group | SM Session | SM User |
-|------------------------|---------------|-----------|------------|--------------|-----------|-------------|---------|----------|------------|---------|
-| Snowplow | ✅ | 📅 | 📅 | ✅ | 📅 | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ |
-| Usage Ping | ✅ | 🔄 | 📅 | ✖️ | ✖️ | ✅ | ✅ | ✖️ | ✖️ | ✖️ |
-| Database import | ✅ | ✅ | ✅ | ✖️ | ✅ | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ |
-
-**Legend**
-
-✅ Available, 🔄 In Progress, 📅 Planned, ✖️ Not Possible
-
-SaaS = GitLab.com. SM = Self-Managed instance
-
-## Reporting time period
-
-Our reporting time periods varies by segment. For example, on Self-Managed Users, we can report all time counts and 28 day counts in Usage Ping.
-
-| Reporting Time Period | All Time | 28 Days | 7 Days | Daily |
-|-----------------------|----------|---------|--------|-------|
-| Snowplow | ✅ | ✅ | ✅ | ✅ |
-| Usage Ping | ✅ | ✅ | 📅 | ✖️ |
-| Database import | ✅ | ✅ | ✅ | ✅ |
-
-**Legend**
-
-✅ Available, 🔄 In Progress, 📅 Planned, ✖️ Not Possible
-
-## Systems overview
-
-The systems overview is a simplified diagram showing the interactions between GitLab Inc and self-managed instances.
-
-![Product Analytics Overview](../img/telemetry_system_overview.png)
-
-[Source file](https://app.diagrams.net/#G13DVpN-XnhWGz9tqReIj8pp1UE4ehk_EC)
-
-### GitLab Inc
-
-For Product Analytics purposes, GitLab Inc has three major components:
-
-1. [Data Infrastructure](https://about.gitlab.com/handbook/business-ops/data-team/platform/infrastructure/): This contains everything managed by our data team including Sisense Dashboards for visualization, Snowflake for Data Warehousing, incoming data sources such as PostgreSQL Pipeline and S3 Bucket, and lastly our data collectors [GitLab.com's Snowplow Collector](https://gitlab.com/gitlab-com/gl-infra/readiness/-/tree/master/library/snowplow/) and GitLab's Versions Application.
-1. GitLab.com: This is the production GitLab application which is made up of a Client and Server. On the Client or browser side, a Snowplow JS Tracker (Frontend) is used to track client-side events. On the Server or application side, a Snowplow Ruby Tracker (Backend) is used to track server-side events. The server also contains Usage Ping which leverages a PostgreSQL database and a Redis in-memory data store to report on usage data. Lastly, the server also contains System Logs which are generated from running the GitLab application.
-1. [Monitoring infrastructure](https://about.gitlab.com/handbook/engineering/monitoring/): This is the infrastructure used to ensure GitLab.com is operating smoothly. System Logs are sent from GitLab.com to our monitoring infrastructure and collected by a FluentD collector. From FluentD, logs are either sent to long term Google Cloud Services cold storage via Stackdriver, or, they are sent to our Elastic Cluster via Cloud Pub/Sub which can be explored in real-time using Kibana.
-
-### Self-managed
-
-For Product Analytics purposes, self-managed instances have two major components:
-
-1. Data infrastructure: Having a data infrastructure setup is optional on self-managed instances. If you'd like to collect Snowplow tracking events for your self-managed instance, you can setup your own self-managed Snowplow collector and configure your Snowplow events to point to your own collector.
-1. GitLab: A self-managed GitLab instance contains all of the same components as GitLab.com mentioned above.
-
-### Differences between GitLab Inc and Self-managed
-
-As shown by the orange lines, on GitLab.com Snowplow JS, Snowplow Ruby, Usage Ping, and PostgreSQL database imports all flow into GitLab Inc's data infrastructure. However, on self-managed, only Usage Ping flows into GitLab Inc's data infrastructure.
-
-As shown by the green lines, on GitLab.com system logs flow into GitLab Inc's monitoring infrastructure. On self-managed, there are no logs sent to GitLab Inc's monitoring infrastructure.
-
-Note (1): Snowplow JS and Snowplow Ruby are available on self-managed, however, the Snowplow Collector endpoint is set to a self-managed Snowplow Collector which GitLab Inc does not have access to.
-
-## Additional information
-
-More useful links:
-
-- [Product Analytics Direction](https://about.gitlab.com/direction/product-analytics/)
-- [Data Analysis Process](https://about.gitlab.com/handbook/business-ops/data-team/#data-analysis-process/)
-- [Data for Product Managers](https://about.gitlab.com/handbook/business-ops/data-team/programs/data-for-product-managers/)
-- [Data Infrastructure](https://about.gitlab.com/handbook/business-ops/data-team/platform/infrastructure/)
+This document was moved to [another location](https://about.gitlab.com/handbook/product/product-analytics-guide/).
diff --git a/doc/development/product_analytics/snowplow.md b/doc/development/product_analytics/snowplow.md
index 21d92566ffd..c5f48994d5c 100644
--- a/doc/development/product_analytics/snowplow.md
+++ b/doc/development/product_analytics/snowplow.md
@@ -10,7 +10,7 @@ This guide provides an overview of how Snowplow works, and implementation detail
For more information about Product Analytics, see:
-- [Product Analytics Guide](index.md)
+- [Product Analytics Guide](https://about.gitlab.com/handbook/product/product-analytics-guide/)
- [Usage Ping Guide](usage_ping.md)
More useful links:
@@ -52,7 +52,7 @@ Tracking can be enabled at:
- The instance level, which enables tracking on both the frontend and backend layers.
- User level, though user tracking can be disabled on a per-user basis. GitLab tracking respects the [Do Not Track](https://www.eff.org/issues/do-not-track) standard, so any user who has enabled the Do Not Track option in their browser is not tracked at a user level.
-We utilize Snowplow for the majority of our tracking strategy and it is enabled on GitLab.com. On a self-managed instance, Snowplow can be enabled by navigating to:
+We use Snowplow for the majority of our tracking strategy and it is enabled on GitLab.com. On a self-managed instance, Snowplow can be enabled by navigating to:
- **Admin Area > Settings > General** in the UI.
- `admin/application_settings/integrations` in your browser.
@@ -112,7 +112,7 @@ The current method provides several attributes that are sent on each click event
## Implementing Snowplow JS (Frontend) tracking
-GitLab provides `Tracking`, an interface that wraps the [Snowplow JavaScript Tracker](https://github.com/snowplow/snowplow/wiki/javascript-tracker) for tracking custom events. There are a few ways to utilize tracking, but each generally requires at minimum, a `category` and an `action`. Additional data can be provided that adheres to our [Structured event taxonomy](#structured-event-taxonomy).
+GitLab provides `Tracking`, an interface that wraps the [Snowplow JavaScript Tracker](https://github.com/snowplow/snowplow/wiki/javascript-tracker) for tracking custom events. There are a few ways to use tracking, but each generally requires at minimum, a `category` and an `action`. Additional data can be provided that adheres to our [Structured event taxonomy](#structured-event-taxonomy).
| field | type | default value | description |
|:-----------|:-------|:---------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
@@ -294,7 +294,7 @@ Custom event tracking and instrumentation can be added by directly calling the `
| `action` | string | 'generic' | The action being taken, which can be anything from a controller action like `create` to something like an Active Record callback. |
| `data` | object | {} | Additional data such as `label`, `property`, `value`, and `context` as described in [Structured event taxonomy](#structured-event-taxonomy). These are set as empty strings if you don't provide them. |
-Tracking can be viewed as either tracking user behavior, or can be utilized for instrumentation to monitor and visualize performance over time in an area or aspect of code.
+Tracking can be viewed as either tracking user behavior, or can be used for instrumentation to monitor and visualize performance over time in an area or aspect of code.
For example:
diff --git a/doc/development/product_analytics/usage_ping.md b/doc/development/product_analytics/usage_ping.md
index d482af77d8a..fa785d934cb 100644
--- a/doc/development/product_analytics/usage_ping.md
+++ b/doc/development/product_analytics/usage_ping.md
@@ -15,7 +15,7 @@ This guide describes Usage Ping's purpose and how it's implemented.
For more information about Product Analytics, see:
-- [Product Analytics Guide](index.md)
+- [Product Analytics Guide](https://about.gitlab.com/handbook/product/product-analytics-guide/)
- [Snowplow Guide](snowplow.md)
More useful links:
@@ -270,7 +270,7 @@ Implemented using Redis methods [PFADD](https://redis.io/commands/pfadd) and [PF
##### Adding new events
-1. Define events in [`known_events.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events.yml).
+1. Define events in [`known_events`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events/).
Example event:
@@ -312,6 +312,7 @@ Implemented using Redis methods [PFADD](https://redis.io/commands/pfadd) and [PF
- `aggregation`: aggregation `:daily` or `:weekly`. The argument defines how we build the Redis
keys for data storage. For `daily` we keep a key for metric per day of the year, for `weekly` we
keep a key for metric per week of the year.
+ - `feature_flag`: optional. For details, see our [GitLab internal Feature flags](../feature_flags/) documentation.
1. Track event in controller using `RedisTracking` module with `track_redis_hll_event(*controller_actions, name:, feature:, feature_default_enabled: false)`.
@@ -402,7 +403,7 @@ Implemented using Redis methods [PFADD](https://redis.io/commands/pfadd) and [PF
| `event` | string | yes | The event name it should be tracked |
Response
-w
+
Return 200 if tracking failed for any reason.
- `200` if event was tracked or any errors
@@ -412,7 +413,7 @@ w
1. Track events using JavaScript/Vue API helper which calls the API above
- Example usage for an existing event already defined in [known events](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events.yml):
+ Example usage for an existing event already defined in [known events](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events/):
Note that `usage_data_api` and `usage_data_#{event_name}` should be enabled in order to be able to track events
@@ -422,20 +423,46 @@ w
api.trackRedisHllUserEvent('my_already_defined_event_name'),
```
-1. Track event using base module `Gitlab::UsageDataCounters::HLLRedisCounter.track_event(entity_id, event_name)`.
+1. Track event using base module `Gitlab::UsageDataCounters::HLLRedisCounter.track_event(values, event_name)`.
+
+ Arguments:
+
+ - `values`: One value or array of values we count. For example: user_id, visitor_id, user_ids.
+ - `event_name`: event name.
+
+1. Track event on context level using base module `Gitlab::UsageDataCounters::HLLRedisCounter.track_event_in_context(entity_id, event_name, context)`.
Arguments:
- `entity_id`: value we count. For example: user_id, visitor_id.
- `event_name`: event name.
+ - `context`: context value. Allowed values are `default`, `free`, `bronze`, `silver`, `gold`, `starter`, `premium`, `ultimate`
-1. Get event data using `Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names:, start_date:, end_date)`.
+1. Get event data using `Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names:, start_date:, end_date:, context: '')`.
Arguments:
- `event_names`: the list of event names.
- `start_date`: start date of the period for which we want to get event data.
- `end_date`: end date of the period for which we want to get event data.
+ - `context`: context of the event. Allowed values are `default`, `free`, `bronze`, `silver`, `gold`, `starter`, `premium`, `ultimate`.
+
+1. Testing tracking and getting unique events
+
+Trigger events in the Rails console by using the `track_event` method:
+
+ ```ruby
+ Gitlab::UsageDataCounters::HLLRedisCounter.track_event(1, 'g_compliance_audit_events')
+ Gitlab::UsageDataCounters::HLLRedisCounter.track_event(2, 'g_compliance_audit_events')
+ ```
+
+Next, get the unique events for the current week.
+
+ ```ruby
+ # Get unique events for metric for current_week
+ Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names: 'g_compliance_audit_events',
+ start_date: Date.current.beginning_of_week, end_date: Date.current.end_of_week)
+ ```
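
Note on the hunk above: the following is a minimal, in-memory sketch of the track-then-count flow, not the GitLab implementation. It assumes exact `Set` storage instead of the Redis `PFADD`/`PFCOUNT` commands, only the `:weekly` aggregation, and a hypothetical key layout of one key per event per ISO week. The class name `InMemoryUniqueCounter` and the `weekly_key` helper are illustrative only.

```ruby
require 'set'
require 'date'

# Hypothetical stand-in for a unique-event counter. A Set gives exact counts;
# Redis HyperLogLog (PFADD/PFCOUNT) would give approximate ones.
class InMemoryUniqueCounter
  def initialize
    @store = Hash.new { |hash, key| hash[key] = Set.new }
  end

  # values: one value or an array of values (for example user IDs).
  def track_event(values, event_name, date = Date.today)
    @store[weekly_key(event_name, date)].merge(Array(values))
  end

  # Counts distinct values across every week touched by the date range.
  def unique_events(event_names:, start_date:, end_date:)
    keys = Array(event_names).flat_map do |name|
      (start_date..end_date).map { |date| weekly_key(name, date) }
    end.uniq
    keys.reduce(Set.new) { |union, key| union | @store.fetch(key, Set.new) }.size
  end

  private

  # Assumed key layout: one key per event per ISO week of the year.
  def weekly_key(event_name, date)
    format('%s-%d-%02d', event_name, date.cwyear, date.cweek)
  end
end

counter = InMemoryUniqueCounter.new
monday = Date.new(2020, 11, 2)
counter.track_event(1, 'g_compliance_audit_events', monday)
counter.track_event(2, 'g_compliance_audit_events', monday + 1)
counter.track_event(1, 'g_compliance_audit_events', monday + 2) # repeated ID, not counted twice
puts counter.unique_events(event_names: 'g_compliance_audit_events',
                           start_date: monday, end_date: monday + 6)
```

Tracking the same ID twice within one week does not increase the count, which is the property the Rails console test above relies on.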
Recommendations:
@@ -445,9 +472,23 @@ Recommendations:
- Use a [feature flag](../../operations/feature_flags.md) to have a control over the impact when
adding new metrics.
+##### Enable/Disable Redis HLL tracking
+
+Events are tracked behind [feature flags](../feature_flags/index.md) due to concerns for Redis performance and scalability.
+
+For a full list of events and corresponding feature flags, see the [known_events](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events/) files.
+
+To enable or disable tracking for a specific event within <https://gitlab.com> or <https://staging.gitlab.com>, run commands such as the following to
+[enable or disable the corresponding feature](../feature_flags/index.md).
+
+```shell
+/chatops run feature set <feature_name> true
+/chatops run feature set <feature_name> false
+```
+
##### Known events in usage data payload
-All events added in [`known_events.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events.yml) are automatically added to usage data generation under the `redis_hll_counters` key. This column is stored in [version-app as a JSON](https://gitlab.com/gitlab-services/version-gitlab-com/-/blob/master/db/schema.rb#L209).
+All events added in [`known_events/common.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events/common.yml) are automatically added to usage data generation under the `redis_hll_counters` key. This column is stored in [version-app as a JSON](https://gitlab.com/gitlab-services/version-gitlab-com/-/blob/master/db/schema.rb#L209).
For each event we add metrics for the weekly and monthly time frames, and totals for each where applicable:
- `#{event_name}_weekly`: Data for 7 days for daily [aggregation](#adding-new-events) events and data for the last complete week for weekly [aggregation](#adding-new-events) events.
@@ -493,7 +534,7 @@ Example usage:
redis_usage_data(Gitlab::UsageDataCounters::WikiPageCounter)
redis_usage_data { ::Gitlab::UsageCounters::PodLogs.usage_totals[:total] }
-# Define events in known_events.yml https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events.yml
+# Define events in common.yml https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events/common.yml
# Tracking events
Gitlab::UsageDataCounters::HLLRedisCounter.track_event(visitor_id, 'expand_vulnerabilities')
@@ -548,7 +589,17 @@ for how to use its API to query for data.
## Developing and testing Usage Ping
-### 1. Use your Rails console to manually test counters
+### 1. Naming and placing the metrics
+
+Add the metric under one of the top-level keys:
+
+- `license`: for license related metrics.
+- `settings`: for settings related metrics.
+- `counts_weekly`: for counters that have data for the most recent 7 days.
+- `counts_monthly`: for counters that have data for the most recent 28 days.
+- `counts`: for counters that have data for all time.
+
+### 2. Use your Rails console to manually test counters
```ruby
# count
@@ -560,7 +611,7 @@ Gitlab::UsageData.distinct_count(::Project, :creator_id)
Gitlab::UsageData.distinct_count(::Note.with_suggestions.where(time_period), :author_id, start: ::User.minimum(:id), finish: ::User.maximum(:id))
```
-### 2. Generate the SQL query
+### 3. Generate the SQL query
Your Rails console will return the generated SQL queries.
@@ -574,7 +625,7 @@ pry(main)> Gitlab::UsageData.count(User.active)
(1.9ms) SELECT COUNT("users"."id") FROM "users" WHERE ("users"."state" IN ('active')) AND ("users"."user_type" IS NULL OR "users"."user_type" IN (6, 4)) AND "users"."id" BETWEEN 1 AND 100000
```
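
Note on the hunk above: the `BETWEEN 1 AND 100000` bound in the generated SQL suggests that rows are counted over an ID range rather than in one full scan. The sketch below illustrates that idea under that assumption only. `batch_count` and its parameters are hypothetical names, and a plain array stands in for the table.

```ruby
# Hypothetical batched counter: sums per-range counts instead of
# counting every row in a single pass.
def batch_count(ids, start:, finish:, batch_size:)
  total = 0
  batch_start = start
  while batch_start <= finish
    batch_end = [batch_start + batch_size - 1, finish].min
    # Comparable in spirit to: WHERE id BETWEEN batch_start AND batch_end
    total += ids.count { |id| id.between?(batch_start, batch_end) }
    batch_start = batch_end + 1
  end
  total
end

ids = [1, 5, 99_999, 100_000, 150_000, 250_001]
puts batch_count(ids, start: 1, finish: 300_000, batch_size: 100_000)
```

Keeping each range small bounds the cost of every individual query, which is why a custom `batch_size` can matter for expensive counters.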
-### 3. Optimize queries with #database-lab
+### 4. Optimize queries with #database-lab
Paste the SQL query into `#database-lab` to see how the query performs at scale.
@@ -601,27 +652,27 @@ We also use `#database-lab` and [explain.depesz.com](https://explain.depesz.com/
- Avoid joins and write the queries as simply as possible, [example](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/36316).
- Set a custom `batch_size` for `distinct_count`, [example](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/38000).
-### 4. Add the metric definition
+### 5. Add the metric definition
-When adding, changing, or updating metrics, please update the [Event Dictionary's **Usage Ping** table](event_dictionary.md).
+When adding, changing, or updating metrics, please update the [Event Dictionary's **Usage Ping** table](https://about.gitlab.com/handbook/product/product-analytics-guide#event-dictionary).
-### 5. Add new metric to Versions Application
+### 6. Add new metric to Versions Application
Check if new metrics need to be added to the Versions Application. See `usage_data` [schema](https://gitlab.com/gitlab-services/version-gitlab-com/-/blob/master/db/schema.rb#L147) and usage data [parameters accepted](https://gitlab.com/gitlab-services/version-gitlab-com/-/blob/master/app/services/usage_ping.rb). Any metrics added under the `counts` key are saved in the `stats` column.
-### 6. Add the feature label
+### 7. Add the feature label
Add the `feature` label to the Merge Request for new Usage Ping metrics. These are user-facing changes and are part of expanding the Usage Ping feature.
-### 7. Add a changelog file
+### 8. Add a changelog file
Ensure you comply with the [Changelog entries guide](../changelog.md).
-### 8. Ask for a Product Analytics Review
+### 9. Ask for a Product Analytics Review
On GitLab.com, we have DangerBot setup to monitor Product Analytics related files and DangerBot will recommend a Product Analytics review. Mention `@gitlab-org/growth/product_analytics/engineers` in your MR for a review.
-### 9. Verify your metric
+### 10. Verify your metric
On GitLab.com, the Product Analytics team regularly monitors Usage Ping. They may alert you that your metrics need further optimization to run quicker and with greater success. You may also use the [Usage Ping QA dashboard](https://app.periscopedata.com/app/gitlab/632033/Usage-Ping-QA) to check how well your metric performs. The dashboard allows filtering by GitLab version, by "Self-managed" & "Saas" and shows you how many failures have occurred for each metric. Whenever you notice a high failure rate, you may re-optimize your metric.
@@ -671,6 +722,59 @@ with any of the other services that are running. That is not how node metrics ar
always runs as a process alongside other GitLab components on any given node. From Usage Ping's perspective none of the node data would therefore
appear to be associated to any of the services running, since they all appear to be running on different hosts. To alleviate this problem, the `node_exporter` in GCK was arbitrarily "assigned" to the `web` service, meaning only for this service `node_*` metrics will appear in Usage Ping.
+## Aggregated metrics
+
+> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/45979) in GitLab 13.6.
+> - It's [deployed behind a feature flag](../../user/feature_flags.md), disabled by default.
+> - It's enabled on GitLab.com.
+
+CAUTION: **Warning:**
+This feature is intended solely for internal GitLab use.
+
+To add data for aggregated metrics to the Usage Ping payload, add a corresponding definition to the [`aggregated_metrics.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/aggregated_metrics.yml) file. Each aggregate definition includes the following parts:
+
+- name: unique name under which the aggregate metric is added to the Usage Ping payload.
+- operator: operator that defines how the aggregated metric data is counted. Available operators are:
+ - `OR`: removes duplicates and counts all entries that triggered any of the listed events.
+ - `AND`: removes duplicates and counts all entries that were observed triggering all of the listed events.
+- events: list of event names (from [`known_events.yml`](#known-events-in-usage-data-payload)) to aggregate into the metric. All events in this list must have the same `redis_slot` and `aggregation` attributes.
+- feature_flag: name of the [development feature flag](../feature_flags/development.md#development-type) that is checked before
+metrics aggregation is performed. The corresponding feature flag should have its `default_enabled` attribute set to `false`.
+The `feature_flag` attribute is **OPTIONAL** and can be omitted. When `feature_flag` is missing, no feature flag is checked.
+
+Example aggregated metric entries:
+
+```yaml
+- name: product_analytics_test_metrics_union
+ operator: OR
+ events: ['i_search_total', 'i_search_advanced', 'i_search_paid']
+- name: product_analytics_test_metrics_intersection_with_feautre_flag
+ operator: AND
+ events: ['i_search_total', 'i_search_advanced', 'i_search_paid']
+ feature_flag: example_aggregated_metric
+```
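
Note on the hunk above: the `OR` and `AND` operators can be read as a set union and a set intersection over the IDs that triggered each event. The sketch below uses exact Ruby `Set` values with made-up IDs. The `aggregate` helper is hypothetical, and how the Redis HLL data is actually combined is not shown in this diff.

```ruby
require 'set'

# Hypothetical helper. `events` maps an event name to the Set of IDs
# that triggered it.
def aggregate(events, names, operator)
  sets = names.map { |name| events.fetch(name, Set.new) }
  return 0 if sets.empty?

  case operator
  when 'OR' then sets.reduce(:|).size  # triggered any of the listed events
  when 'AND' then sets.reduce(:&).size # triggered all of the listed events
  else raise ArgumentError, "unknown operator: #{operator}"
  end
end

# Made-up IDs for illustration only.
events = {
  'i_search_total' => Set[1, 2, 3, 4],
  'i_search_advanced' => Set[2, 3, 5],
  'i_search_paid' => Set[3, 4, 5]
}

puts aggregate(events, events.keys, 'OR')
puts aggregate(events, events.keys, 'AND')
```

In both cases an ID is counted at most once, which matches the "removes duplicates" wording in the operator descriptions.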
+
+Aggregated metrics are added under the `aggregated_metrics` key in both the `counts_weekly` and `counts_monthly` top-level keys of the Usage Ping payload.
+
+```ruby
+{
+ :counts_monthly => {
+ :deployments => 1003,
+ :successful_deployments => 78,
+ :failed_deployments => 275,
+ :packages => 155,
+ :personal_snippets => 2106,
+ :project_snippets => 407,
+ :promoted_issues => 719,
+ :aggregated_metrics => {
+ :product_analytics_test_metrics_union => 7,
+ :product_analytics_test_metrics_intersection_with_feautre_flag => 2
+ },
+ :snippets => 2513
+ }
+}
+```
+
## Example Usage Ping payload
The following is example content of the Usage Ping payload.
@@ -935,3 +1039,7 @@ bin/rake gitlab:usage_data:dump_sql_in_json
# You may pipe the output into a file
bin/rake gitlab:usage_data:dump_sql_in_yaml > ~/Desktop/usage-metrics-2020-09-02.yaml
```
+
+## Generating and troubleshooting usage ping
+
+To get a usage ping, or to troubleshoot caching issues on your GitLab instance, please follow [instructions to generate usage ping](../../administration/troubleshooting/gitlab_rails_cheat_sheet.md#generate-usage-ping).