diff options
Diffstat (limited to 'doc/development/value_stream_analytics.md')
-rw-r--r-- | doc/development/value_stream_analytics.md | 178 |
1 files changed, 126 insertions, 52 deletions
diff --git a/doc/development/value_stream_analytics.md b/doc/development/value_stream_analytics.md index 0d545fa8e3f..bbea89d5645 100644 --- a/doc/development/value_stream_analytics.md +++ b/doc/development/value_stream_analytics.md @@ -6,33 +6,86 @@ info: To determine the technical writer assigned to the Stage/Group associated w # Value stream analytics development guide -Value stream analytics calculates the time between two arbitrary events recorded on domain objects and provides aggregated statistics about the duration. +For information on how to configure value stream analytics (VSA) in GitLab, see our [analytics documentation](../user/analytics/value_stream_analytics.md). -For information on how to configure value stream analytics in GitLab, see our [analytics documentation](../user/analytics/value_stream_analytics.md). +## How does Value Stream Analytics work? -## Stage +Value Stream Analytics calculates the duration between two timestamp columns or timestamp +expressions and runs various aggregations on the data. -During development, events occur that move issues and merge requests through different stages of progress until they are considered finished. These stages can be expressed with the `Stage` model. +For example: -Example stage: +- Duration between the Merge Request creation time and Merge Request merge time. +- Duration between the Issue creation time and Issue close time. -- Name: Development -- Start event: Issue created -- End event: Issue first mentioned in commit -- Parent: `Group: gitlab-org` +This duration is exposed in various ways: + +- Aggregation: median, average +- Listing: list the duration for individual Merge Request and Issue records + +Apart from the durations, we expose the record count within a stage. + +## Feature availability + +- Group level (licensed): Requires Ultimate or Premium subscription. This version is the most +feature-full. +- Project level (licensed): We are continually adding features to project level VSA to bring it in line with group level VSA. +- Project level (FOSS): Keep it as is. + +|Feature|Group level (licensed)|Project level (licensed)|Project level (FOSS)| +|-|-|-|-| +|Create custom value streams|Yes|No, only one value stream (default) is present with the default stages|no, only one value stream (default) is present with the default stages| +|Create custom stages|Yes|No|No| +|Filtering (author, label, milestone, etc.)|Yes|Yes|Yes| +|Stage time chart|Yes|No|No| +|Total time chart|Yes|No|No| +|Task by type chart|Yes|No|No| +|DORA Metrics|Yes|Yes|No| +|Cycle time and lead time summary (Key metrics)|Yes|Yes|No| +|New issues, commits and deploys (Key metrics)|Yes, excluding commits|Yes|Yes| +|Uses aggregated backend|Yes|No|No| +|Date filter behavior|Filters items [finished within the date range](https://gitlab.com/groups/gitlab-org/-/epics/6046)|Filters items by creation date.|Filters items by creation date.| +|Authorization|At least reporter|At least reporter|Can be public.| + +## VSA core domain objects + +### Stages + +A stage represents an event pair (start and end events) with additional metadata, such as the name +of the stage. Stages are configurable by the user within the pairing rules defined in the backend. + +**Example stage: Code Review** + +- Start event identifier: Merge request creation time. +- Start event column: uses the `merge_requests.created_at` timestamp column. +- End event identifier: Merge request merge time. +- End event column: uses the `merge_request_metrics.merged_at` timestamp column. +- Stage event hash ID: a calculated hash for the pair of start and end event identifiers. + - If two stages have the same configuration of start and end events, then their stage event hash. + IDs are identical. + - The stage event hash ID is later used to store the aggregated data in partitioned database tables. + +Historically, value stream analytics defined [7 stages](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/analytics/cycle_analytics/default_stages.rb) +which are always available to the end-users regardless of the subscription. + +### Value streams + +Value streams are container objects for the stages. There can be multiple value streams per group +focusing on different aspects of the Dev Ops lifecycle. ### Events Events are the smallest building blocks of the value stream analytics feature. A stage consists of two events: -- Start -- End +- Start event +- End event These events play a key role in the duration calculation. Formula: `duration = end_event_time - start_event_time` -To make the duration calculation flexible, each `Event` is implemented as a separate class. They're responsible for defining a timestamp expression that is used in the calculation query. +To make the duration calculation flexible, each `Event` is implemented as a separate class. +They're responsible for defining a timestamp expression that is used in the calculation query. #### Implementing an `Event` class @@ -81,7 +134,7 @@ def apply_query_customization(query) end ``` -### Validating start and end events +#### Validating start and end events Some start/end event pairs are not "compatible" with each other. For example: @@ -171,23 +224,7 @@ graph LR; MergeRequestLabelRemoved --> MergeRequestLabelRemoved; ``` -### Parent - -Teams and organizations might define their own way of building software, thus stages can be completely different. For each stage, a parent object needs to be defined. - -Currently supported parents: - -- `Project` -- `Group` - -#### How parent relationship it work - -1. User navigates to the value stream analytics page. -1. User selects a group. -1. Backend loads the defined stages for the selected group. -1. Additions and modifications to the stages are persisted within the selected group only. - -### Default stages +## Default stages The [original implementation](https://gitlab.com/gitlab-org/gitlab/-/issues/847) of value stream analytics defined 7 stages. These stages are always available for each parent, however altering these stages is not possible. @@ -209,31 +246,15 @@ The reason for this was that we'd like to add the abilities to hide and order st For a new calculation or a query, implement it as a new method call in the `DataCollector` class. -## Database query - -Structure of the database query: - -```sql -SELECT (customized by: Median or RecordsFetcher or DataForDurationChart) -FROM OBJECT_TYPE (Issue or MergeRequest) -INNER JOIN (several JOIN statements, depending on the events) -WHERE - (Filter by the PARENT model, example: filter Issues from Project A) - (Date range filter based on the OBJECT_TYPE.created_at) - (Check if the START_EVENT is earlier than END_EVENT, preventing negative duration) -``` - -Structure of the `SELECT` statement for `Median`: +To support the aggregated value stream analytics backend, these classes were reimplemented within [`Aggregated`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/lib/gitlab/analytics/cycle_analytics/aggregated) namespace. -```sql -SELECT (calculate median from START_EVENT_TIME-END_EVENT_TIME) -``` +### Database query backend -Structure of the `SELECT` statement for `DataForDurationChart`: +VSA supports two backends: [aggregated](value_stream_analytics/value_stream_analytics_aggregated_backend.md) and "live". The live query backend can be +considered legacy, which will be phased out at some point. -```sql -SELECT (START_EVENT_TIME-END_EVENT_TIME) as duration, END_EVENT.timestamp -``` +- "live": uses the standard `IssuableFinders`. +- aggregated: queries data from pre-aggregated database tables. ## High-level overview @@ -244,6 +265,31 @@ SELECT (START_EVENT_TIME-END_EVENT_TIME) as duration, END_EVENT.timestamp - Responsible for composing queries and define feature specific business logic. - `DataCollector`, `Event`, `StageEvents`, etc. +## Frontend + +[Project VSA](../user/analytics/value_stream_analytics.md) is available for all users and: + +- Includes a mixture of key and DORA metrics based on the tier. +- Uses the set of [default stages](#default-stages). + +[Group VSA](../user/group/value_stream_analytics/index.md) is only available for licensed users and extends project VSA to include: + +- An [overview stage](https://gitlab.com/gitlab-org/gitlab/-/issues/321438). +- The ability to create custom value streams. + +The group and project level VSA frontends are both built with Vue and Vuex and follow a similar pattern: + +- The `index.js` file extracts any URL query parameters, creates the Vue app and Vuex store, and dispatches an `initialize` Vuex action. +- The `base.vue` file is used to render the main components for each page, metrics, filters, charts, and the stage table. + +The group VSA Vuex store makes use of [Vuex modules](https://vuex.vuejs.org/guide/modules.html) to separate some of the state and logic used for rendering the charts. + +### Shared components + +Parts of the UI are shared between project VSA and group VSA such as the stage table and path. These shared components live in the project VSA directory `app/assets/javascripts/cycle_analytics/components` and are included at the group level VSA where needed. + +All the frontend code for group-level features are located in `ee/app/assets/javascripts/analytics/cycle_analytics/components`. + ## Testing Since we have a lots of events and possible pairings, testing each pairing is not possible. The rule is to have at least one test case using an `Event` class. @@ -252,3 +298,31 @@ Writing a test case for a stage using a new `Event` can be challenging since dat - Different parents: `Group` or `Project` - Different calculations: `Median`, `RecordsFetcher` or `DataForDurationChart` + +The VSA frontend is tested extensively on two different levels (integration, unit): + +- End-to-end integration tests using a real backend via Capybara and RSpec. +- Jest frontend tests with pre-generated data fixtures. + +## Development setup and testing + +Running Value Stream Analytics can be done via the [GDK](https://gitlab.com/gitlab-org/gitlab-development-kit). By default, you'll be able to view the project-level (FOSS) version of the feature. + +If your GDK is up and running, you can run the seed script to generate some data: + +```shell +SEED_CYCLE_ANALYTICS=true SEED_VSA=true FILTER=cycle_analytics rake db:seed_fu +``` + +The data generator script creates a new group and a new project with issue and merge request +data (see the output of the script). To view the group-level version of the feature, you +need to request a license for your GDK instance. + +After this step, you can access the group level value stream analytics page where you can create +value streams and stages. The data aggregation might be delayed so you might not see the +data right after the stage creation. To speed up this process, you can run the following command +in your rails console (`rails c`): + +```ruby +Analytics::CycleAnalytics::ReaggregationWorker.new.perform +``` |