Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitlab-foss.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
Diffstat (limited to 'doc/user/storage_management_automation.md')
-rw-r--r--doc/user/storage_management_automation.md1145
1 files changed, 1145 insertions, 0 deletions
diff --git a/doc/user/storage_management_automation.md b/doc/user/storage_management_automation.md
new file mode 100644
index 00000000000..210aca4ee35
--- /dev/null
+++ b/doc/user/storage_management_automation.md
@@ -0,0 +1,1145 @@
+---
+type: howto
+stage: Fulfillment
+group: Utilization
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
+---
+
+# Storage management automation **(FREE ALL)**
+
+You can manage your storage through the GitLab UI and the API. This page describes how to
+automate storage analysis and cleanup to manage your [usage quota](usage_quotas.md). You can also
+manage your storage usage by making your pipelines more efficient. For more information, see [pipeline efficiency](../ci/pipelines/pipeline_efficiency.md).
+
+You can also use the [GitLab community forum and Discord](https://about.gitlab.com/community/) to ask for help with API automation.
+
+## API requirements
+
+To automate storage management, your GitLab.com SaaS or self-managed instance must have access to the [GitLab REST API](../api/api_resources.md).
+
+### API authentication scope
+
+You must use the following scopes to [authenticate](../api/rest/index.md#authentication) with the API:
+
+- Storage analysis:
+ - Read API access with the `read_api` scope.
+ - At least the Developer role on all projects.
+- Storage clean up:
+ - Full API access with the `api` scope.
+ - At least the Maintainer role on all projects.
+
+You can use command line tools or a programming language to interact with the REST API.
+
+### Command line
+
+You must install the following tools to send API requests:
+
+- Install `curl` with your preferred package manager.
+- Install the [GitLab CLI](../editor_extensions/gitlab_cli/index.md) and use the `api` subcommand.
+- Install `jq` to format JSON responses. For more information, see [Tips for productive DevOps workflows: JSON formatting with jq and CI/CD linting automation](https://about.gitlab.com/blog/2021/04/21/devops-workflows-json-format-jq-ci-cd-lint/).
+
+Example with `curl` and `jq`:
+
+```shell
+export GITLAB_TOKEN=xxx
+
+curl --silent --header "Authorization: Bearer $GITLAB_TOKEN" "https://gitlab.com/api/v4/user" | jq
+```
+
+Example with the [GitLab CLI](../editor_extensions/gitlab_cli/index.md):
+
+```shell
+glab auth login
+
+glab api groups/YOURGROUPNAME/projects
+```
+
+#### Using the GitLab CLI
+
+Some API endpoints require [pagination](../api/rest/index.md#pagination) and subsequent page fetches to retrieve all results. The [GitLab CLI](../editor_extensions/gitlab_cli/index.md) provides the flag `--paginate`.
+
+Requests that require sending a POST body formatted as JSON data can be written as `key=value` pairs passed to the `--raw-field` parameter.
+
+For more information, see the [GitLab CLI endpoint documentation](../editor_extensions/gitlab_cli/index.md#core-commands).
+
+### API client libraries
+
+The storage management and cleanup automation methods described in this page use the [`python-gitlab`](https://python-gitlab.readthedocs.io/en/stable/) library in programmatic example. The `python-gitlab` library provides
+a feature-rich programming interface. For more information about use cases for the `python-gitlab` library,
+see [Efficient DevSecOps workflows: Hands-on `python-gitlab` API automation](https://about.gitlab.com/blog/2023/02/01/efficient-devsecops-workflows-hands-on-python-gitlab-api-automation/).
+
+For more information about other API client libraries, see [Third-party clients](../api/rest/index.md#third-party-clients).
+
+NOTE:
+Use [GitLab Duo Code Suggestions](project/repository/code_suggestions.md) to write code more efficiently.
+
+## Strategies for storage analysis
+
+### Identify the storage types
+
+The [projects API endpoint](../api/projects.md#list-all-projects) provides statistics for projects
+in your GitLab instance. To use the projects API endpoint, set the `statistics` key to boolean `true`.
+This data provides insight into storage consumption of the project by the following storage types:
+
+- `storage_size`: Overall storage
+- `lfs_objects_size`: LFS objects storage
+- `job_artifacts_size`: Job artifacts storage
+- `packages_size`: Packages storage
+- `repository_size`: Git repository storage
+- `snippets_size`: Snippets storage
+- `uploads_size`: Uploads storage
+- `wiki_size`: Wiki storage
+
+Additional queries are required for detailed storage statistics for [job artifacts](../api/job_artifacts.md), the [container registry](../api/container_registry.md), the [package registry](../api/packages.md) and [dependency proxy](../api/dependency_proxy.md). It is explained later in this how-to.
+
+Example that uses `curl` and `jq` on the command line:
+
+```shell
+curl --silent --header "Authorization: Bearer $GITLAB_TOKEN" "https://gitlab.com/api/v4/projects/$GL_PROJECT_ID?statistics=true" | jq --compact-output '.id,.statistics' | jq
+48349590
+{
+ "commit_count": 2,
+ "storage_size": 90241770,
+ "repository_size": 3521,
+ "wiki_size": 0,
+ "lfs_objects_size": 0,
+ "job_artifacts_size": 90238249,
+ "pipeline_artifacts_size": 0,
+ "packages_size": 0,
+ "snippets_size": 0,
+ "uploads_size": 0
+}
+```
+
+Example that uses the [GitLab CLI](../editor_extensions/gitlab_cli/index.md):
+
+```shell
+export GL_PROJECT_ID=48349590
+glab api --method GET projects/$GL_PROJECT_ID --field 'statistics=true' | jq --compact-output '.id,.statistics' | jq
+48349590
+{
+ "commit_count": 2,
+ "storage_size": 90241770,
+ "repository_size": 3521,
+ "wiki_size": 0,
+ "lfs_objects_size": 0,
+ "job_artifacts_size": 90238249,
+ "pipeline_artifacts_size": 0,
+ "packages_size": 0,
+ "snippets_size": 0,
+ "uploads_size": 0
+}
+```
+
+Example using the `python-gitlab` library:
+
+```python
+project_obj = gl.projects.get(project.id, statistics=True)
+
+print("Project {n} statistics: {s}".format(n=project_obj.name_with_namespace, s=json.dump(project_obj.statistics, indent=4)))
+```
+
+You can find an example implementation in the script `get_all_projects_top_level_namespace_storage_analysis_cleanup_example.py` which is located in the [GitLab API with Python project](https://gitlab.com/gitlab-de/use-cases/gitlab-api/gitlab-api-python/). Export the `GL_GROUP_ID` environment variable and run the script to see the project statistics printed in the terminal.
+
+```shell
+export GL_TOKEN=xxx
+export GL_GROUP_ID=56595735
+
+pip3 install python-gitlab
+python3 get_all_projects_top_level_namespace_storage_analysis_cleanup_example.py
+
+Project Developer Evangelism and Technical Marketing at GitLab / playground / Artifact generator group / Gen Job Artifacts 4 statistics: {
+ "commit_count": 2,
+ "storage_size": 90241770,
+ "repository_size": 3521,
+ "wiki_size": 0,
+ "lfs_objects_size": 0,
+ "job_artifacts_size": 90238249,
+ "pipeline_artifacts_size": 0,
+ "packages_size": 0,
+ "snippets_size": 0,
+ "uploads_size": 0
+}
+```
+
+### Analyzing multiple subgroups and projects
+
+You can use automation to analyze multiple projects and groups. For example, you can start at the top namespace level,
+and recursively analyze all subgroups and projects. You can also analyze different storage types.
+
+Here's an example of an algorithm that analyzes multiple subgroups and projects:
+
+1. Fetch the top-level namespace ID. You can copy the ID value from the [namespace/group overview](../user/namespace/index.md#types-of-namespaces).
+1. Fetch all [subgroups](../api/groups.md#list-a-groups-subgroups) from the top-level group, and save the IDs in a list.
+1. Loop over all groups and fetch all [projects from each group](../api/groups.md#list-a-groups-projects) and save the IDs in a list.
+1. Identify the storage type to analyze, and collect the information from project attributes, like project statistics, and job artifacts.
+1. Print an overview of all projects, grouped by group, and their storage information.
+
+Example with the [GitLab CLI](../editor_extensions/gitlab_cli/index.md):
+
+```shell
+export GROUP_NAME="gitlab-de"
+
+# Return sub group IDs
+glab api groups/$GROUP_NAME/subgroups | jq --compact-output '.[]' | jq --compact-output '.id'
+12034712
+67218622
+67162711
+67640130
+16058698
+12034604
+
+# Loop over all subgroups to get subgroups, until the result set is empty. Example group: 12034712
+glab api groups/12034712/subgroups | jq --compact-output '.[]' | jq --compact-output '.id'
+56595735
+70677315
+67218606
+70812167
+
+# Lowest group level
+glab api groups/56595735/subgroups | jq --compact-output '.[]' | jq --compact-output '.id'
+# empty result, return and continue with analysis
+
+# Fetch projects from all collected groups. Example group: 56595735
+glab api groups/56595735/projects | jq --compact-output '.[]' | jq --compact-output '.id'
+48349590
+48349263
+38520467
+38520405
+
+# Fetch storage types from a project (ID 48349590): Job artifacts in the `artifacts` key
+glab api projects/48349590/jobs | jq --compact-output '.[]' | jq --compact-output '.id, .artifacts'
+4828297946
+[{"file_type":"archive","size":52444993,"filename":"artifacts.zip","file_format":"zip"},{"file_type":"metadata","size":156,"filename":"metadata.gz","file_format":"gzip"},{"file_type":"trace","size":3140,"filename":"job.log","file_format":null}]
+4828297945
+[{"file_type":"archive","size":20978113,"filename":"artifacts.zip","file_format":"zip"},{"file_type":"metadata","size":157,"filename":"metadata.gz","file_format":"gzip"},{"file_type":"trace","size":3147,"filename":"job.log","file_format":null}]
+4828297944
+[{"file_type":"archive","size":10489153,"filename":"artifacts.zip","file_format":"zip"},{"file_type":"metadata","size":158,"filename":"metadata.gz","file_format":"gzip"},{"file_type":"trace","size":3146,"filename":"job.log","file_format":null}]
+4828297943
+[{"file_type":"archive","size":5244673,"filename":"artifacts.zip","file_format":"zip"},{"file_type":"metadata","size":157,"filename":"metadata.gz","file_format":"gzip"},{"file_type":"trace","size":3145,"filename":"job.log","file_format":null}]
+4828297940
+[{"file_type":"archive","size":1049089,"filename":"artifacts.zip","file_format":"zip"},{"file_type":"metadata","size":157,"filename":"metadata.gz","file_format":"gzip"},{"file_type":"trace","size":3140,"filename":"job.log","file_format":null}]
+```
+
+While the shell approach with `glab` works for smaller analysis, you should consider a script that
+uses the API client libraries. This improves readability, storing data, flow control, testing, and reusability.
+
+You can also implement this algorithm with a Python script that uses the `python-gitlab` library:
+
+```python
+#!/usr/bin/env python
+
+import datetime
+import gitlab
+import os
+import sys
+
+GITLAB_SERVER = os.environ.get('GL_SERVER', 'https://gitlab.com')
+GITLAB_TOKEN = os.environ.get('GL_TOKEN') # token requires developer permissions
+PROJECT_ID = os.environ.get('GL_PROJECT_ID') #optional
+GROUP_ID = os.environ.get('GL_GROUP_ID') #optional
+
+if __name__ == "__main__":
+ if not GITLAB_TOKEN:
+ print("🤔 Please set the GL_TOKEN env variable.")
+ sys.exit(1)
+
+ gl = gitlab.Gitlab(GITLAB_SERVER, private_token=GITLAB_TOKEN, pagination="keyset", order_by="id", per_page=100)
+
+ # Collect all projects, or prefer projects from a group id, or a project id
+ projects = []
+
+ # Direct project ID
+ if PROJECT_ID:
+ projects.append(gl.projects.get(PROJECT_ID))
+ # Groups and projects inside
+ elif GROUP_ID:
+ group = gl.groups.get(GROUP_ID)
+
+ for project in group.projects.list(include_subgroups=True, get_all=True):
+ manageable_project = gl.projects.get(project.id , lazy=True)
+ projects.append(manageable_project)
+
+ for project in projects:
+ jobs = project.jobs.list(pagination="keyset", order_by="id", per_page=100, iterator=True)
+ for job in jobs:
+ print("DEBUG: ID {i}: {a}".format(i=job.id, a=job.attributes['artifacts']))
+```
+
+The script outputs the project job artifacts in a JSON formatted list:
+
+```json
+[
+ {
+ "file_type": "archive",
+ "size": 1049089,
+ "filename": "artifacts.zip",
+ "file_format": "zip"
+ },
+ {
+ "file_type": "metadata",
+ "size": 157,
+ "filename": "metadata.gz",
+ "file_format": "gzip"
+ },
+ {
+ "file_type": "trace",
+ "size": 3146,
+ "filename": "job.log",
+ "file_format": null
+ }
+]
+```
+
+The full script `get_all_projects_top_level_namespace_storage_analysis_cleanup_example.py` with specific examples for automating storage management and cleanup is located is located in the [GitLab API with Python](https://gitlab.com/gitlab-de/use-cases/gitlab-api/gitlab-api-python/) project. To ensure the script doesn't reach [API rate limits](../api/rest/index.md#rate-limits), the example code is not optimized for parallel API requests.
+
+### Helper functions
+
+You may need to convert timestamp seconds into a duration format, or print raw bytes in a more
+representative format. You can use the following helper functions to transform values for improved
+readability:
+
+```shell
+# Current Unix timestamp
+date +%s
+
+# Convert `created_at` date time with timezone to Unix timestamp
+date -d '2023-08-08T18:59:47.581Z' +%s
+```
+
+Example with Python that uses the `python-gitlab` API library:
+
+```python
+def render_size_mb(v):
+ return "%.4f" % (v / 1024 / 1024)
+
+def render_age_time(v):
+ return str(datetime.timedelta(seconds = v))
+
+# Convert `created_at` date time with timezone to Unix timestamp
+def calculate_age(created_at_datetime):
+ created_at_ts = datetime.datetime.strptime(created_at_datetime, '%Y-%m-%dT%H:%M:%S.%fZ')
+ now = datetime.datetime.now()
+ return (now - created_at_ts).total_seconds()
+```
+
+## Managing storage in CI/CD pipelines
+
+WARNING:
+Deleting job log and artifacts is a destructive action that cannot be reverted. Use with caution. Deleting certain files, including report artifacts, job logs, and metadata files, affects GitLab features that use these files as data sources.
+
+Job artifacts consume most of the pipeline storage, and job logs can also generate several hundreds of kilobytes.
+You should delete the unnecessary job artifacts first and then clean up job logs after analysis.
+
+### Analyze pipeline storage
+
+The following example shows a response from a query for job artifacts in a project:
+
+```json
+[
+ {
+ "file_type": "archive",
+ "size": 1049089,
+ "filename": "artifacts.zip",
+ "file_format": "zip"
+ },
+ {
+ "file_type": "metadata",
+ "size": 157,
+ "filename": "metadata.gz",
+ "file_format": "gzip"
+ },
+ {
+ "file_type": "trace",
+ "size": 3146,
+ "filename": "job.log",
+ "file_format": null
+ }
+]
+```
+
+The [Job API endpoint](../api/jobs.md#list-project-jobs) returns the job artifacts `file_type` key in the `artifacts` attribute. The the job artifacts `file_type` key provides insights into the specific artifact type:
+
+- `archive` is used for the generated job artifacts as a zip file.
+- `metadata` is used for additional metadata in a Gzip file.
+- `trace` is used for the `job.log` as a raw file.
+
+These three types are relevant for storage counting, and should be collected for a later summary. Based on the example code for fetching all projects, you can extend the Python script to do more analysis.
+
+The Python code loops over all projects, and fetches a `project_obj` object variable that contains all attributes. Because there can be many pipelines and jobs, fetching the list of jobs can be expensive in one call. Therefore, this is done using [keyset pagination](https://python-gitlab.readthedocs.io/en/stable/api-usage.html#pagination). The remaining step is to fetch the `artifacts` attribute from the `job` object.
+
+Based on how you implement the script, you could either:
+
+- Collect all job artifacts and print a summary table at the end of the script.
+- Print the information immediately.
+
+Collecting the job artifacts provides a data structure that can be written as a cache file to
+disk for example, which you can use when testing the implementation.
+
+In the following example, the job artifacts are collected in the `ci_job_artifacts` list.
+
+```python
+ ci_job_artifacts = []
+
+ for project in projects:
+ project_obj = gl.projects.get(project.id)
+
+ jobs = project.jobs.list(pagination="keyset", order_by="id", per_page=100, iterator=True)
+
+ for job in jobs:
+ artifacts = job.attributes['artifacts']
+ #print("DEBUG: ID {i}: {a}".format(i=job.id, a=json.dumps(artifacts, indent=4)))
+ if not artifacts:
+ continue
+
+ for a in artifacts:
+ data = {
+ "project_id": project_obj.id,
+ "project_web_url": project_obj.name,
+ "project_path_with_namespace": project_obj.path_with_namespace,
+ "job_id": job.id,
+ "artifact_filename": a['filename'],
+ "artifact_file_type": a['file_type'],
+ "artifact_size": a['size']
+ }
+
+ ci_job_artifacts.append(data)
+
+ print("\nDone collecting data.")
+
+ if len(ci_job_artifacts) > 0:
+ print("|Project|Job|Artifact name|Artifact type|Artifact size|\n|-|-|-|-|-|") #Start markdown friendly table
+ for artifact in ci_job_artifacts:
+ print('| [{project_name}]({project_web_url}) | {job_name} | {artifact_name} | {artifact_type} | {artifact_size} |'.format(project_name=artifact['project_path_with_namespace'], project_web_url=artifact['project_web_url'], job_name=artifact['job_id'], artifact_name=artifact['artifact_filename'], artifact_type=artifact['artifact_file_type'], artifact_size=render_size_mb(artifact['artifact_size'])))
+ else:
+ print("No artifacts found.")
+```
+
+At the end of the script, the job artifacts are printed as a Markdown formatted table. You can copy the table content into a new issue comment or description, or populate a Markdown file in a GitLab repository.
+
+```shell
+$ python3 get_all_projects_top_level_namespace_storage_analysis_cleanup_example.py
+
+|Project|Job|Artifact name|Artifact type|Artifact size|
+|-|-|-|-|-|
+| [gitlab-de/playground/artifact-gen-group/gen-job-artifacts-4](Gen Job Artifacts 4) | 4828297946 | artifacts.zip | archive | 50.0154 |
+| [gitlab-de/playground/artifact-gen-group/gen-job-artifacts-4](Gen Job Artifacts 4) | 4828297946 | metadata.gz | metadata | 0.0001 |
+| [gitlab-de/playground/artifact-gen-group/gen-job-artifacts-4](Gen Job Artifacts 4) | 4828297946 | job.log | trace | 0.0030 |
+| [gitlab-de/playground/artifact-gen-group/gen-job-artifacts-4](Gen Job Artifacts 4) | 4828297945 | artifacts.zip | archive | 20.0063 |
+| [gitlab-de/playground/artifact-gen-group/gen-job-artifacts-4](Gen Job Artifacts 4) | 4828297945 | metadata.gz | metadata | 0.0001 |
+| [gitlab-de/playground/artifact-gen-group/gen-job-artifacts-4](Gen Job Artifacts 4) | 4828297945 | job.log | trace | 0.0030 |
+```
+
+The full example of the script `get_all_projects_top_level_namespace_storage_analysis_cleanup_example.py` is located in the [GitLab API with Python project](https://gitlab.com/gitlab-de/use-cases/gitlab-api/gitlab-api-python/). To ensure the script doesn't hit [API rate limits](../api/rest/index.md#rate-limits), the example code is not optimized for parallel API requests.
+
+### Delete job artifacts
+
+You can use a filter to select the types of job artifacts to delete in bulk. A typical request:
+
+- Deletes job artifacts older than the specified number of days.
+- Deletes job artifacts that exceed a specified amount of storage. For example, 100 MB.
+
+You can use a Python script to implement this type of filter. You can filter the API queries results, and compare
+the `created_at` value to calculate the artifact age.
+
+You can also loop over all job artifacts and compare their `size` attribute to see whether they match
+the size threshold. When a matching job has been found, it is marked for deletion. Because of the
+analysis that happens when the script loops through job attributes, the job can be marked as deleted
+only. When the collection loops remove the object locks, all marked as deleted jobs can actually be deleted.
+
+```python
+ for project in projects:
+ project_obj = gl.projects.get(project.id)
+
+ jobs = project.jobs.list(pagination="keyset", order_by="id", per_page=100, iterator=True)
+
+ for job in jobs:
+ artifacts = job.attributes['artifacts']
+ if not artifacts:
+ continue
+
+ # Advanced filtering: Age and Size
+ # Example: 90 days, 10 MB threshold (TODO: Make this configurable)
+ threshold_age = 90 * 24 * 60 * 60
+ threshold_size = 10 * 1024 * 1024
+
+ # job age, need to parse API format: 2023-08-08T22:41:08.270Z
+ created_at = datetime.datetime.strptime(job.created_at, '%Y-%m-%dT%H:%M:%S.%fZ')
+ now = datetime.datetime.now()
+ age = (now - created_at).total_seconds()
+ # Shorter: Use a function
+ # age = calculate_age(job.created_at)
+
+ for a in artifacts:
+ # ... removed analysis collection code for readability
+
+ # Advanced filtering: match job artifacts age and size against thresholds
+ if (float(age) > float(threshold_age)) or (float(a['size']) > float(threshold_size)):
+ # mark job for deletion (cannot delete inside the loop)
+ jobs_marked_delete_artifacts.append(job)
+
+ print("\nDone collecting data.")
+
+ # Advanced filtering: Delete all job artifacts marked to being deleted.
+ for job in jobs_marked_delete_artifacts:
+ # delete the artifact
+ print("DEBUG", job)
+ job.delete_artifacts()
+
+ # Print collection summary (removed for readability)
+```
+
+The full example of the script `get_all_projects_top_level_namespace_storage_analysis_cleanup_example.py` is located in the [GitLab API Python project](https://gitlab.com/gitlab-de/use-cases/gitlab-api/gitlab-api-python/).
+
+#### Delete all job artifacts for a project
+
+If you do not need the project's [job artifacts](../ci/jobs/job_artifacts.md), you can
+use the following command to delete them all. This action cannot be reverted.
+
+Job artifact deletion happens asynchronously in GitLab and can take a while to complete in the background. Subsequent analysis queries against the API can still return the artifacts as a false-positive result. Artifact deletion can take minutes or hours, depending on the artifacts to delete. To avoid confusion with results, do not run immediate additional API requests.
+
+The [artifacts for the most recent successful jobs](../ci/jobs/job_artifacts.md#keep-artifacts-from-most-recent-successful-jobs) are also kept by default.
+
+Example with curl:
+
+```shell
+export GL_PROJECT_ID=48349590
+
+curl --silent --header "Authorization: Bearer $GITLAB_TOKEN" --request DELETE "https://gitlab.com/api/v4/projects/$GL_PROJECT_ID/artifacts"
+```
+
+Example with the [GitLab CLI](../editor_extensions/gitlab_cli/index.md):
+
+```shell
+glab api --method GET projects/$GL_PROJECT_ID/jobs | jq --compact-output '.[]' | jq --compact-output '.id, .artifacts'
+
+glab api --method DELETE projects/$GL_PROJECT_ID/artifacts
+```
+
+Example with the [`python-gitlab` library](https://python-gitlab.readthedocs.io/en/stable/gl_objects/pipelines_and_jobs.html#jobs):
+
+```python
+ project.artifacts.delete()
+```
+
+### Delete job logs
+
+When you delete a job log you also [erase the entire job](../api/jobs.md#erase-a-job).
+
+Example with the [GitLab CLI](../editor_extensions/gitlab_cli/index.md):
+
+```shell
+glab api --method GET projects/$GL_PROJECT_ID/jobs | jq --compact-output '.[]' | jq --compact-output '.id'
+
+4836226184
+4836226183
+4836226181
+4836226180
+
+glab api --method POST projects/$GL_PROJECT_ID/jobs/4836226180/erase | jq --compact-output '.name,.status'
+"generate-package: [1]"
+"success"
+```
+
+In the `python-gitlab` API library, you must use [`job.erase()`](https://python-gitlab.readthedocs.io/en/stable/gl_objects/pipelines_and_jobs.html#jobs) instead of `job.delete_artifacts()`.
+To avoid this API call from being blocked, set the script to sleep for a short amount of time between calls
+that delete the job artifact.
+
+```python
+ for job in jobs_marked_delete_artifacts:
+ # delete the artifacts and job log
+ print("DEBUG", job)
+ #job.delete_artifacts()
+ job.erase()
+ # Sleep for 1 second
+ time.sleep(1)
+```
+
+The full example of the script `get_all_projects_top_level_namespace_storage_analysis_cleanup_example.py` is located in the [GitLab API with Python project](https://gitlab.com/gitlab-de/use-cases/gitlab-api/gitlab-api-python/).
+
+Support for creating a retention policy for job logs is proposed in [issue 374717](https://gitlab.com/gitlab-org/gitlab/-/issues/374717).
+
+### Inventory of job artifacts expiry settings
+
+To manage artifact storage, you can update or configure when an artifact expires.
+The expiry setting for artifacts are configured in each job configuration in the `.gitlab-ci.yml`.
+
+If you have multiple projects, and depending on how job definitions are organized in the CI/CD configuration, it may be difficult to locate the expiry setting. You can use a script to search the entire CI/CD configuration. This includes access to objects that are resolved after inheriting values, like `extends` or `!reference`.
+The script retrieves merged CI/CD configuration files and searches for the artifacts key to:
+
+- Identify the jobs that don't have an expiry setting.
+- Return the expiry setting for jobs that have the artifact expiry configured.
+
+The following process describes how the script searches for the artifact expiry setting:
+
+1. To generate a merged CI/CD configuration, the script loops over all projects and calls
+ the [`ci_lint()`](https://python-gitlab.readthedocs.io/en/stable/gl_objects/ci_lint.html) method.
+1. The `yaml_load` function loads the merged configuration into Python data structures for more analysis.
+1. A dictionary that also has the key `script` identifies itself as a job definition, where the `artifacts`
+ key might exists.
+1. If yes, the script parses the sub key `expire_in` and stores the details to print later in a Markdown table summary.
+
+```python
+ ci_job_artifacts_expiry = {}
+
+ # Loop over projects, fetch .gitlab-ci.yml, run the linter to get the full translated config, and extract the `artifacts:` setting
+ # https://python-gitlab.readthedocs.io/en/stable/gl_objects/ci_lint.html
+ for project in projects:
+ project_obj = gl.projects.get(project.id)
+ project_name = project_obj.name
+ project_web_url = project_obj.web_url
+ try:
+ lint_result = project.ci_lint.get()
+ if lint_result.merged_yaml is None:
+ continue
+
+ ci_pipeline = yaml.safe_load(lint_result.merged_yaml)
+ #print("Project {p} Config\n{c}\n\n".format(p=project_name, c=json.dumps(ci_pipeline, indent=4)))
+
+ for k in ci_pipeline:
+ v = ci_pipeline[k]
+ # This is a job object with `script` attribute
+ if isinstance(v, dict) and 'script' in v:
+ print(".", end="", flush=True) # Get some feedback that it is still looping
+ artifacts = v['artifacts'] if 'artifacts' in v else {}
+
+ print("Project {p} job {j} artifacts {a}".format(p=project_name, j=k, a=json.dumps(artifacts, indent=4)))
+
+ expire_in = None
+ if 'expire_in' in artifacts:
+ expire_in = artifacts['expire_in']
+
+ store_key = project_web_url + '_' + k
+ ci_job_artifacts_expiry[store_key] = { 'project_web_url': project_web_url,
+ 'project_name': project_name,
+ 'job_name': k,
+ 'artifacts_expiry': expire_in}
+
+ except Exception as e:
+ print(f"Exception searching artifacts on ci_pipelines: {e}".format(e=e))
+
+ if len(ci_job_artifacts_expiry) > 0:
+ print("|Project|Job|Artifact expiry|\n|-|-|-|") #Start markdown friendly table
+ for k, details in ci_job_artifacts_expiry.items():
+ if details['job_name'][0] == '.':
+ continue # ignore job templates that start with a '.'
+ print(f'| [{ details["project_name"] }]({details["project_web_url"]}) | { details["job_name"] } | { details["artifacts_expiry"] if details["artifacts_expiry"] is not None else "❌ N/A" } |')
+```
+
+The script generates a Markdown summary table with project name and URL, job name, and the `artifacts:expire_in` setting, or `N/A` if not existing. It does not print job templates starting with a `.` character which are not instantiated as runtime job objects that would generate artifacts.
+
+```shell
+export GL_GROUP_ID=56595735
+
+# Script requires pyyaml too.
+pip3 install python-gitlab pyyaml
+
+python3 get_all_cicd_config_artifacts_expiry.py
+
+|Project|Job|Artifact expiry|
+|-|-|-|
+| [Gen Job Artifacts 4](https://gitlab.com/gitlab-de/playground/artifact-gen-group/gen-job-artifacts-4) | generator | 30 days |
+| [Gen Job Artifacts with expiry and included jobs](https://gitlab.com/gitlab-de/playground/artifact-gen-group/gen-job-artifacts-expiry-included-jobs) | included-job10 | 10 days |
+| [Gen Job Artifacts with expiry and included jobs](https://gitlab.com/gitlab-de/playground/artifact-gen-group/gen-job-artifacts-expiry-included-jobs) | included-job1 | 1 days |
+| [Gen Job Artifacts with expiry and included jobs](https://gitlab.com/gitlab-de/playground/artifact-gen-group/gen-job-artifacts-expiry-included-jobs) | included-job30 | 30 days |
+| [Gen Job Artifacts with expiry and included jobs](https://gitlab.com/gitlab-de/playground/artifact-gen-group/gen-job-artifacts-expiry-included-jobs) | generator | 30 days |
+| [Gen Job Artifacts 2](https://gitlab.com/gitlab-de/playground/artifact-gen-group/gen-job-artifacts-2) | generator | ❌ N/A |
+| [Gen Job Artifacts 1](https://gitlab.com/gitlab-de/playground/artifact-gen-group/gen-job-artifacts-1) | generator | ❌ N/A |
+```
+
+The `get_all_cicd_config_artifacts_expiry.py` script is located in the [GitLab API with Python project](https://gitlab.com/gitlab-de/use-cases/gitlab-api/gitlab-api-python/).
+
+Alternatively, you can use [advanced search](search/advanced_search.md) with API requests. The following example uses the [scope: blobs](../api/search.md#scope-blobs-premium-all-2) to searches for the string `artifacts` in all `*.yml` files:
+
+```shell
+# https://gitlab.com/gitlab-de/playground/artifact-gen-group/gen-job-artifacts-expiry-included-jobs
+export GL_PROJECT_ID=48349263
+
+glab api --method GET projects/$GL_PROJECT_ID/search --field "scope=blobs" --field "search=expire_in filename:*.yml"
+```
+
+For more information about the inventory approach, see [How GitLab can help mitigate deletion of open source container images on Docker Hub](https://about.gitlab.com/blog/2023/03/16/how-gitlab-can-help-mitigate-deletion-open-source-images-docker-hub/).
+
+### Set the default expiry for job artifacts in projects
+
+Based on the output of the `get_all_cicd_config_artifacts_expiry.py` script, you can define the [default artifact expiration](../ci/yaml/index.md#default) in your `.gitlab-ci.yml` configuration.
+
+```yaml
+default:
+ artifacts:
+ expire_in: 1 week
+```
+
+### Delete old pipelines
+
+Pipelines do not add to the overall storage consumption, but if you want to delete them you can use the following methods.
+
+Example using the [GitLab CLI](../editor_extensions/gitlab_cli/index.md):
+
+```shell
+export GL_PROJECT_ID=48349590
+
+glab api --method GET projects/$GL_PROJECT_ID/pipelines | jq --compact-output '.[]' | jq --compact-output '.id,.created_at'
+960031926
+"2023-08-08T22:09:52.745Z"
+959884072
+"2023-08-08T18:59:47.581Z"
+
+glab api --method DELETE projects/$GL_PROJECT_ID/pipelines/960031926
+
+glab api --method GET projects/$GL_PROJECT_ID/pipelines | jq --compact-output '.[]' | jq --compact-output '.id,.created_at'
+959884072
+"2023-08-08T18:59:47.581Z"
+```
+
+The `created_at` key must be converted from a timestamp to Unix epoch time,
+for example with `date -d '2023-08-08T18:59:47.581Z' +%s`. In the next step, the
+age can be calculated with the difference between now, and the pipeline creation
+date. If the age is larger than the threshold, the pipeline should be deleted.
+
+The following example uses a Bash script that expects `jq` and the [GitLab CLI](../editor_extensions/gitlab_cli/index.md) installed, and authorized, and the exported environment variable `GL_PROJECT_ID`.
+
+The full script `get_cicd_pipelines_compare_age_threshold_example.sh` is located in the [GitLab API with Linux Shell](https://gitlab.com/gitlab-de/use-cases/gitlab-api/gitlab-api-linux-shell) project.
+
+```shell
+#/bin/bash
+
+CREATED_AT_ARR=$(glab api --method GET projects/$GL_PROJECT_ID/pipelines | jq --compact-output '.[]' | jq --compact-output '.created_at' | jq --raw-output @sh)
+
+for row in ${CREATED_AT_ARR[@]}
+do
+ stripped=$(echo $row | xargs echo)
+ #echo $stripped #DEBUG
+
+ CREATED_AT_TS=$(date -d "$stripped" +%s)
+ NOW=$(date +%s)
+
+ AGE=$(($NOW-$CREATED_AT_TS))
+ AGE_THRESHOLD=$((90*24*60*60)) # 90 days
+
+ if [ $AGE -gt $AGE_THRESHOLD ];
+ then
+ echo "Pipeline age $AGE older than threshold $AGE_THRESHOLD, should be deleted."
+ # TODO call glab to delete the pipeline. Needs an ID collected from the glab call above.
+ else
+ echo "Pipeline age $AGE not older than threshold $AGE_THRESHOLD. Ignore."
+ fi
+done
+```
+
+You can use the [`python-gitlab` API library](https://python-gitlab.readthedocs.io/en/stable/gl_objects/pipelines_and_jobs.html#project-pipelines) and
+the `created_at` attribute to implement a similar algorithm that compares the job artifact age:
+
+```python
+ # ...
+
+ for pipeline in project.pipelines.list(iterator=True):
+ pipeline_obj = project.pipelines.get(pipeline.id)
+ print("DEBUG: {p}".format(p=json.dumps(pipeline_obj.attributes, indent=4)))
+
+ created_at = datetime.datetime.strptime(pipeline.created_at, '%Y-%m-%dT%H:%M:%S.%fZ')
+ now = datetime.datetime.now()
+ age = (now - created_at).total_seconds()
+
+ threshold_age = 90 * 24 * 60 * 60
+
+ if (float(age) > float(threshold_age)):
+ print("Deleting pipeline", pipeline.id)
+ pipeline_obj.delete()
+```
+
+The full example of the script `get_all_projects_top_level_namespace_storage_analysis_cleanup_example.py` is located in the [GitLab API with Python project](https://gitlab.com/gitlab-de/use-cases/gitlab-api/gitlab-api-python/).
+
+Automatically deleting old pipelines in GitLab is tracked in [this feature proposal](https://gitlab.com/gitlab-org/gitlab/-/issues/338480).
+
+## Manage storage for Container Registries
+
+Container registries are available [in a project](../api/container_registry.md#within-a-project) or [in a group](../api/container_registry.md#within-a-group). Both locations require analysis and cleanup strategies.
+
+The following example uses using `curl` and `jq` for a project:
+
+```shell
+export GL_PROJECT_ID=48057080
+
+curl --silent --header "Authorization: Bearer $GITLAB_TOKEN" "https://gitlab.com/api/v4/projects/$GL_PROJECT_ID/registry/repositories" | jq --compact-output '.[]' | jq --compact-output '.id,.location' | jq
+4435617
+"registry.gitlab.com/gitlab-de/playground/container-package-gen-group/docker-alpine-generator"
+
+curl --silent --header "Authorization: Bearer $GITLAB_TOKEN" "https://gitlab.com/api/v4/registry/repositories/4435617?size=true" | jq --compact-output '.id,.location,.size'
+4435617
+"registry.gitlab.com/gitlab-de/playground/container-package-gen-group/docker-alpine-generator"
+3401613
+```
+
+The following example uses the [GitLab CLI](../editor_extensions/gitlab_cli/index.md) for a project:
+
+```shell
+export GL_PROJECT_ID=48057080
+
+glab api --method GET projects/$GL_PROJECT_ID/registry/repositories | jq --compact-output '.[]' | jq --compact-output '.id,.location'
+4435617
+"registry.gitlab.com/gitlab-de/playground/container-package-gen-group/docker-alpine-generator"
+
+glab api --method GET registry/repositories/4435617 --field='size=true' | jq --compact-output '.id,.location,.size'
+4435617
+"registry.gitlab.com/gitlab-de/playground/container-package-gen-group/docker-alpine-generator"
+3401613
+
+glab api --method GET projects/$GL_PROJECT_ID/registry/repositories/4435617/tags | jq --compact-output '.[]' | jq --compact-output '.name'
+"latest"
+
+glab api --method GET projects/$GL_PROJECT_ID/registry/repositories/4435617/tags/latest | jq --compact-output '.name,.created_at,.total_size'
+"latest"
+"2023-08-07T19:20:20.894+00:00"
+3401613
+```
+
+A similar automation shell script is created in the [delete old pipelines](#delete-old-pipelines) section.
+
+The `python-gitlab` API library provides bulk deletion interfaces explained in the next section.
+
+### Delete container images in bulk
+
+When you [delete container image tags in bulk](../api/container_registry.md#delete-registry-repository-tags-in-bulk),
+you can configure:
+
+- The matching regular expressions for tag names and images to keep (`name_regex_keep`) or delete (`name_regex_delete`)
+- The number of image tags to keep matching the tag name (`keep_n`)
+- The number of days before an image tag can be deleted (`older_than`)
+
+WARNING:
+On GitLab.com, due to the scale of the Container Registry, the number of tags deleted by this API is limited.
+If your Container Registry has a large number of tags to delete, only some of them are deleted. You might need
+to call the API multiple times. To schedule tags for automatic deletion, use a [cleanup policy](#cleanup-policy-for-containers) instead.
+
+The following example uses the [`python-gitlab` API library](https://python-gitlab.readthedocs.io/en/stable/gl_objects/repository_tags.html) to fetch a list of tags, and calls the `delete_in_bulk()` method with filter parameters.
+
+```python
+ repositories = project.repositories.list(iterator=True, size=True)
+ if len(repositories) > 0:
+ repository = repositories.pop()
+ tags = repository.tags.list()
+
+ # Cleanup: Keep only the latest tag
+ repository.tags.delete_in_bulk(keep_n=1)
+ # Cleanup: Delete all tags older than 1 month
+ repository.tags.delete_in_bulk(older_than="1m")
+ # Cleanup: Delete all tags matching the regex `v.*`, and keep the latest 2 tags
+ repository.tags.delete_in_bulk(name_regex_delete="v.+", keep_n=2)
+```
+
+The full example of the script `get_all_projects_top_level_namespace_storage_analysis_cleanup_example.py` is located
+in the [GitLab API with Python](https://gitlab.com/gitlab-de/use-cases/gitlab-api/gitlab-api-python/) project.
+
+### Cleanup policy for containers
+
+Use the project REST API endpoint to [create cleanup policies](packages/container_registry/reduce_container_registry_storage.md#use-the-cleanup-policy-api). The following example uses the [GitLab CLI](../editor_extensions/gitlab_cli/index.md) to create a cleanup policy.
+
+To send the attributes as a body parameter, you must:
+
+- Use the `--input -` parameter to read from the standard input.
+- Set the `Content-Type` header.
+
+```shell
+export GL_PROJECT_ID=48057080
+
+echo '{"container_expiration_policy_attributes":{"cadence":"1month","enabled":true,"keep_n":1,"older_than":"14d","name_regex":".*","name_regex_keep":".*-main"}}' | glab api --method PUT --header 'Content-Type: application/json;charset=UTF-8' projects/$GL_PROJECT_ID --input -
+
+...
+
+ "container_expiration_policy": {
+ "cadence": "1month",
+ "enabled": true,
+ "keep_n": 1,
+ "older_than": "14d",
+ "name_regex": ".*",
+ "name_regex_keep": ".*-main",
+ "next_run_at": "2023-09-08T21:16:25.354Z"
+ },
+
+```
+
+After you set up the cleanup policy, all container images that match your specifications are deleted automatically. You do not need additional API automation scripts.
+
+### Optimize container images
+
+You can optimize container images to reduce the image size and overall storage consumption in the container registry. Learn more in the [pipeline efficiency documentation](../ci/pipelines/pipeline_efficiency.md#optimize-docker-images).
+
+## Manage storage for Package Registry
+
+Package registries are available [in a project](../api/packages.md#within-a-project) or [in a group](../api/packages.md#within-a-group).
+
+### List packages and files
+
+The following example shows fetching packages from a defined project ID using the [GitLab CLI](../editor_extensions/gitlab_cli/index.md). The result set is an array of dictionary items that can be filtered with the `jq` command chain.
+
+```shell
+# https://gitlab.com/gitlab-de/playground/container-package-gen-group/generic-package-generator
+export GL_PROJECT_ID=48377643
+
+glab api --method GET projects/$GL_PROJECT_ID/packages | jq --compact-output '.[]' | jq --compact-output '.id,.name,.package_type'
+16669383
+"generator"
+"generic"
+16671352
+"generator"
+"generic"
+16672235
+"generator"
+"generic"
+16672237
+"generator"
+"generic"
+```
+
+Use the package ID to inspect the files and their size in the package.
+
+```shell
+glab api --method GET projects/$GL_PROJECT_ID/packages/16669383/package_files | jq --compact-output '.[]' |
+ jq --compact-output '.package_id,.file_name,.size'
+
+16669383
+"nighly.tar.gz"
+10487563
+```
+
+A similar automation shell script is created in the [delete old pipelines](#delete-old-pipelines) section.
+
+The following script example uses the `python-gitlab` library to fetch all packages in a loop,
+and loops over its package files to print the `file_name` and `size` attributes.
+
+```python
+ packages = project.packages.list(order_by="created_at")
+
+ for package in packages:
+
+ package_files = package.package_files.list()
+ for package_file in package_files:
+ print("Package name: {p} File name: {f} Size {s}".format(
+ p=package.name, f=package_file.file_name, s=render_size_mb(package_file.size)))
+```
+
+### Delete packages
+
+[Deleting a file in a package](../api/packages.md#delete-a-package-file) can corrupt the package. You should delete the package when performing automated cleanup maintenance.
+
+To delete a package, use the [GitLab CLI](../editor_extensions/gitlab_cli/index.md) to change the `--method`
+parameter to `DELETE`:
+
+```shell
+glab api --method DELETE projects/$GL_PROJECT_ID/packages/16669383
+```
+
+To calculate the package size and compare it against a size threshold, you can use the `python-gitlab` library
+to extend the code described in the [list packages and files](#list-packages-and-files) section.
+
+The following code example also calculates the package age and deletes the package when the conditions match:
+
+```python
+ packages = project.packages.list(order_by="created_at")
+ for package in packages:
+ package_size = 0.0
+
+ package_files = package.package_files.list()
+ for package_file in package_files:
+ print("Package name: {p} File name: {f} Size {s}".format(
+ p=package.name, f=package_file.file_name, s=render_size_mb(package_file.size)))
+
+ package_size =+ package_file.size
+
+ print("Package size: {s}\n\n".format(s=render_size_mb(package_size)))
+
+ threshold_size = 10 * 1024 * 1024
+
+ if (package_size > float(threshold_size)):
+ print("Package size {s} > threshold {t}, deleting package.".format(
+ s=render_size_mb(package_size), t=render_size_mb(threshold_size)))
+ package.delete()
+
+ threshold_age = 90 * 24 * 60 * 60
+ package_age = created_at = calculate_age(package.created_at)
+
+ if (float(package_age > float(threshold_age))):
+ print("Package age {a} > threshold {t}, deleting package.".format(
+ a=render_age_time(package_age), t=render_age_time(threshold_age)))
+ package.delete()
+```
+
+The code generates the following output that you can use for further analysis:
+
+```shell
+Package name: generator File name: nighly.tar.gz Size 10.0017
+Package size: 10.0017
+Package size 10.0017 > threshold 10.0000, deleting package.
+
+Package name: generator File name: 1-nightly.tar.gz Size 1.0004
+Package size: 1.0004
+
+Package name: generator File name: 10-nightly.tar.gz Size 10.0018
+Package name: generator File name: 20-nightly.tar.gz Size 20.0033
+Package size: 20.0033
+Package size 20.0033 > threshold 10.0000, deleting package.
+```
+
+The full example of the script `get_all_projects_top_level_namespace_storage_analysis_cleanup_example.py` is located in the [GitLab API with Python](https://gitlab.com/gitlab-de/use-cases/gitlab-api/gitlab-api-python/) project.
+
+### Dependency Proxy
+
+Review the [cleanup policy](packages/dependency_proxy/reduce_dependency_proxy_storage.md#cleanup-policies) and how to [purge the cache using the API](packages/dependency_proxy/reduce_dependency_proxy_storage.md#use-the-api-to-clear-the-cache)
+
+## Community resources
+
+These resources are not officially supported. Ensure to test scripts and tutorials before running destructive cleanup commands that may not be reverted.
+
+- Forum topic: [Storage management automation resources](https://forum.gitlab.com/t/storage-management-automation-resources/)
+- Script: [GitLab Storage Analyzer](https://gitlab.com/gitlab-de/use-cases/gitlab-api/gitlab-storage-analyzer), unofficial project by the [GitLab Developer Evangelism team](https://gitlab.com/gitlab-de/). You find similar code examples in this documentation how-to here.
+
+## Testing for storage management automation
+
+To test storage management automation, you might need to generate test data, or populate
+storage to verify that the analysis and deletion works as expected. The following sections
+provide tools and tips about testing and generating storage blobs in a short amount of time.
+
+### Generate job artifacts
+
+Create a test project to generate fake artifact blobs using CI/CD job matrix builds. Add a CI/CD pipeline to generate artifacts on a daily basis
+
+1. Create a new project.
+1. Add the following snippet to `.gitlab-ci.yml` to include the job artifact generator configuration.
+
+ ```yaml
+ include:
+ - remote: https://gitlab.com/gitlab-de/use-cases/efficiency/job-artifact-generator/-/raw/main/.gitlab-ci.yml
+ ```
+
+1. [Configure pipeline schedules](../ci/pipelines/schedules.md#add-a-pipeline-schedule).
+1. [Trigger the pipeline manually](../ci/pipelines/schedules.md#run-manually).
+
+Alternatively, reduce the 86 MB daily generated MB to different values in the `MB_COUNT` variable.
+
+```yaml
+include:
+ - remote: https://gitlab.com/gitlab-de/use-cases/efficiency/job-artifact-generator/-/raw/main/.gitlab-ci.yml
+
+generator:
+ parallel:
+ matrix:
+ - MB_COUNT: [1, 5, 10, 20, 50]
+
+```
+
+For more information, see the [Job Artifact Generator README](https://gitlab.com/gitlab-de/use-cases/efficiency/job-artifact-generator), with an [example group](https://gitlab.com/gitlab-de/playground/artifact-gen-group).
+
+### Generate job artifacts with expiry
+
+The project CI/CD configuration specifies job definitions in:
+
+- The main `.gitlab-ci.yml` configuration file.
+- The `artifacts:expire_in` setting.
+- Project files and templates.
+
+To test the analysis scripts, the [`gen-job-artifacts-expiry-included-jobs`](https://gitlab.com/gitlab-de/playground/artifact-gen-group/gen-job-artifacts-expiry-included-jobs) project provides an example configuration.
+
+```yaml
+# .gitlab-ci.yml
+include:
+ - include_jobs.yml
+
+default:
+ artifacts:
+ paths:
+ - '*.txt'
+
+.gen-tmpl:
+ script:
+ - dd if=/dev/urandom of=${$MB_COUNT}.txt bs=1048576 count=${$MB_COUNT}
+
+generator:
+ extends: [.gen-tmpl]
+ parallel:
+ matrix:
+ - MB_COUNT: [1, 5, 10, 20, 50]
+ artifacts:
+ untracked: false
+ when: on_success
+ expire_in: 30 days
+
+# include_jobs.yml
+.includeme:
+ script:
+ - dd if=/dev/urandom of=1.txt bs=1048576 count=1
+
+included-job10:
+ script:
+ - echo "Servus"
+ - !reference [.includeme, script]
+ artifacts:
+ untracked: false
+ when: on_success
+ expire_in: 10 days
+
+included-job1:
+ script:
+ - echo "Gruezi"
+ - !reference [.includeme, script]
+ artifacts:
+ untracked: false
+ when: on_success
+ expire_in: 1 days
+
+included-job30:
+ script:
+ - echo "Grias di"
+ - !reference [.includeme, script]
+ artifacts:
+ untracked: false
+ when: on_success
+ expire_in: 30 days
+```
+
+### Generate container images
+
+The example group [`container-package-gen-group`](https://gitlab.com/gitlab-de/playground/container-package-gen-group) provides projects that:
+
+- Use a base image in Dockerfile to build a new image.
+- Include the `Docker.gitlab-ci.yml` template to build images on GitLab.com SaaS.
+- Configure pipeline schedules to generate new images daily.
+
+Example projects available to fork:
+
+- [`docker-alpine-generator`](https://gitlab.com/gitlab-de/playground/container-package-gen-group/docker-alpine-generator)
+- [`docker-python-generator`](https://gitlab.com/gitlab-de/playground/container-package-gen-group/docker-python-generator)
+
+### Generate generic packages
+
+The example project [`generic-package-generator`](https://gitlab.com/gitlab-de/playground/container-package-gen-group/generic-package-generator) provides projects that:
+
+- Generate a random text blob, and create a tarball with the current Unix timestamp as release version.
+- Upload the tarball into the generic package registry, using the Unix timestamp as release version.
+
+To generate generic packages, you can use this standalone `.gitlab-ci.yml` configuration:
+
+```yaml
+generate-package:
+ parallel:
+ matrix:
+ - MB_COUNT: [1, 5, 10, 20]
+ before_script:
+ - apt update && apt -y install curl
+ script:
+ - dd if=/dev/urandom of="${MB_COUNT}.txt" bs=1048576 count=${MB_COUNT}
+ - tar czf "generated-$MB_COUNT-nighly-`date +%s`.tar.gz" "${MB_COUNT}.txt"
+ - 'curl --header "JOB-TOKEN: $CI_JOB_TOKEN" --upload-file "generated-$MB_COUNT-nighly-`date +%s`.tar.gz" "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/packages/generic/generator/`date +%s`/${MB_COUNT}-nightly.tar.gz"'
+
+ artifacts:
+ paths:
+ - '*.tar.gz'
+
+```
+
+### Generate storage usage with forks
+
+Use the following projects to test storage usage with [cost factors for forks](usage_quotas.md#view-project-fork-storage-usage):
+
+- Fork [`gitlab-org/gitlab`](https://gitlab.com/gitlab-org/gitlab) into a new namespace or group (includes LFS, Git repository).
+- Fork [`gitlab-com/www-gitlab-com`](https://gitlab.com/gitlab-com/www-gitlab-comgitlab-com/www-gitlab-com) into a new namespace or group.