diff options
Diffstat (limited to 'doc/topics/offline/quick_start_guide.md')
-rw-r--r-- | doc/topics/offline/quick_start_guide.md | 85 |
1 files changed, 68 insertions, 17 deletions
diff --git a/doc/topics/offline/quick_start_guide.md b/doc/topics/offline/quick_start_guide.md index dd739fdaf77..82a88b53dcf 100644 --- a/doc/topics/offline/quick_start_guide.md +++ b/doc/topics/offline/quick_start_guide.md @@ -215,15 +215,22 @@ On offline instances, the [GitLab Geo check Rake task](../../administration/geo/ always fails because it uses `pool.ntp.org`. This error can be ignored but you can [read more about how to work around it](../../administration/geo/replication/troubleshooting.md#message-machine-clock-is-synchronized--exception). -## Enabling the package metadata database +## Enabling the Package Metadata Database -Enabling the package metadata database is required to enable [license scanning of CycloneDX files](../../user/compliance/license_scanning_of_cyclonedx_files). -This process requires usage of the GitLab License Database, which is licensed under the [EE License](https://storage.googleapis.com/prod-export-license-bucket-1a6c642fc4de57d4/v1/LICENSE). -Note the following in relation to use of the License Database: +Enabling the Package Metadata Database is required to enable [license scanning of CycloneDX files](../../user/compliance/license_scanning_of_cyclonedx_files). +This process requires the use of License and/or Advisory Data under what is collectively called the Package Metadata Database, which is licensed under the [EE License](https://storage.googleapis.com/prod-export-license-bucket-1a6c642fc4de57d4/LICENSE). +Note the following in relation to use of the Package Metadata Database: -- We may change or discontinue all or any part of the License Database, at any time and without notice, at our sole discretion. -- The License Database may contain links to third-party websites or resources. We provide these links only as a convenience and are not responsible for any third-party data, content, products, or services from those websites or resources or links displayed on such websites. -- The License Database is based in part on information made available by third parties, and GitLab is not responsible for the accuracy or completeness of content made available. +- We may change or discontinue all or any part of the Package Metadata Database, at any time and without notice, at our sole discretion. +- The Package Metadata Database may contain links to third-party websites or resources. We provide these links only as a convenience and are not responsible for any third-party data, content, products, or services from those websites or resources or links displayed on such websites. +- The Package Metadata Database is based in part on information made available by third parties, and GitLab is not responsible for the accuracy or completeness of content made available. + +Enabling the Package Metadata Database is also required to enable Continuous Vulnerability Scans for Dependency Scanning (see [epic 9534](https://gitlab.com/groups/gitlab-org/-/epics/9534) tracking this work for more info). + +Package metadata is stored in the following Google Cloud Provider (GCP) buckets: + +- License Scanning - prod-export-license-bucket-1a6c642fc4de57d4 +- Dependency Scanning - prod-export-advisory-bucket-1a6c642fc4de57d4 ### Using the gsutil tool to download the package metadata exports @@ -235,34 +242,58 @@ Note the following in relation to use of the License Database: echo $GITLAB_RAILS_ROOT_DIR ``` +1. Set the type of data you wish to sync. + + ```shell + # For License Scanning + export PKG_METADATA_BUCKET=prod-export-license-bucket-1a6c642fc4de57d4 + export DATA_DIR="licenses" + + # For Dependency Scanning + export PKG_METADATA_BUCKET=prod-export-advisory-bucket-1a6c642fc4de57d4 + export DATA_DIR="advisories" + ``` + 1. Download the package metadata exports. ```shell # To download the package metadata exports, an outbound connection to Google Cloud Storage bucket must be allowed. - mkdir $GITLAB_RAILS_ROOT_DIR/vendor/package_metadata_db/ - gsutil -m rsync -r -d gs://prod-export-license-bucket-1a6c642fc4de57d4 $GITLAB_RAILS_ROOT_DIR/vendor/package_metadata_db/ + mkdir -p "$GITLAB_RAILS_ROOT_DIR/vendor/package_metadata/$DATA_DIR" + gsutil -m rsync -r -d gs://$PKG_METADATA_BUCKET "$GITLAB_RAILS_ROOT_DIR/vendor/package_metadata/$DATA_DIR" # Alternatively, if the GitLab instance is not allowed to connect to the Google Cloud Storage bucket, the package metadata # exports can be downloaded using a machine with the allowed access, and then copied to the root of the GitLab Rails directory. - rsync rsync://example_username@gitlab.example.com/package_metadata_db $GITLAB_RAILS_ROOT_DIR/vendor/package_metadata_db/ + rsync rsync://example_username@gitlab.example.com/package_metadata/$DATA_DIR "$GITLAB_RAILS_ROOT_DIR/vendor/package_metadata/$DATA_DIR" ``` ### Using the Google Cloud Storage REST API to download the package metadata exports -The package metadata exports can also be downloaded using the Google Cloud Storage API. The contents are available at [https://storage.googleapis.com/storage/v1/b/prod-export-license-bucket-1a6c642fc4de57d4/o](https://storage.googleapis.com/storage/v1/b/prod-export-license-bucket-1a6c642fc4de57d4/o). The following is an example of how this can be downloaded using [cURL](https://curl.se/) and [jq](https://stedolan.github.io/jq/). +The package metadata exports can also be downloaded using the Google Cloud Storage API. The contents are available at [https://storage.googleapis.com/storage/v1/b/prod-export-license-bucket-1a6c642fc4de57d4/o](https://storage.googleapis.com/storage/v1/b/prod-export-license-bucket-1a6c642fc4de57d4/o) and [https://storage.googleapis.com/storage/v1/b/prod-export-advisory-bucket-1a6c642fc4de57d4/o](https://storage.googleapis.com/storage/v1/b/prod-export-advisory-bucket-1a6c642fc4de57d4/o). The following is an example of how this can be downloaded using [cURL](https://curl.se/) and [jq](https://stedolan.github.io/jq/). ```shell #!/bin/bash set -euo pipefail +DATA_TYPE=$1 + GITLAB_RAILS_ROOT_DIR="$(gitlab-rails runner 'puts Rails.root.to_s')" -PKG_METADATA_DIR="$GITLAB_RAILS_ROOT_DIR/vendor/package_metadata_db" -PKG_METADATA_MANIFEST_OUTPUT_FILE="/tmp/license_db_export_manifest.json" -PKG_METADATA_DOWNLOADS_OUTPUT_FILE="/tmp/license_db_object_links.tsv" + +if [ "$DATA_TYPE" == "license" ]; then + PKG_METADATA_DIR="$GITLAB_RAILS_ROOT_DIR/vendor/package_metadata/licenses" +elif [ "$DATA_TYPE" == "advisory" ]; then + PKG_METADATA_DIR="$GITLAB_RAILS_ROOT_DIR/vendor/package_metadata/advisories" +else + echo "Usage: import_script.sh [licenses|advisories]" + exit 1 +fi + +PKG_METADATA_BUCKET="prod-export-$DATA_TYPE-bucket-1a6c642fc4de57d4" +PKG_METADATA_MANIFEST_OUTPUT_FILE="/tmp/package_metadata_${DATA_TYPE}_export_manifest.json" +PKG_METADATA_DOWNLOADS_OUTPUT_FILE="/tmp/package_metadata_${DATA_TYPE}_object_links.tsv" # Download the contents of the bucket -curl --silent --show-error --request GET "https://storage.googleapis.com/storage/v1/b/prod-export-license-bucket-1a6c642fc4de57d4/o?maxResults=7500" > "$PKG_METADATA_MANIFEST_OUTPUT_FILE" +curl --silent --show-error --request GET "https://storage.googleapis.com/storage/v1/b/$PKG_METADATA_BUCKET/o?maxResults=7500" > "$PKG_METADATA_MANIFEST_OUTPUT_FILE" # Parse the links and names for the bucket objects and output them into a tsv file jq -r '.items[] | [.name, .mediaLink] | @tsv' "$PKG_METADATA_MANIFEST_OUTPUT_FILE" > "$PKG_METADATA_DOWNLOADS_OUTPUT_FILE" @@ -294,9 +325,29 @@ echo "All objects saved to $PKG_METADATA_DIR" ### Automatic synchronization -Your GitLab instance is synchronized [every hour](https://gitlab.com/gitlab-org/gitlab/-/blob/d4331343d26d6e2a81fadd8f7ecd72f7cb74d04d/config/initializers/1_settings.rb#L831-832) with the contents of the `package_metadata_db` directory. +Your GitLab instance is synchronized [regularly](https://gitlab.com/gitlab-org/gitlab/-/blob/63a187d47f6da353ba4514650bbbbeb99c356325/config/initializers/1_settings.rb#L840-842) with the contents of the `package_metadata` directory. To automatically update your local copy with the upstream changes, a cron job can be added to periodically download new exports. For example, the following crontabs can be added to setup a cron job that runs every 30 minutes. +For License Scanning: + +```plaintext +*/30 * * * * gsutil -m rsync -r -d gs://prod-export-license-bucket-1a6c642fc4de57d4 $GITLAB_RAILS_ROOT_DIR/vendor/package_metadata/licenses +``` + +For Dependency Scanning: + ```plaintext -*/30 * * * * gsutil -m rsync -r -d gs://prod-export-license-bucket-1a6c642fc4de57d4 $GITLAB_RAILS_ROOT_DIR/vendor/package_metadata_db/ +*/30 * * * * gsutil -m rsync -r -d gs://prod-export-advisory-bucket-1a6c642fc4de57d4 $GITLAB_RAILS_ROOT_DIR/vendor/package_metadata/advisories ``` + +### Change note + +The directory for package metadata changed with the release of 16.2 from `vendor/package_metadata_db` to `vendor/package_metadata/licenses`. If this directory already exists on the instance and Dependency Scanning needs to be added then you need to take the following steps. + +1. Rename the licenses directory: `mv vendor/package_metadata_db vendor/package_metadata/licenses`. +1. Update any automation scripts or commands saved to change `vendor/package_metadata_db` to `vendor/package_metadata/licenses`. +1. Update any cron entries to change `vendor/package_metadata_db` to `vendor/package_metadata/licenses`. + + ```shell + sed -i '.bckup' -e 's#vendor/package_metadata_db#vendor/package_metadata/licenses#g' [FILE ...] + ``` |