Add latest changes from gitlab-org/gitlab@14-6-stable-eev14.6.0-rc42

author: GitLab Bot <gitlab-bot@gitlab.com> 2021-12-20 16:37:47 +0300
committer: GitLab Bot <gitlab-bot@gitlab.com> 2021-12-20 16:37:47 +0300
commit: aee0a117a889461ce8ced6fcf73207fe017f1d99 (patch)
tree: 891d9ef189227a8445d83f35c1b0fc99573f4380 /doc/development/database
parent: 8d46af3258650d305f53b819eabf7ab18d22f59e (diff)
2 files changed, 224 insertions, 21 deletions
diff --git a/doc/development/database/loose_foreign_keys.md b/doc/development/database/loose_foreign_keys.md
index 157c1284512..c60989f225d 100644
--- a/doc/development/database/loose_foreign_keys.md
+++ b/doc/development/database/loose_foreign_keys.md
@@ -52,25 +52,40 @@ For this procedure to work, we must register which tables to clean up asynchrono
 
 ## Example migration and configuration
 
-### Configure the model
+### Configure the loose foreign key
 
-First, tell the application that the `projects` table has a new loose foreign key.
-You can do this in the `Project` model:
+Loose foreign keys are defined in a YAML file. The configuration requires the
+following information:
 
-```ruby
-class Project < ApplicationRecord
-  # ...
+- Parent table name (`projects`)
+- Child table name (`ci_pipelines`)
+- The data cleanup method (`async_delete` or `async_nullify`)
 
-  include LooseForeignKey
+The YAML file is located at `lib/gitlab/database/gitlab_loose_foreign_keys.yml`. The file groups
+foreign key definitions by the name of the child table. The child table can have multiple loose
+foreign key definitions, therefore we store them as an array.
 
-  loose_foreign_key :ci_pipelines, :project_id, on_delete: :async_delete # or async_nullify
+Example definition:
 
-  # ...
-end
+```yaml
+ci_pipelines:
+  - table: projects
+    column: project_id
+    on_delete: async_delete
 ```
 
-This instruction ensures the asynchronous cleanup process knows about the association, and the
-how to do the cleanup. In this case, the associated `ci_pipelines` records are deleted.
+If the `ci_pipelines` key is already present in the YAML file, then a new entry can be added
+to the array:
+
+```yaml
+ci_pipelines:
+  - table: projects
+    column: project_id
+    on_delete: async_delete
+  - table: another_table
+    column: another_id
+    on_delete: :async_nullify
+```
 
 ### Track record changes
 
@@ -127,6 +142,19 @@ end
 At this point, the setup phase is concluded. The deleted `projects` records should be automatically
 picked up by the scheduled cleanup worker job.
 
+## Testing
+
+The "`it has loose foreign keys`" shared example can be used to test the presence of the `ON DELETE` trigger and the
+loose foreign key definitions.
+
+Simply add to the model test file:
+
+```ruby
+it_behaves_like 'it has loose foreign keys' do
+  let(:factory_name) { :project }
+end
+```
+
 ## Caveats of loose foreign keys
 
 ### Record creation
@@ -180,3 +208,90 @@ end
 NOTE:
 This example is unlikely in GitLab, because we usually look up the parent models to perform
 permission checks.
+
+## A note on `dependent: :destroy` and `dependent: :nullify`
+
+We considered using these Rails features as an alternative to foreign keys but there are several problems which include:
+
+1. These run on a different connection in the context of a transaction [which we do not allow](multiple_databases.md#removing-cross-database-transactions).
+1. These can lead to severe performance degradation as we load all records from PostgreSQL, loop over them in Ruby, and call individual `DELETE` queries.
+1. These can miss data as they only cover the case when the `destroy` method is called directly on the model. There are other cases including `delete_all` and cascading deletes from another parent table that could mean these are missed.
+
+## Risks of loose foreign keys and possible mitigations
+
+In general, the loose foreign keys architecture is eventually consistent and
+the cleanup latency might lead to problems visible to GitLab users or
+operators. We consider the tradeoff as acceptable, but there might be
+cases where the problems are too frequent or too severe, and we must
+implement a mitigation strategy. A general mitigation strategy might be to have
+an "urgent" queue for cleanup of records that have higher impact with a delayed
+cleanup.
+
+Below are some more specific examples of problems that might occur and how we
+might mitigate them. In all the listed cases we might still consider the problem
+described to be low risk and low impact, and in that case we would choose to not
+implement any mitigation.
+
+### The record should be deleted but it shows up in a view
+
+This hypothetical example might happen with a foreign key like:
+
+```sql
+ALTER TABLE ONLY vulnerability_occurrence_pipelines
+    ADD CONSTRAINT fk_rails_6421e35d7d FOREIGN KEY (pipeline_id) REFERENCES ci_pipelines(id) ON DELETE CASCADE;
+```
+
+In this example we expect to delete all associated `vulnerability_occurrence_pipelines` records
+whenever we delete the `ci_pipelines` record associated with them. In this case
+you might end up with some vulnerability page in GitLab which shows an occurrence
+of a vulnerability. However, when you try to click a link to the pipeline, you get
+a 404, because the pipeline is deleted. Then, when you navigate back you might find the
+occurrence has disappeared too.
+
+**Mitigation**
+
+When rendering the vulnerability occurrences on the vulnerability page we could
+try to load the corresponding pipeline and choose to skip displaying that
+occurrence if pipeline is not found.
+
+### The deleted parent record is needed to render a view and causes a `500` error
+
+This hypothetical example might happen with a foreign key like:
+
+```sql
+ALTER TABLE ONLY vulnerability_occurrence_pipelines
+    ADD CONSTRAINT fk_rails_6421e35d7d FOREIGN KEY (pipeline_id) REFERENCES ci_pipelines(id) ON DELETE CASCADE;
+```
+
+In this example we expect to delete all associated `vulnerability_occurrence_pipelines` records
+whenever we delete the `ci_pipelines` record associated with them. In this case
+you might end up with a vulnerability page in GitLab which shows an "occurrence"
+of a vulnerability. However, when rendering the occurrence we try to load, for example,
+`occurrence.pipeline.created_at`, which causes a 500 for the user.
+
+**Mitigation**
+
+When rendering the vulnerability occurrences on the vulnerability page we could
+try to load the corresponding pipeline and choose to skip displaying that
+occurrence if pipeline is not found.
+
+### The deleted parent record is accessed in a Sidekiq worker and causes a failed job
+
+This hypothetical example might happen with a foreign key like:
+
+```sql
+ALTER TABLE ONLY vulnerability_occurrence_pipelines
+    ADD CONSTRAINT fk_rails_6421e35d7d FOREIGN KEY (pipeline_id) REFERENCES ci_pipelines(id) ON DELETE CASCADE;
+```
+
+In this example we expect to delete all associated `vulnerability_occurrence_pipelines` records
+whenever we delete the `ci_pipelines` record associated with them. In this case
+you might end up with a Sidekiq worker that is responsible for processing a
+vulnerability and looping over all occurrences causing a Sidekiq job to fail if
+it executes `occurrence.pipeline.created_at`.
+
+**Mitigation**
+
+When looping through the vulnerability occurrences in the Sidekiq worker, we
+could try to load the corresponding pipeline and choose to skip processing that
+occurrence if pipeline is not found.
diff --git a/doc/development/database/multiple_databases.md b/doc/development/database/multiple_databases.md
index a17ad798305..1338e83070f 100644
--- a/doc/development/database/multiple_databases.md
+++ b/doc/development/database/multiple_databases.md
@@ -15,10 +15,46 @@ To scale GitLab, the we are
 database for CI/CD tables was [introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/64289)
 in GitLab 14.1. This feature is still under development, and is not ready for production use.
 
+### Development setup
+
 By default, GitLab is configured to use only one main database. To
 opt-in to use a main database, and CI database, modify the
 `config/database.yml` file to have a `main` and a `ci` database
-configurations. For example, given a `config/database.yml` like below:
+configurations.
+
+You can set this up using [GDK](#gdk-configuration) or by
+[manually configuring `config/database.yml`](#manually-set-up-the-cicd-database).
+
+#### GDK configuration
+
+If you are using GDK, you can follow the following steps:
+
+1. On the GDK root directory, run:
+
+   ```shell
+   gdk config set gitlab.rails.multiple_databases true
+   ```
+
+1. Open your `gdk.yml`, and confirm that it has the following lines:
+
+   ```yaml
+   gitlab:
+     rails:
+       multiple_databases: true
+   ```
+
+1. Reconfigure GDK:
+
+   ```shell
+   gdk reconfigure
+   ```
+
+1. [Create the new CI/CD database](#create-the-new-database).
+
+#### Manually set up the CI/CD database
+
+You can manually edit `config/database.yml` to split the databases.
+To do so, consider a `config/database.yml` file like the example below:
 
 ```yaml
 development:
@@ -44,7 +80,7 @@ test: &test
       statement_timeout: 120s
 ```
 
-Edit the `config/database.yml` to look like this:
+Edit it to split the databases into `main` and `ci`:
 
 ```yaml
 development:
@@ -88,6 +124,25 @@ test: &test
       statement_timeout: 120s
 ```
 
+Next, [create the new CI/CD database](#create-the-new-database).
+
+#### Create the new database
+
+After configuring GitLab for the two databases, create the new CI/CD database:
+
+1. Create the new `ci:` database, load the DB schema into the `ci:` database,
+   and run any pending migrations:
+
+    ```shell
+    bundle exec rails db:create db:schema:load:ci db:migrate
+    ```
+
+1. Restart GDK:
+
+    ```shell
+    gdk restart
+    ```
+
 <!--
 NOTE: The `validate_cross_joins!` method in `spec/support/database/prevent_cross_joins.rb` references
       the following heading in the code, so if you make a change to this heading, make sure to update
@@ -557,11 +612,31 @@ Don't hesitate to reach out to the
 [sharding group](https://about.gitlab.com/handbook/engineering/development/enablement/sharding/)
 for advice.
 
+##### Avoid `dependent: :nullify` and `dependent: :destroy` across databases
+
+There may be cases where we want to use `dependent: :nullify` or `dependent: :destroy`
+across databases. This is technically possible, but it's problematic because
+these hooks run in the context of an outer transaction from the call to
+`#destroy`, which creates a cross-database transaction and we are trying to
+avoid that. Cross-database transactions caused this way could lead to confusing
+outcomes when we switch to decomposed, because now you have some queries
+happening outside the transaction and they may be partially applied while the
+outer transaction fails, which could lead to surprising bugs.
+
+If you need to do some cleanup after a `destroy` you will need to choose
+from some of the options above. If all you need to do is cleanup the child
+records themselves from PostgreSQL then you could consider using ["loose foreign
+keys"](loose_foreign_keys.md).
+
 ## `config/database.yml`
 
-GitLab will support running multiple databases in the future, for example to [separate tables for the continuous integration features](https://gitlab.com/groups/gitlab-org/-/epics/6167) from the main database. In order to prepare for this change, we [validate the structure of the configuration](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/67877) in `database.yml` to ensure that only known databases are used.
+GitLab is adding support to run multiple databases, for example to
+[separate tables for the continuous integration features](https://gitlab.com/groups/gitlab-org/-/epics/6167)
+from the main database. In order to prepare for this change, we
+[validate the structure of the configuration](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/67877)
+in `database.yml` to ensure that only known databases are used.
 
-Previously, the `config/database.yml` would look like this:
+Previously, the `config/database.yml` looked like this:
 
 ```yaml
 production:
@@ -571,15 +646,16 @@ production:
   ...
 ```
 
-With the support for many databases the support for this
-syntax is deprecated and will be removed in [15.0](https://gitlab.com/gitlab-org/gitlab/-/issues/338182).
+With the support for many databases this
+syntax is [deprecated](https://gitlab.com/gitlab-org/gitlab/-/issues/338182)
+and will be removed in [15.0](https://gitlab.com/gitlab-org/gitlab/-/issues/338182).
 
 The new `config/database.yml` needs to include a database name
 to define a database configuration. Only `main:` and `ci:` database
-names are supported today. The `main:` needs to always be a first
+names are supported. The `main:` database must always be a first
 entry in a hash. This change applies to decomposed and non-decomposed
-change. If an invalidate or deprecated syntax is used the error
-or warning will be printed during application start.
+change. If an invalid or deprecated syntax is used the error
+or warning is printed during application start.
 
 ```yaml
 # Non-decomposed database
@@ -603,3 +679,15 @@ production:
     database: gitlabhq_production_ci
     ...
 ```
+
+## Foreign keys that cross databases
+
+There are many places where we use foreign keys that reference across the two
+databases. This is not possible to do with two separate PostgreSQL
+databases, so we need to replicate the behavior we get from PostgreSQL in a
+performant way. We can't, and shouldn't, try to replicate the data guarantees
+given by PostgreSQL which prevent creating invalid references, but we still need a
+way to replace cascading deletes so we don't end up with orphaned data
+or records that point to nowhere, which might lead to bugs. As such we created
+["loose foreign keys"](loose_foreign_keys.md) which is an asynchronous
+process of cleaning up orphaned records.
author	GitLab Bot <gitlab-bot@gitlab.com>	2021-12-20 16:37:47 +0300
committer	GitLab Bot <gitlab-bot@gitlab.com>	2021-12-20 16:37:47 +0300
commit	aee0a117a889461ce8ced6fcf73207fe017f1d99 (patch)
tree	891d9ef189227a8445d83f35c1b0fc99573f4380 /doc/development/database
parent	8d46af3258650d305f53b819eabf7ab18d22f59e (diff)