diff options
Diffstat (limited to 'doc/development/database/loose_foreign_keys.md')
-rw-r--r-- | doc/development/database/loose_foreign_keys.md | 139 |
1 files changed, 127 insertions, 12 deletions
diff --git a/doc/development/database/loose_foreign_keys.md b/doc/development/database/loose_foreign_keys.md index 157c1284512..c60989f225d 100644 --- a/doc/development/database/loose_foreign_keys.md +++ b/doc/development/database/loose_foreign_keys.md @@ -52,25 +52,40 @@ For this procedure to work, we must register which tables to clean up asynchrono ## Example migration and configuration -### Configure the model +### Configure the loose foreign key -First, tell the application that the `projects` table has a new loose foreign key. -You can do this in the `Project` model: +Loose foreign keys are defined in a YAML file. The configuration requires the +following information: -```ruby -class Project < ApplicationRecord - # ... +- Parent table name (`projects`) +- Child table name (`ci_pipelines`) +- The data cleanup method (`async_delete` or `async_nullify`) - include LooseForeignKey +The YAML file is located at `lib/gitlab/database/gitlab_loose_foreign_keys.yml`. The file groups +foreign key definitions by the name of the child table. The child table can have multiple loose +foreign key definitions, therefore we store them as an array. - loose_foreign_key :ci_pipelines, :project_id, on_delete: :async_delete # or async_nullify +Example definition: - # ... -end +```yaml +ci_pipelines: + - table: projects + column: project_id + on_delete: async_delete ``` -This instruction ensures the asynchronous cleanup process knows about the association, and the -how to do the cleanup. In this case, the associated `ci_pipelines` records are deleted. +If the `ci_pipelines` key is already present in the YAML file, then a new entry can be added +to the array: + +```yaml +ci_pipelines: + - table: projects + column: project_id + on_delete: async_delete + - table: another_table + column: another_id + on_delete: :async_nullify +``` ### Track record changes @@ -127,6 +142,19 @@ end At this point, the setup phase is concluded. The deleted `projects` records should be automatically picked up by the scheduled cleanup worker job. +## Testing + +The "`it has loose foreign keys`" shared example can be used to test the presence of the `ON DELETE` trigger and the +loose foreign key definitions. + +Simply add to the model test file: + +```ruby +it_behaves_like 'it has loose foreign keys' do + let(:factory_name) { :project } +end +``` + ## Caveats of loose foreign keys ### Record creation @@ -180,3 +208,90 @@ end NOTE: This example is unlikely in GitLab, because we usually look up the parent models to perform permission checks. + +## A note on `dependent: :destroy` and `dependent: :nullify` + +We considered using these Rails features as an alternative to foreign keys but there are several problems which include: + +1. These run on a different connection in the context of a transaction [which we do not allow](multiple_databases.md#removing-cross-database-transactions). +1. These can lead to severe performance degradation as we load all records from PostgreSQL, loop over them in Ruby, and call individual `DELETE` queries. +1. These can miss data as they only cover the case when the `destroy` method is called directly on the model. There are other cases including `delete_all` and cascading deletes from another parent table that could mean these are missed. + +## Risks of loose foreign keys and possible mitigations + +In general, the loose foreign keys architecture is eventually consistent and +the cleanup latency might lead to problems visible to GitLab users or +operators. We consider the tradeoff as acceptable, but there might be +cases where the problems are too frequent or too severe, and we must +implement a mitigation strategy. A general mitigation strategy might be to have +an "urgent" queue for cleanup of records that have higher impact with a delayed +cleanup. + +Below are some more specific examples of problems that might occur and how we +might mitigate them. In all the listed cases we might still consider the problem +described to be low risk and low impact, and in that case we would choose to not +implement any mitigation. + +### The record should be deleted but it shows up in a view + +This hypothetical example might happen with a foreign key like: + +```sql +ALTER TABLE ONLY vulnerability_occurrence_pipelines + ADD CONSTRAINT fk_rails_6421e35d7d FOREIGN KEY (pipeline_id) REFERENCES ci_pipelines(id) ON DELETE CASCADE; +``` + +In this example we expect to delete all associated `vulnerability_occurrence_pipelines` records +whenever we delete the `ci_pipelines` record associated with them. In this case +you might end up with some vulnerability page in GitLab which shows an occurrence +of a vulnerability. However, when you try to click a link to the pipeline, you get +a 404, because the pipeline is deleted. Then, when you navigate back you might find the +occurrence has disappeared too. + +**Mitigation** + +When rendering the vulnerability occurrences on the vulnerability page we could +try to load the corresponding pipeline and choose to skip displaying that +occurrence if pipeline is not found. + +### The deleted parent record is needed to render a view and causes a `500` error + +This hypothetical example might happen with a foreign key like: + +```sql +ALTER TABLE ONLY vulnerability_occurrence_pipelines + ADD CONSTRAINT fk_rails_6421e35d7d FOREIGN KEY (pipeline_id) REFERENCES ci_pipelines(id) ON DELETE CASCADE; +``` + +In this example we expect to delete all associated `vulnerability_occurrence_pipelines` records +whenever we delete the `ci_pipelines` record associated with them. In this case +you might end up with a vulnerability page in GitLab which shows an "occurrence" +of a vulnerability. However, when rendering the occurrence we try to load, for example, +`occurrence.pipeline.created_at`, which causes a 500 for the user. + +**Mitigation** + +When rendering the vulnerability occurrences on the vulnerability page we could +try to load the corresponding pipeline and choose to skip displaying that +occurrence if pipeline is not found. + +### The deleted parent record is accessed in a Sidekiq worker and causes a failed job + +This hypothetical example might happen with a foreign key like: + +```sql +ALTER TABLE ONLY vulnerability_occurrence_pipelines + ADD CONSTRAINT fk_rails_6421e35d7d FOREIGN KEY (pipeline_id) REFERENCES ci_pipelines(id) ON DELETE CASCADE; +``` + +In this example we expect to delete all associated `vulnerability_occurrence_pipelines` records +whenever we delete the `ci_pipelines` record associated with them. In this case +you might end up with a Sidekiq worker that is responsible for processing a +vulnerability and looping over all occurrences causing a Sidekiq job to fail if +it executes `occurrence.pipeline.created_at`. + +**Mitigation** + +When looping through the vulnerability occurrences in the Sidekiq worker, we +could try to load the corresponding pipeline and choose to skip processing that +occurrence if pipeline is not found. |