diff options
author | GitLab Bot <gitlab-bot@gitlab.com> | 2021-12-20 16:37:47 +0300 |
---|---|---|
committer | GitLab Bot <gitlab-bot@gitlab.com> | 2021-12-20 16:37:47 +0300 |
commit | aee0a117a889461ce8ced6fcf73207fe017f1d99 (patch) | |
tree | 891d9ef189227a8445d83f35c1b0fc99573f4380 /doc/development/database | |
parent | 8d46af3258650d305f53b819eabf7ab18d22f59e (diff) |
Add latest changes from gitlab-org/gitlab@14-6-stable-eev14.6.0-rc42
Diffstat (limited to 'doc/development/database')
-rw-r--r-- | doc/development/database/loose_foreign_keys.md | 139 | ||||
-rw-r--r-- | doc/development/database/multiple_databases.md | 106 |
2 files changed, 224 insertions, 21 deletions
diff --git a/doc/development/database/loose_foreign_keys.md b/doc/development/database/loose_foreign_keys.md index 157c1284512..c60989f225d 100644 --- a/doc/development/database/loose_foreign_keys.md +++ b/doc/development/database/loose_foreign_keys.md @@ -52,25 +52,40 @@ For this procedure to work, we must register which tables to clean up asynchrono ## Example migration and configuration -### Configure the model +### Configure the loose foreign key -First, tell the application that the `projects` table has a new loose foreign key. -You can do this in the `Project` model: +Loose foreign keys are defined in a YAML file. The configuration requires the +following information: -```ruby -class Project < ApplicationRecord - # ... +- Parent table name (`projects`) +- Child table name (`ci_pipelines`) +- The data cleanup method (`async_delete` or `async_nullify`) - include LooseForeignKey +The YAML file is located at `lib/gitlab/database/gitlab_loose_foreign_keys.yml`. The file groups +foreign key definitions by the name of the child table. The child table can have multiple loose +foreign key definitions, therefore we store them as an array. - loose_foreign_key :ci_pipelines, :project_id, on_delete: :async_delete # or async_nullify +Example definition: - # ... -end +```yaml +ci_pipelines: + - table: projects + column: project_id + on_delete: async_delete ``` -This instruction ensures the asynchronous cleanup process knows about the association, and the -how to do the cleanup. In this case, the associated `ci_pipelines` records are deleted. +If the `ci_pipelines` key is already present in the YAML file, then a new entry can be added +to the array: + +```yaml +ci_pipelines: + - table: projects + column: project_id + on_delete: async_delete + - table: another_table + column: another_id + on_delete: :async_nullify +``` ### Track record changes @@ -127,6 +142,19 @@ end At this point, the setup phase is concluded. The deleted `projects` records should be automatically picked up by the scheduled cleanup worker job. +## Testing + +The "`it has loose foreign keys`" shared example can be used to test the presence of the `ON DELETE` trigger and the +loose foreign key definitions. + +Simply add to the model test file: + +```ruby +it_behaves_like 'it has loose foreign keys' do + let(:factory_name) { :project } +end +``` + ## Caveats of loose foreign keys ### Record creation @@ -180,3 +208,90 @@ end NOTE: This example is unlikely in GitLab, because we usually look up the parent models to perform permission checks. + +## A note on `dependent: :destroy` and `dependent: :nullify` + +We considered using these Rails features as an alternative to foreign keys but there are several problems which include: + +1. These run on a different connection in the context of a transaction [which we do not allow](multiple_databases.md#removing-cross-database-transactions). +1. These can lead to severe performance degradation as we load all records from PostgreSQL, loop over them in Ruby, and call individual `DELETE` queries. +1. These can miss data as they only cover the case when the `destroy` method is called directly on the model. There are other cases including `delete_all` and cascading deletes from another parent table that could mean these are missed. + +## Risks of loose foreign keys and possible mitigations + +In general, the loose foreign keys architecture is eventually consistent and +the cleanup latency might lead to problems visible to GitLab users or +operators. We consider the tradeoff as acceptable, but there might be +cases where the problems are too frequent or too severe, and we must +implement a mitigation strategy. A general mitigation strategy might be to have +an "urgent" queue for cleanup of records that have higher impact with a delayed +cleanup. + +Below are some more specific examples of problems that might occur and how we +might mitigate them. In all the listed cases we might still consider the problem +described to be low risk and low impact, and in that case we would choose to not +implement any mitigation. + +### The record should be deleted but it shows up in a view + +This hypothetical example might happen with a foreign key like: + +```sql +ALTER TABLE ONLY vulnerability_occurrence_pipelines + ADD CONSTRAINT fk_rails_6421e35d7d FOREIGN KEY (pipeline_id) REFERENCES ci_pipelines(id) ON DELETE CASCADE; +``` + +In this example we expect to delete all associated `vulnerability_occurrence_pipelines` records +whenever we delete the `ci_pipelines` record associated with them. In this case +you might end up with some vulnerability page in GitLab which shows an occurrence +of a vulnerability. However, when you try to click a link to the pipeline, you get +a 404, because the pipeline is deleted. Then, when you navigate back you might find the +occurrence has disappeared too. + +**Mitigation** + +When rendering the vulnerability occurrences on the vulnerability page we could +try to load the corresponding pipeline and choose to skip displaying that +occurrence if pipeline is not found. + +### The deleted parent record is needed to render a view and causes a `500` error + +This hypothetical example might happen with a foreign key like: + +```sql +ALTER TABLE ONLY vulnerability_occurrence_pipelines + ADD CONSTRAINT fk_rails_6421e35d7d FOREIGN KEY (pipeline_id) REFERENCES ci_pipelines(id) ON DELETE CASCADE; +``` + +In this example we expect to delete all associated `vulnerability_occurrence_pipelines` records +whenever we delete the `ci_pipelines` record associated with them. In this case +you might end up with a vulnerability page in GitLab which shows an "occurrence" +of a vulnerability. However, when rendering the occurrence we try to load, for example, +`occurrence.pipeline.created_at`, which causes a 500 for the user. + +**Mitigation** + +When rendering the vulnerability occurrences on the vulnerability page we could +try to load the corresponding pipeline and choose to skip displaying that +occurrence if pipeline is not found. + +### The deleted parent record is accessed in a Sidekiq worker and causes a failed job + +This hypothetical example might happen with a foreign key like: + +```sql +ALTER TABLE ONLY vulnerability_occurrence_pipelines + ADD CONSTRAINT fk_rails_6421e35d7d FOREIGN KEY (pipeline_id) REFERENCES ci_pipelines(id) ON DELETE CASCADE; +``` + +In this example we expect to delete all associated `vulnerability_occurrence_pipelines` records +whenever we delete the `ci_pipelines` record associated with them. In this case +you might end up with a Sidekiq worker that is responsible for processing a +vulnerability and looping over all occurrences causing a Sidekiq job to fail if +it executes `occurrence.pipeline.created_at`. + +**Mitigation** + +When looping through the vulnerability occurrences in the Sidekiq worker, we +could try to load the corresponding pipeline and choose to skip processing that +occurrence if pipeline is not found. diff --git a/doc/development/database/multiple_databases.md b/doc/development/database/multiple_databases.md index a17ad798305..1338e83070f 100644 --- a/doc/development/database/multiple_databases.md +++ b/doc/development/database/multiple_databases.md @@ -15,10 +15,46 @@ To scale GitLab, the we are database for CI/CD tables was [introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/64289) in GitLab 14.1. This feature is still under development, and is not ready for production use. +### Development setup + By default, GitLab is configured to use only one main database. To opt-in to use a main database, and CI database, modify the `config/database.yml` file to have a `main` and a `ci` database -configurations. For example, given a `config/database.yml` like below: +configurations. + +You can set this up using [GDK](#gdk-configuration) or by +[manually configuring `config/database.yml`](#manually-set-up-the-cicd-database). + +#### GDK configuration + +If you are using GDK, you can follow the following steps: + +1. On the GDK root directory, run: + + ```shell + gdk config set gitlab.rails.multiple_databases true + ``` + +1. Open your `gdk.yml`, and confirm that it has the following lines: + + ```yaml + gitlab: + rails: + multiple_databases: true + ``` + +1. Reconfigure GDK: + + ```shell + gdk reconfigure + ``` + +1. [Create the new CI/CD database](#create-the-new-database). + +#### Manually set up the CI/CD database + +You can manually edit `config/database.yml` to split the databases. +To do so, consider a `config/database.yml` file like the example below: ```yaml development: @@ -44,7 +80,7 @@ test: &test statement_timeout: 120s ``` -Edit the `config/database.yml` to look like this: +Edit it to split the databases into `main` and `ci`: ```yaml development: @@ -88,6 +124,25 @@ test: &test statement_timeout: 120s ``` +Next, [create the new CI/CD database](#create-the-new-database). + +#### Create the new database + +After configuring GitLab for the two databases, create the new CI/CD database: + +1. Create the new `ci:` database, load the DB schema into the `ci:` database, + and run any pending migrations: + + ```shell + bundle exec rails db:create db:schema:load:ci db:migrate + ``` + +1. Restart GDK: + + ```shell + gdk restart + ``` + <!-- NOTE: The `validate_cross_joins!` method in `spec/support/database/prevent_cross_joins.rb` references the following heading in the code, so if you make a change to this heading, make sure to update @@ -557,11 +612,31 @@ Don't hesitate to reach out to the [sharding group](https://about.gitlab.com/handbook/engineering/development/enablement/sharding/) for advice. +##### Avoid `dependent: :nullify` and `dependent: :destroy` across databases + +There may be cases where we want to use `dependent: :nullify` or `dependent: :destroy` +across databases. This is technically possible, but it's problematic because +these hooks run in the context of an outer transaction from the call to +`#destroy`, which creates a cross-database transaction and we are trying to +avoid that. Cross-database transactions caused this way could lead to confusing +outcomes when we switch to decomposed, because now you have some queries +happening outside the transaction and they may be partially applied while the +outer transaction fails, which could lead to surprising bugs. + +If you need to do some cleanup after a `destroy` you will need to choose +from some of the options above. If all you need to do is cleanup the child +records themselves from PostgreSQL then you could consider using ["loose foreign +keys"](loose_foreign_keys.md). + ## `config/database.yml` -GitLab will support running multiple databases in the future, for example to [separate tables for the continuous integration features](https://gitlab.com/groups/gitlab-org/-/epics/6167) from the main database. In order to prepare for this change, we [validate the structure of the configuration](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/67877) in `database.yml` to ensure that only known databases are used. +GitLab is adding support to run multiple databases, for example to +[separate tables for the continuous integration features](https://gitlab.com/groups/gitlab-org/-/epics/6167) +from the main database. In order to prepare for this change, we +[validate the structure of the configuration](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/67877) +in `database.yml` to ensure that only known databases are used. -Previously, the `config/database.yml` would look like this: +Previously, the `config/database.yml` looked like this: ```yaml production: @@ -571,15 +646,16 @@ production: ... ``` -With the support for many databases the support for this -syntax is deprecated and will be removed in [15.0](https://gitlab.com/gitlab-org/gitlab/-/issues/338182). +With the support for many databases this +syntax is [deprecated](https://gitlab.com/gitlab-org/gitlab/-/issues/338182) +and will be removed in [15.0](https://gitlab.com/gitlab-org/gitlab/-/issues/338182). The new `config/database.yml` needs to include a database name to define a database configuration. Only `main:` and `ci:` database -names are supported today. The `main:` needs to always be a first +names are supported. The `main:` database must always be a first entry in a hash. This change applies to decomposed and non-decomposed -change. If an invalidate or deprecated syntax is used the error -or warning will be printed during application start. +change. If an invalid or deprecated syntax is used the error +or warning is printed during application start. ```yaml # Non-decomposed database @@ -603,3 +679,15 @@ production: database: gitlabhq_production_ci ... ``` + +## Foreign keys that cross databases + +There are many places where we use foreign keys that reference across the two +databases. This is not possible to do with two separate PostgreSQL +databases, so we need to replicate the behavior we get from PostgreSQL in a +performant way. We can't, and shouldn't, try to replicate the data guarantees +given by PostgreSQL which prevent creating invalid references, but we still need a +way to replace cascading deletes so we don't end up with orphaned data +or records that point to nowhere, which might lead to bugs. As such we created +["loose foreign keys"](loose_foreign_keys.md) which is an asynchronous +process of cleaning up orphaned records. |