diff options
author | GitLab Bot <gitlab-bot@gitlab.com> | 2022-08-18 11:17:02 +0300 |
---|---|---|
committer | GitLab Bot <gitlab-bot@gitlab.com> | 2022-08-18 11:17:02 +0300 |
commit | b39512ed755239198a9c294b6a45e65c05900235 (patch) | |
tree | d234a3efade1de67c46b9e5a38ce813627726aa7 /doc/development/database/foreign_keys.md | |
parent | d31474cf3b17ece37939d20082b07f6657cc79a9 (diff) |
Add latest changes from gitlab-org/gitlab@15-3-stable-eev15.3.0-rc42
Diffstat (limited to 'doc/development/database/foreign_keys.md')
-rw-r--r-- | doc/development/database/foreign_keys.md | 199 |
1 files changed, 199 insertions, 0 deletions
diff --git a/doc/development/database/foreign_keys.md b/doc/development/database/foreign_keys.md new file mode 100644 index 00000000000..7834e7d53c3 --- /dev/null +++ b/doc/development/database/foreign_keys.md @@ -0,0 +1,199 @@ +--- +stage: Data Stores +group: Database +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments +--- + +# Foreign Keys & Associations + +When adding an association to a model you must also add a foreign key. For +example, say you have the following model: + +```ruby +class User < ActiveRecord::Base + has_many :posts +end +``` + +Add a foreign key here on column `posts.user_id`. This ensures +that data consistency is enforced on database level. Foreign keys also mean that +the database can very quickly remove associated data (for example, when removing a +user), instead of Rails having to do this. + +## Adding Foreign Keys In Migrations + +Foreign keys can be added concurrently using `add_concurrent_foreign_key` as +defined in `Gitlab::Database::MigrationHelpers`. See the +[Migration Style Guide](../migration_style_guide.md) for more information. + +Keep in mind that you can only safely add foreign keys to existing tables after +you have removed any orphaned rows. The method `add_concurrent_foreign_key` +does not take care of this so you must do so manually. See +[adding foreign key constraint to an existing column](add_foreign_key_to_existing_column.md). + +## Updating Foreign Keys In Migrations + +Sometimes a foreign key constraint must be changed, preserving the column +but updating the constraint condition. For example, moving from +`ON DELETE CASCADE` to `ON DELETE SET NULL` or vice-versa. + +PostgreSQL does not prevent you from adding overlapping foreign keys. It +honors the most recently added constraint. This allows us to replace foreign keys without +ever losing foreign key protection on a column. + +To replace a foreign key: + +1. [Add the new foreign key without validation](add_foreign_key_to_existing_column.md#prevent-invalid-records) + + The name of the foreign key constraint must be changed to add a new + foreign key before removing the old one. + + ```ruby + class ReplaceFkOnPackagesPackagesProjectId < Gitlab::Database::Migration[2.0] + disable_ddl_transaction! + + NEW_CONSTRAINT_NAME = 'fk_new' + + def up + add_concurrent_foreign_key(:packages_packages, :projects, column: :project_id, on_delete: :nullify, validate: false, name: NEW_CONSTRAINT_NAME) + end + + def down + with_lock_retries do + remove_foreign_key_if_exists(:packages_packages, column: :project_id, on_delete: :nullify, name: NEW_CONSTRAINT_NAME) + end + end + end + ``` + +1. [Validate the new foreign key](add_foreign_key_to_existing_column.md#validate-the-foreign-key) + + ```ruby + class ValidateFkNew < Gitlab::Database::Migration[2.0] + NEW_CONSTRAINT_NAME = 'fk_new' + + # foreign key added in <link to MR or path to migration adding new FK> + def up + validate_foreign_key(:packages_packages, name: NEW_CONSTRAINT_NAME) + end + + def down + # no-op + end + end + ``` + +1. Remove the old foreign key: + + ```ruby + class RemoveFkOld < Gitlab::Database::Migration[2.0] + OLD_CONSTRAINT_NAME = 'fk_old' + + # new foreign key added in <link to MR or path to migration adding new FK> + # and validated in <link to MR or path to migration validating new FK> + def up + remove_foreign_key_if_exists(:packages_packages, column: :project_id, on_delete: :cascade, name: OLD_CONSTRAINT_NAME) + end + + def down + # Validation is skipped here, so if rolled back, this will need to be revalidated in a separate migration + add_concurrent_foreign_key(:packages_packages, :projects, column: :project_id, on_delete: :cascade, validate: false, name: OLD_CONSTRAINT_NAME) + end + end + ``` + +## Cascading Deletes + +Every foreign key must define an `ON DELETE` clause, and in 99% of the cases +this should be set to `CASCADE`. + +## Indexes + +When adding a foreign key in PostgreSQL the column is not indexed automatically, +thus you must also add a concurrent index. Not doing so results in cascading +deletes being very slow. + +## Naming foreign keys + +By default Ruby on Rails uses the `_id` suffix for foreign keys. So we should +only use this suffix for associations between two tables. If you want to +reference an ID on a third party platform the `_xid` suffix is recommended. + +The spec `spec/db/schema_spec.rb` tests if all columns with the `_id` suffix +have a foreign key constraint. So if that spec fails, don't add the column to +`IGNORED_FK_COLUMNS`, but instead add the FK constraint, or consider naming it +differently. + +## Dependent Removals + +Don't define options such as `dependent: :destroy` or `dependent: :delete` when +defining an association. Defining these options means Rails handles the +removal of data, instead of letting the database handle this in the most +efficient way possible. + +In other words, this is bad and should be avoided at all costs: + +```ruby +class User < ActiveRecord::Base + has_many :posts, dependent: :destroy +end +``` + +Should you truly have a need for this it should be approved by a database +specialist first. + +You should also not define any `before_destroy` or `after_destroy` callbacks on +your models _unless_ absolutely required and only when approved by database +specialists. For example, if each row in a table has a corresponding file on a +file system it may be tempting to add a `after_destroy` hook. This however +introduces non database logic to a model, and means we can no longer rely on +foreign keys to remove the data as this would result in the file system data +being left behind. In such a case you should use a service class instead that +takes care of removing non database data. + +In cases where the relation spans multiple databases you have even +further problems using `dependent: :destroy` or the above hooks. You can +read more about alternatives at +[Avoid `dependent: :nullify` and `dependent: :destroy` across databases](multiple_databases.md#avoid-dependent-nullify-and-dependent-destroy-across-databases). + +## Alternative primary keys with `has_one` associations + +Sometimes a `has_one` association is used to create a one-to-one relationship: + +```ruby +class User < ActiveRecord::Base + has_one :user_config +end + +class UserConfig < ActiveRecord::Base + belongs_to :user +end +``` + +In these cases, there may be an opportunity to remove the unnecessary `id` +column on the associated table, `user_config.id` in this example. Instead, +the originating table ID can be used as the primary key for the associated +table: + +```ruby +create_table :user_configs, id: false do |t| + t.references :users, primary_key: true, default: nil, index: false, foreign_key: { on_delete: :cascade } + ... +end +``` + +Setting `default: nil` ensures a primary key sequence is not created, and since the primary key +automatically gets an index, we set `index: false` to avoid creating a duplicate. +You also need to add the new primary key to the model: + +```ruby +class UserConfig < ActiveRecord::Base + self.primary_key = :user_id + + belongs_to :user +end +``` + +Using a foreign key as primary key saves space but can make +[batch counting](../service_ping/implement.md#batch-counters) in [Service Ping](../service_ping/index.md) less efficient. +Consider using a regular `id` column if the table is relevant for Service Ping. |