Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitlab-foss.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
Diffstat (limited to 'doc/development/database/multiple_databases.md')
-rw-r--r--doc/development/database/multiple_databases.md118
1 files changed, 107 insertions, 11 deletions
diff --git a/doc/development/database/multiple_databases.md b/doc/development/database/multiple_databases.md
index 0fd9f821fab..0ba752ba3a6 100644
--- a/doc/development/database/multiple_databases.md
+++ b/doc/development/database/multiple_databases.md
@@ -6,16 +6,14 @@ info: To determine the technical writer assigned to the Stage/Group associated w
# Multiple Databases
-In order to scale GitLab, the GitLab application database
-will be [decomposed into multiple
-databases](https://gitlab.com/groups/gitlab-org/-/epics/6168).
+To scale GitLab, the we are
+[decomposing the GitLab application database into multiple databases](https://gitlab.com/groups/gitlab-org/-/epics/6168).
-## CI Database
+## CI/CD Database
-Support for configuring the GitLab Rails application to use a distinct
-database for CI tables was added in [GitLab
-14.1](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/64289). This
-feature is still under development, and is not ready for production use.
+> Support for configuring the GitLab Rails application to use a distinct
+database for CI/CD tables was [introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/64289)
+in GitLab 14.1. This feature is still under development, and is not ready for production use.
By default, GitLab is configured to use only one main database. To
opt-in to use a main database, and CI database, modify the
@@ -92,8 +90,8 @@ test: &test
### Migrations
-Any migrations that affect `Ci::CiDatabaseRecord` models
-and their tables must be placed in two directories for now:
+Place any migrations that affect `Ci::CiDatabaseRecord` models
+and their tables in two directories:
- `db/migrate`
- `db/ci_migrate`
@@ -394,7 +392,8 @@ You can see a real example of using this method for fixing a cross-join in
#### Allowlist for existing cross-joins
A cross-join across databases can be explicitly allowed by wrapping the code in the
-`::Gitlab::Database.allow_cross_joins_across_databases` helper method.
+`::Gitlab::Database.allow_cross_joins_across_databases` helper method. Alternative
+way is to mark a given relation as `relation.allow_cross_joins_across_databases`.
This method should only be used:
@@ -405,16 +404,113 @@ This method should only be used:
The `allow_cross_joins_across_databases` helper method can be used as follows:
```ruby
+# Scope the block executing a object from database
::Gitlab::Database.allow_cross_joins_across_databases(url: 'https://gitlab.com/gitlab-org/gitlab/-/issues/336590') do
subject.perform(1, 4)
end
```
+```ruby
+# Mark a relation as allowed to cross-join databases
+def find_actual_head_pipeline
+ all_pipelines
+ .allow_cross_joins_across_databases(url: 'https://gitlab.com/gitlab-org/gitlab/-/issues/336891')
+ .for_sha_or_source_sha(diff_head_sha)
+ .first
+end
+```
+
The `url` parameter should point to an issue with a milestone for when we intend
to fix the cross-join. If the cross-join is being used in a migration, we do not
need to fix the code. See <https://gitlab.com/gitlab-org/gitlab/-/issues/340017>
for more details.
+### Removing cross-database transactions
+
+When dealing with multiple databases, it's important to pay close attention to data modification
+that affects more than one database.
+[Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/339811) GitLab 14.4, an automated check
+prevents cross-database modifications.
+
+When at least two different databases are modified during a transaction initiated on any database
+server, the application triggers a cross-database modification error (only in test environment).
+
+Example:
+
+```ruby
+# Open transaction on Main DB
+ApplicationRecord.transaction do
+ ci_build.update!(updated_at: Time.current) # UPDATE on CI DB
+ ci_build.project.update!(updated_at: Time.current) # UPDATE on Main DB
+end
+# raises error: Cross-database data modification of 'main, ci' were detected within
+# a transaction modifying the 'ci_build, projects' tables
+```
+
+The code example above updates the timestamp for two records within a transaction. With the
+ongoing work on the CI database decomposition, we cannot ensure the schematics of a database
+transaction.
+If the second update query fails, the first update query will not be
+rolled back because the `ci_build` record is located on a different database server. For
+more information, look at the
+[transaction guidelines](transaction_guidelines.md#dangerous-example-third-party-api-calls)
+page.
+
+#### Fixing cross-database errors
+
+##### Removing the transaction block
+
+Without an open transaction, the cross-database modification check cannot raise an error.
+By making this change, we sacrifice consistency. In case of an application failure after the
+first `UPDATE` query, the second `UPDATE` query will never execute.
+
+The same code without the `transaction` block:
+
+```ruby
+ci_build.update!(updated_at: Time.current) # CI DB
+ci_build.project.update!(updated_at: Time.current) # Main DB
+```
+
+##### Async processing
+
+If we need more guarantee that an operation finishes the work consistently we can execute it
+within a background job. A background job is scheduled asynchronously and retried several times
+in case of an error. There is still a very small chance of introducing inconsistency.
+
+Example:
+
+```ruby
+current_time = Time.current
+
+MyAsyncConsistencyJob.perform_async(cu_build.id)
+
+ci_build.update!(updated_at: current_time)
+ci_build.project.update!(updated_at: current_time)
+```
+
+The `MyAsyncConsistencyJob` would also attempt to update the timestamp if they differ.
+
+##### Aiming for perfect consistency
+
+At this point, we don't have the tooling (we might not even need it) to ensure similar consistency
+characteristics as we had with one database. If you think that the code you're working on requires
+these properties, then you can disable the cross-database modification check by wrapping to
+offending database queries with a block and create a follow-up issue mentioning the sharding group
+(`gitlab-org/sharding-group`).
+
+```ruby
+Gitlab::Database.allow_cross_joins_across_databases(url: 'gitlab issue URL') do
+ ApplicationRecord.transaction do
+ ci_build.update!(updated_at: Time.current) # UPDATE on CI DB
+ ci_build.project.update!(updated_at: Time.current) # UPDATE on Main DB
+ end
+end
+```
+
+Don't hesitate to reach out to the
+[sharding group](https://about.gitlab.com/handbook/engineering/development/enablement/sharding/)
+for advice.
+
## `config/database.yml`
GitLab will support running multiple databases in the future, for example to [separate tables for the continuous integration features](https://gitlab.com/groups/gitlab-org/-/epics/6167) from the main database. In order to prepare for this change, we [validate the structure of the configuration](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/67877) in `database.yml` to ensure that only known databases are used.