Add latest changes from gitlab-org/gitlab@14-1-stable-ee (v14.1.0-rc42)

author: GitLab Bot <gitlab-bot@gitlab.com> 2021-07-20 12:55:51 +0300
committer: GitLab Bot <gitlab-bot@gitlab.com> 2021-07-20 12:55:51 +0300
commit: e8d2c2579383897a1dd7f9debd359abe8ae8373d (patch)
tree: c42be41678c2586d49a75cabce89322082698334 /doc/development/database
parent: fc845b37ec3a90aaa719975f607740c22ba6a113 (diff)
7 files changed, 226 insertions, 5 deletions
diff --git a/doc/development/database/database_reviewer_guidelines.md b/doc/development/database/database_reviewer_guidelines.md
index 16734dada13..7a9c08d9d49 100644
--- a/doc/development/database/database_reviewer_guidelines.md
+++ b/doc/development/database/database_reviewer_guidelines.md
@@ -62,6 +62,9 @@ The following guides provide a quick introduction and links to follow on more ad
 - Guide on [understanding EXPLAIN plans](../understanding_explain_plans.md).
 - [Explaining the unexplainable series in `depesz`](https://www.depesz.com/tag/unexplainable/).
 
+We also have licensed access to The Art of PostgreSQL. If you are interested in getting access, please check out the
+[issue (confidential)](https://gitlab.com/gitlab-org/database-team/team-tasks/-/issues/23).
+
 Finally, you can find various guides in the [Database guides](index.md) page that cover more specific
 topics and use cases. The most frequently required during database reviewing are the following:
 
diff --git a/doc/development/database/multiple_databases.md b/doc/development/database/multiple_databases.md
new file mode 100644
index 00000000000..2895cef86fc
--- /dev/null
+++ b/doc/development/database/multiple_databases.md
@@ -0,0 +1,101 @@
+---
+stage: Enablement
+group: Sharding
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Multiple Databases
+
+In order to scale GitLab, the GitLab application database
+will be [decomposed into multiple
+databases](https://gitlab.com/groups/gitlab-org/-/epics/6168).
+
+## CI Database
+
+Support for configuring the GitLab Rails application to use a distinct
+database for CI tables was added in [GitLab
+14.1](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/64289). This
+feature is still under development, and is not ready for production use.
+
+By default, GitLab is configured to use only one main database. To
+opt in to using both a main database and a CI database, modify the
+`config/database.yml` file to have both a `main` and a `ci` database
+configuration. For example, given a `config/database.yml` like the one below:
+
+```yaml
+development:
+  adapter: postgresql
+  encoding: unicode
+  database: gitlabhq_development
+  host: /path/to/gdk/postgresql
+  pool: 10
+  prepared_statements: false
+  variables:
+    statement_timeout: 120s
+
+test: &test
+  adapter: postgresql
+  encoding: unicode
+  database: gitlabhq_test
+  host: /path/to/gdk/postgresql
+  pool: 10
+  prepared_statements: false
+  variables:
+    statement_timeout: 120s
+```
+
+Edit the `config/database.yml` to look like this:
+
+```yaml
+development:
+  main:
+    adapter: postgresql
+    encoding: unicode
+    database: gitlabhq_development
+    host: /path/to/gdk/postgresql
+    pool: 10
+    prepared_statements: false
+    variables:
+      statement_timeout: 120s
+  ci:
+    adapter: postgresql
+    encoding: unicode
+    database: gitlabhq_development_ci
+    migrations_paths: db/ci_migrate
+    host: /path/to/gdk/postgresql
+    pool: 10
+    prepared_statements: false
+    variables:
+      statement_timeout: 120s
+
+test: &test
+  main:
+    adapter: postgresql
+    encoding: unicode
+    database: gitlabhq_test
+    host: /path/to/gdk/postgresql
+    pool: 10
+    prepared_statements: false
+    variables:
+      statement_timeout: 120s
+  ci:
+    adapter: postgresql
+    encoding: unicode
+    database: gitlabhq_test_ci
+    migrations_paths: db/ci_migrate
+    host: /path/to/gdk/postgresql
+    pool: 10
+    prepared_statements: false
+    variables:
+      statement_timeout: 120s
+```
+
+### Migrations
+
+Any migrations that affect `Ci::BaseModel` models
+and their tables must be placed in two directories for now:
+
+- `db/migrate`
+- `db/ci_migrate`
+
+We aim to keep the schema for these tables the same across both databases.
diff --git a/doc/development/database/not_null_constraints.md b/doc/development/database/not_null_constraints.md
index 9d1850f5504..48b198b46bd 100644
--- a/doc/development/database/not_null_constraints.md
+++ b/doc/development/database/not_null_constraints.md
@@ -175,7 +175,7 @@ class CleanupEpicsWithNullDescription < ActiveRecord::Migration[6.0]
 end
 ```
 
-#### Validate the text limit (next release)
+#### Validate the `NOT NULL` constraint (next release)
 
 Validating the `NOT NULL` constraint will scan the whole table and make sure that each record is correct.
 
@@ -201,7 +201,7 @@ end
 
 ## `NOT NULL` constraints on large tables
 
-If you have to clean up a text column for a [high-traffic table](../migration_style_guide.md#high-traffic-tables)
+If you have to clean up a nullable column for a [high-traffic table](../migration_style_guide.md#high-traffic-tables)
 (for example, the `artifacts` in `ci_builds`), your background migration will go on for a while and
 it will need an additional [background migration cleaning up](../background_migrations.md#cleaning-up)
 in the release after adding the data migration.
diff --git a/doc/development/database/pagination_guidelines.md b/doc/development/database/pagination_guidelines.md
index ce656851f86..b7209b4ca30 100644
--- a/doc/development/database/pagination_guidelines.md
+++ b/doc/development/database/pagination_guidelines.md
@@ -66,7 +66,7 @@ Offset-based pagination is the easiest way to paginate over records, however, it
 - Avoid using page numbers, use next and previous page buttons.
   - Keyset pagination doesn't support page numbers.
 - For APIs, advise against building URLs for the next page by "hand".
-  - Promote the usage of the [`Link` header](../../api/README.md#pagination-link-header) where the URLs for the next and previous page are provided by the backend.
+  - Promote the usage of the [`Link` header](../../api/index.md#pagination-link-header) where the URLs for the next and previous page are provided by the backend.
   - This way changing the URL structure is possible without breaking backward compatibility.
 
 NOTE:
diff --git a/doc/development/database/pagination_performance_guidelines.md b/doc/development/database/pagination_performance_guidelines.md
index ade1e853027..90e4faf2de7 100644
--- a/doc/development/database/pagination_performance_guidelines.md
+++ b/doc/development/database/pagination_performance_guidelines.md
@@ -5,7 +5,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
 ---
 
 # Pagination performance guidelines
-                                                                                   
+
 The following document gives a few ideas for improving the pagination (sorting) performance. These apply both on [offset](pagination_guidelines.md#offset-pagination) and [keyset](pagination_guidelines.md#keyset-pagination) paginations.
 
 ## Tie-breaker column
diff --git a/doc/development/database/rename_database_tables.md b/doc/development/database/rename_database_tables.md
index 743558fae19..8ac50d2c0a0 100644
--- a/doc/development/database/rename_database_tables.md
+++ b/doc/development/database/rename_database_tables.md
@@ -81,7 +81,7 @@ Execute a standard migration (not a post-migration):
 when naming indexes, so there is a possibility that not all indexes are properly renamed. After running
 the migration locally, check if there are inconsistently named indexes (`db/structure.sql`). Those can be
 renamed manually in a separate migration, which can be also part of the release M.N+1.
-- Foreign key columns might still contain the old table name. For smaller tables, follow our [standard column 
+- Foreign key columns might still contain the old table name. For smaller tables, follow our [standard column
 rename process](../avoiding_downtime_in_migrations.md#renaming-columns)
 - Avoid renaming database tables which are using with triggers.
 - Table modifications (add or remove columns) are not allowed during the rename process, please make sure that all changes to the table happen before the rename migration is started (or in the next release).
diff --git a/doc/development/database/transaction_guidelines.md b/doc/development/database/transaction_guidelines.md
new file mode 100644
index 00000000000..1c25496b153
--- /dev/null
+++ b/doc/development/database/transaction_guidelines.md
@@ -0,0 +1,117 @@
+---
+stage: Enablement
+group: Database
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Transaction guidelines
+
+This document gives a few examples of the usage of database transactions in application code.
+
+For further reference, check the PostgreSQL documentation about [transactions](https://www.postgresql.org/docs/current/tutorial-transactions.html).
+
+## Database decomposition and sharding
+
+The [sharding group](https://about.gitlab.com/handbook/engineering/development/enablement/sharding) plans to split the main GitLab database and move some of the database tables to other database servers.
+
+The group will start decomposing the `ci_*` related database tables first. To maintain the current application development experience, tooling and static analyzers will be added to the codebase to ensure correct data access and data modification methods. By using the correct form for defining database transactions, we can save significant refactoring work in the future.
+
+## The transaction block
+
+The `ActiveRecord` library provides a convenient way to group database statements into a transaction.
+
+```ruby
+issue = Issue.find(10)
+project = issue.project
+
+ApplicationRecord.transaction do
+  issue.update!(title: 'updated title')
+  project.update!(last_update_at: Time.now)
+end
+```
+
+This transaction involves two database tables. If an error occurs, each `UPDATE` statement is rolled back to the previous, consistent state.
+
+NOTE:
+Avoid referencing the `ActiveRecord::Base` class and use `ApplicationRecord` instead.
+
+## Transaction and database locks
+
+When a transaction block is opened, the database will try to acquire the necessary locks on the resources. The type of locks will depend on the actual database statements.
+
+Consider a concurrent update scenario where the following code is executed at the same time from two different processes:
+
+```ruby
+issue = Issue.find(10)
+project = issue.project
+
+ApplicationRecord.transaction do
+  issue.update!(title: 'updated title')
+  project.update!(last_update_at: Time.now)
+end
+```
+
+The database will try to acquire the `FOR UPDATE` lock for the referenced `issue` and `project` records. In our case, two transactions compete for these locks, and only one of them will acquire them. The other transaction has to wait in the lock queue until the first transaction finishes, so its execution is blocked at this point.
+
+## Transaction speed
+
+To prevent lock contention and maintain stable application performance, the transaction block should finish as fast as possible. When a transaction acquires locks, it will hold on to them until the transaction finishes.
+
+Apart from application performance, long-running transactions can also affect the application upgrade processes by blocking database migrations.
+
+### Dangerous example: 3rd party API calls
+
+Consider the following example:
+
+```ruby
+member = Member.find(5)
+
+Member.transaction do
+  member.update!(notification_email_sent: true)
+
+  member.send_notification_email
+end
+```
+
+Here, we ensure that the `notification_email_sent` column is updated only when the `send_notification_email` method succeeds. The `send_notification_email` method executes a network request to an email sending service. If the underlying infrastructure does not specify timeouts or the network call takes too long, the database transaction will stay open.
+
+Ideally, a transaction should only contain database statements.
+
+Avoid doing in a `transaction` block:
+
+- External network requests such as: triggering Sidekiq jobs, sending emails, HTTP API calls and running database statements using a different connection.
+- File system operations.
+- Long, CPU intensive computation.
+- Calling `sleep(n)`.
+
+## Explicit model referencing
+
+If a transaction modifies records from the same database table, it's advised to use the `Model.transaction` block:
+
+```ruby
+build_1 = Ci::Build.find(1)
+build_2 = Ci::Build.find(2)
+
+Ci::Build.transaction do
+  build_1.touch
+  build_2.touch
+end
+```
+
+The transaction above will use the same database connection for the transaction as the models in the `transaction` block. In a multi-database environment the following example would be dangerous:
+
+```ruby
+# `ci_builds` table is located on another database
+class Ci::Build < CiDatabase
+end
+
+build_1 = Ci::Build.find(1)
+build_2 = Ci::Build.find(2)
+
+ActiveRecord::Base.transaction do
+  build_1.touch
+  build_2.touch
+end
+```
+
+The `ActiveRecord::Base` class uses a different database connection than the `Ci::Build` records. The two statements in the transaction block will not be part of the transaction and will not be rolled back in case something goes wrong. They act as 3rd party calls.
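
The connection mismatch described in the "Explicit model referencing" section of `transaction_guidelines.md` can be illustrated with a toy model in plain Ruby. This is only a sketch under stated assumptions: `FakeConnection` and `failing_touch` are hypothetical names invented for the illustration, they do not exist in ActiveRecord or GitLab, and the snapshot-based rollback is a simplification of what a real database does.

```ruby
# Toy model only: FakeConnection is a hypothetical stand-in for a database
# connection. It is not part of ActiveRecord or the GitLab codebase.
class FakeConnection
  attr_reader :rows

  def initialize
    @rows = {}
  end

  # Runs the block and restores the pre-transaction rows if it raises.
  def transaction
    snapshot = @rows.dup
    yield
  rescue StandardError
    @rows = snapshot
    raise
  end

  def write(key, value)
    @rows[key] = value
  end
end

# Opens a transaction on one connection, writes through another, then fails.
# Returns the rows left on the connection that was written to.
def failing_touch(transaction_conn, write_conn)
  transaction_conn.transaction do
    write_conn.write(:build_1, 'touched')
    raise 'something went wrong'
  end
rescue RuntimeError
  write_conn.rows
end

main_db = FakeConnection.new
ci_db = FakeConnection.new

# Mismatched connections: the write to ci_db survives the failed transaction.
puts failing_touch(main_db, ci_db).key?(:build_1) # prints "true"

# Matching connection: the write is rolled back together with the transaction.
same_db = FakeConnection.new
puts failing_touch(same_db, same_db).key?(:build_1) # prints "false"
```

In the first call the transaction is opened on one connection while the write goes through another, which mirrors wrapping `Ci::Build` updates in `ActiveRecord::Base.transaction` once `ci_builds` lives on a separate database. In the second call the same connection opens the transaction and performs the write, which corresponds to `Ci::Build.transaction`.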