Diffstat (limited to 'doc/development/database')
-rw-r--r-- doc/development/database/add_foreign_key_to_existing_column.md | 2
-rw-r--r-- doc/development/database/adding_database_indexes.md | 16
-rw-r--r-- doc/development/database/batched_background_migrations.md | 6
-rw-r--r-- doc/development/database/ci_mirrored_tables.md | 4
-rw-r--r-- doc/development/database/client_side_connection_pool.md | 4
-rw-r--r-- doc/development/database/database_debugging.md | 2
-rw-r--r-- doc/development/database/loose_foreign_keys.md | 2
-rw-r--r-- doc/development/database/multiple_databases.md | 2
-rw-r--r-- doc/development/database/not_null_constraints.md | 9
-rw-r--r-- doc/development/database/ordering_table_columns.md | 2
-rw-r--r-- doc/development/database/strings_and_the_text_data_type.md | 2
-rw-r--r-- doc/development/database/understanding_explain_plans.md | 18
12 files changed, 39 insertions, 30 deletions
diff --git a/doc/development/database/add_foreign_key_to_existing_column.md b/doc/development/database/add_foreign_key_to_existing_column.md
index 8a8fe3c0a1e..4be3296b2bb 100644
--- a/doc/development/database/add_foreign_key_to_existing_column.md
+++ b/doc/development/database/add_foreign_key_to_existing_column.md
@@ -71,7 +71,7 @@ Migration file for adding `NOT VALID` foreign key:
```ruby
class AddNotValidForeignKeyToEmailsUser < Gitlab::Database::Migration[2.0]
def up
- add_concurrent_foreign_key :emails, :users, on_delete: :cascade, validate: false
+ add_concurrent_foreign_key :emails, :users, column: :user_id, on_delete: :cascade, validate: false
end
def down
diff --git a/doc/development/database/adding_database_indexes.md b/doc/development/database/adding_database_indexes.md
index 8abd7c8298e..774703bd54f 100644
--- a/doc/development/database/adding_database_indexes.md
+++ b/doc/development/database/adding_database_indexes.md
@@ -328,8 +328,8 @@ asynchronously during weekend hours. Due to generally lower traffic and fewer de
index destruction can proceed at a lower level of risk.
1. [Schedule the index to be removed](#schedule-the-index-to-be-removed).
-1. [Verify the MR was deployed and the index exists in production](#verify-the-mr-was-deployed-and-the-index-exists-in-production).
-1. [Add a migration to create the index synchronously](#add-a-migration-to-create-the-index-synchronously).
+1. [Verify the MR was deployed and the index no longer exists in production](#verify-the-mr-was-deployed-and-the-index-no-longer-exists-in-production).
+1. [Add a migration to destroy the index synchronously](#add-a-migration-to-destroy-the-index-synchronously).
### Schedule the index to be removed
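A rough sketch of the scheduling step, assuming the asynchronous helper `prepare_async_index_removal` (table, column, and index name are illustrative):

```ruby
# In a post-deployment migration:
def up
  prepare_async_index_removal :ci_builds, :some_column, name: 'index_ci_builds_on_some_column'
end
```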
@@ -357,21 +357,21 @@ to remove them.
You must test the database index changes locally before creating a merge request.
-### Verify the MR was deployed and the index exists in production
+### Verify the MR was deployed and the index no longer exists in production
You can verify if the MR was deployed to GitLab.com with
`/chatops run auto_deploy status <merge_sha>`. To verify the existence of
the index, you can:
- Use a meta-command in `#database-lab`, for example: `\d <index_name>`.
- - Make sure the index is not [`invalid`](https://www.postgresql.org/docs/12/sql-createindex.html#:~:text=The%20psql%20%5Cd%20command%20will%20report%20such%20an%20index%20as%20INVALID).
+- Make sure the index no longer exists.
- Ask someone in `#database` to check if the index exists.
- If you have access, you can verify directly on production or in a
production clone.
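For example, a quick check in `#database-lab` (index name illustrative); once the index has been dropped, `\d` reports that no matching relation was found:

```sql
\d index_ci_builds_on_some_column
```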
### Add a migration to destroy the index synchronously
-After you verify the index exists in the production database, create a second
+After you verify the index no longer exists in the production database, create a second
merge request that removes the index synchronously. The schema changes must be
updated and committed to `structure.sql` in this second merge request.
The synchronous migration results in a no-op on GitLab.com, but you should still add the
@@ -379,7 +379,7 @@ migration as expected for other installations. For example, to
create the second migration for the previous asynchronous example:
**WARNING:**
-Verify that the index no longer exist in production before merging a second migration with `remove_concurrent_index_by_name`.
+Verify that the index no longer exists in production before merging a second migration with `remove_concurrent_index_by_name`.
If the second migration is deployed before the index has been destroyed,
the index is destroyed synchronously when the second migration executes.
@@ -395,7 +395,7 @@ def up
end
def down
- add_concurrent_index :ci_builds, :some_column, INDEX_NAME
+ add_concurrent_index :ci_builds, :some_column, name: INDEX_NAME
end
```
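Pieced together, the second migration might look roughly like this (class and index names are illustrative):

```ruby
class RemoveIndexCiBuildsOnSomeColumn < Gitlab::Database::Migration[2.0]
  INDEX_NAME = 'index_ci_builds_on_some_column'

  disable_ddl_transaction!

  def up
    # No-op on GitLab.com when the index was already destroyed asynchronously.
    remove_concurrent_index_by_name :ci_builds, INDEX_NAME
  end

  def down
    add_concurrent_index :ci_builds, :some_column, name: INDEX_NAME
  end
end
```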
@@ -403,7 +403,7 @@ end
To test changes for removing an index, use the asynchronous index helpers on your local environment:
-1. Enable the feature flags by running `Feature.enable(:database_async_index_destruction)` and `Feature.enable(:database_reindexing)` in the Rails console.
+1. Enable the feature flag by running `Feature.enable(:database_reindexing)` in the Rails console.
1. Run `bundle exec rails db:migrate` which should create an entry in the `postgres_async_indexes` table.
1. Run `bundle exec rails gitlab:db:reindex` to destroy the index asynchronously.
1. To verify the index, open the PostgreSQL console by using the [GDK](https://gitlab.com/gitlab-org/gitlab-development-kit/-/blob/main/doc/howto/postgresql.md)
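For example, you can confirm the queued entry from step 2 directly:

```sql
-- Run in the PostgreSQL console; a row here means the asynchronous
-- destruction was queued by the migration.
SELECT * FROM postgres_async_indexes;
```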
diff --git a/doc/development/database/batched_background_migrations.md b/doc/development/database/batched_background_migrations.md
index edb22fcf436..192cd0d3e49 100644
--- a/doc/development/database/batched_background_migrations.md
+++ b/doc/development/database/batched_background_migrations.md
@@ -233,7 +233,7 @@ class CopyColumnUsingBackgroundMigrationJob < BatchedMigrationJob
end
```
-### Additional filters
+## Additional filters
By default, when creating background jobs to perform the migration, batched background migrations
iterate over the full specified table. This iteration is done using the
@@ -276,6 +276,10 @@ In the second (filtered) example, we know exactly 100 will be updated with each
end
```
+ NOTE:
+ For EE migrations that define `scope_to`, ensure the module extends `ActiveSupport::Concern`.
+ Otherwise, records are processed without taking the scope into consideration.
+
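A sketch of the EE module shape that note refers to; the `prepended` block and all names are illustrative, not a prescribed API:

```ruby
module EE
  module Gitlab
    module BackgroundMigration
      module MyBatchedMigration
        extend ActiveSupport::Concern

        prepended do
          # Without `extend ActiveSupport::Concern`, this block is not
          # evaluated, and the batched migration silently ignores the scope.
          scope_to ->(relation) { relation.where(type: 'Epic') }
        end
      end
    end
  end
end
```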
1. In the post-deployment migration, enqueue the batched background migration:
```ruby
diff --git a/doc/development/database/ci_mirrored_tables.md b/doc/development/database/ci_mirrored_tables.md
index 06f0087fafe..1d285e607fa 100644
--- a/doc/development/database/ci_mirrored_tables.md
+++ b/doc/development/database/ci_mirrored_tables.md
@@ -10,9 +10,9 @@ info: To determine the technical writer assigned to the Stage/Group associated w
As part of the database [decomposition work](https://gitlab.com/groups/gitlab-org/-/epics/6168),
which had the goal of splitting the single database GitLab uses into two databases: `main` and
-`ci`, came the big challenge of
+`ci`, came the big challenge of
[removing all joins between the `main` and the `ci` tables](multiple_databases.md#removing-joins-between-ci-and-non-ci-tables).
-That is because PostgreSQL doesn't support joins between tables that belong to different databases.
+That is because PostgreSQL doesn't support joins between tables that belong to different databases.
However, some core application models in the main database are queried very often by the CI side.
For example:
diff --git a/doc/development/database/client_side_connection_pool.md b/doc/development/database/client_side_connection_pool.md
index 3cd0e836a8d..3143391a553 100644
--- a/doc/development/database/client_side_connection_pool.md
+++ b/doc/development/database/client_side_connection_pool.md
@@ -10,7 +10,7 @@ Ruby processes accessing the database through
ActiveRecord automatically calculate the connection-pool size for the
process based on the concurrency.
-Because of the way [Ruby on Rails manages database connections](#connection-lifecycle),
+Because of the way [Ruby on Rails manages database connections](#connection-lifecycle),
it is important that we have at
least as many connections as we have threads. While there is a 'pool'
setting in [`database.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/database.yml.postgresql), it is not very practical because you need to
@@ -28,7 +28,7 @@ because connections are instantiated lazily.
## Troubleshooting connection-pool issues
-The connection-pool usage can be seen per environment in the
+The connection-pool usage can be seen per environment in the
[connection-pool saturation dashboard](https://dashboards.gitlab.net/d/alerts-sat_rails_db_connection_pool/alerts-rails_db_connection_pool-saturation-detail?orgId=1).
If the connection-pool is too small, this would manifest in
diff --git a/doc/development/database/database_debugging.md b/doc/development/database/database_debugging.md
index 5921dc942f2..591e526cc96 100644
--- a/doc/development/database/database_debugging.md
+++ b/doc/development/database/database_debugging.md
@@ -4,7 +4,7 @@ group: Database
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---
-# Troubleshooting and Debugging Database
+# Troubleshooting and debugging the database
This section is to help give some copy-pasta you can use as a reference when you
run into some head-banging database problems.
diff --git a/doc/development/database/loose_foreign_keys.md b/doc/development/database/loose_foreign_keys.md
index 8dbccf048d7..0af12939629 100644
--- a/doc/development/database/loose_foreign_keys.md
+++ b/doc/development/database/loose_foreign_keys.md
@@ -221,7 +221,7 @@ ON DELETE CASCADE;
```
The migration must run after the `DELETE` trigger is installed and the loose
-foreign key definition is deployed. As such, it must be a
+foreign key definition is deployed. As such, it must be a
[post-deployment migration](post_deployment_migrations.md) dated after the migration for the
trigger. If the foreign key is deleted earlier, there is a good chance of
introducing data inconsistency which needs manual cleanup:
diff --git a/doc/development/database/multiple_databases.md b/doc/development/database/multiple_databases.md
index 31fc454f8a7..034a2c2e438 100644
--- a/doc/development/database/multiple_databases.md
+++ b/doc/development/database/multiple_databases.md
@@ -7,7 +7,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
# Multiple Databases
To allow GitLab to scale further we
-[decomposed the GitLab application database into multiple databases](https://gitlab.com/groups/gitlab-org/-/epics/6168).
+[decomposed the GitLab application database into multiple databases](https://gitlab.com/groups/gitlab-org/-/epics/6168).
The two databases are `main` and `ci`. GitLab supports being run with either one database or two databases.
On GitLab.com we are using two separate databases.
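As an illustrative, heavily abbreviated sketch of what a two-database `config/database.yml` looks like (the `config/database.yml.postgresql` template in the repository is authoritative):

```yaml
production:
  main:
    adapter: postgresql
    database: gitlabhq_production
  ci:
    adapter: postgresql
    database: gitlabhq_production_ci
```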
diff --git a/doc/development/database/not_null_constraints.md b/doc/development/database/not_null_constraints.md
index 9b3d017b09f..cd2adc3ca28 100644
--- a/doc/development/database/not_null_constraints.md
+++ b/doc/development/database/not_null_constraints.md
@@ -53,8 +53,13 @@ end
## Add a `NOT NULL` constraint to an existing column
-Adding `NOT NULL` to existing database columns requires multiple steps split into at least two
-different releases:
+Adding `NOT NULL` to existing database columns usually requires multiple steps split into at least two
+different releases. If your table is small enough that you don't need to
+use a background migration, you can include all these in the same merge
+request. We recommend using separate migrations to reduce
+transaction durations.
+
+The steps required are:
1. Release `N.M` (current release)
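As a sketch of the constraint-adding step in release `N.M`, assuming the `add_not_null_constraint` helper (class, table, and column names are illustrative):

```ruby
class AddNotNullConstraintToEpicsDescription < Gitlab::Database::Migration[2.0]
  disable_ddl_transaction!

  def up
    # `validate: false` enforces NOT NULL for new writes only; existing
    # rows are validated in a later step, after any backfill.
    add_not_null_constraint :epics, :description, validate: false
  end

  def down
    remove_not_null_constraint :epics, :description
  end
end
```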
diff --git a/doc/development/database/ordering_table_columns.md b/doc/development/database/ordering_table_columns.md
index 7cd3d4fb208..a16df6a4499 100644
--- a/doc/development/database/ordering_table_columns.md
+++ b/doc/development/database/ordering_table_columns.md
@@ -117,7 +117,7 @@ divided into fixed size chunks as follows:
This means that excluding the variable sized data and tuple header, we need at
least 8 * 6 = 48 bytes per row.
-We can optimise this by using the following column order instead:
+We can optimize this by using the following column order instead:
| Column | Type | Size |
|:--------------|:----------------------------|:---------|
diff --git a/doc/development/database/strings_and_the_text_data_type.md b/doc/development/database/strings_and_the_text_data_type.md
index e2e1191018b..4b5d1fc8f72 100644
--- a/doc/development/database/strings_and_the_text_data_type.md
+++ b/doc/development/database/strings_and_the_text_data_type.md
@@ -148,7 +148,7 @@ to update the `title_html` with a title that has more than 1024 characters, the
a database error.
Adding or removing a constraint to an existing attribute requires that any application changes are
-deployed _first_,
+deployed _first_,
otherwise servers still in the old version of the application
[may try to update the attribute with invalid values](../multi_version_compatibility.md#ci-artifact-uploads-were-failing).
For these reasons, `add_text_limit` should run in a post-deployment migration.
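A sketch of such a post-deployment migration, reusing the `title_html` example above (class name is illustrative):

```ruby
class AddTextLimitToIssuesTitleHtml < Gitlab::Database::Migration[2.0]
  disable_ddl_transaction!

  def up
    add_text_limit :issues, :title_html, 1024
  end

  def down
    # Reverses the limit only; it does not touch the data.
    remove_text_limit :issues, :title_html
  end
end
```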
diff --git a/doc/development/database/understanding_explain_plans.md b/doc/development/database/understanding_explain_plans.md
index 446a84d5232..c3cb408b35f 100644
--- a/doc/development/database/understanding_explain_plans.md
+++ b/doc/development/database/understanding_explain_plans.md
@@ -252,7 +252,7 @@ A scan on an index that required retrieving some data from the table.
Bitmap scans fall between sequential scans and index scans. These are typically
used when we would read too much data from an index scan, but too little to
-perform a sequential scan. A bitmap scan uses what is known as a
+perform a sequential scan. A bitmap scan uses what is known as a
[bitmap index](https://en.wikipedia.org/wiki/Bitmap_index) to perform its work.
The [source code of PostgreSQL](https://gitlab.com/postgres/postgres/blob/REL_11_STABLE/src/include/nodes/plannodes.h#L441)
@@ -295,9 +295,9 @@ because the previous node produced 36 rows.
This means that nested loops can quickly slow the query down if the various
child nodes keep producing many rows.
-## Optimising queries
+## Optimizing queries
-With that out of the way, let's see how we can optimise a query. Let's use the
+With that out of the way, let's see how we can optimize a query. Let's use the
following query as an example:
```sql
@@ -453,7 +453,7 @@ this works is that now PostgreSQL no longer needs to apply a `Filter`, as the
index only contains `twitter` values that are not empty.
Keep in mind that you shouldn't just add partial indexes every time you want to
-optimise a query. Every index has to be updated for every write, and they may
+optimize a query. Every index has to be updated for every write, and they may
require quite a bit of space, depending on the amount of indexed data. As a
result, first check if there are any existing indexes you may be able to reuse.
If there aren't any, check if you can perhaps slightly change an existing one to
@@ -471,10 +471,10 @@ buffer numbers. [Database Lab Engine](#database-lab-engine) guarantees that the
identical to production (and overall number of buffers is the same as on production),
but difference in cache state and I/O speed may lead to different timings.
-## Queries that can't be optimised
+## Queries that can't be optimized
-Now that we have seen how to optimise a query, let's look at another query that
-we might not be able to optimise:
+Now that we have seen how to optimize a query, let's look at another query that
+we might not be able to optimize:
```sql
EXPLAIN (ANALYZE, BUFFERS)
@@ -546,7 +546,7 @@ improve this query, other than _not_ running it at all.
What is important here is that while some may recommend to straight up add an
index the moment you see a sequential scan, it is _much more important_ to first
understand what your query does, how much data it retrieves, and so on. After
-all, you can not optimise something you do not understand.
+all, you cannot optimize something you do not understand.
### Cardinality and selectivity
@@ -567,7 +567,7 @@ using an index is not worth it, because it would produce almost no unique rows.
## Rewriting queries
-So the above query can't really be optimised as-is, or at least not much. But
+So the above query can't really be optimized as-is, or at least not much. But
what if we slightly change the purpose of it? What if instead of retrieving all
projects with `visibility_level` 0 or 20, we retrieve those that a user
interacted with somehow?
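One possible shape for that rewrite, as a sketch only (it assumes a `user_interacted_projects` join table, which is not defined in this excerpt):

```sql
EXPLAIN ANALYZE
SELECT COUNT(*)
FROM projects
INNER JOIN user_interacted_projects
  ON user_interacted_projects.project_id = projects.id
WHERE user_interacted_projects.user_id = 1
AND projects.visibility_level IN (0, 20);
```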