Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitlab-foss.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorGitLab Bot <gitlab-bot@gitlab.com>2022-06-21 03:08:43 +0300
committerGitLab Bot <gitlab-bot@gitlab.com>2022-06-21 03:08:43 +0300
commit92ea86691a2a6b3df4b36c7ff00001410303a701 (patch)
tree9b3764b56303c9b65e17007c589a297775834b28 /doc/development
parent991c66333dc7bdb0fd6f7a0b7f7bdf8383285975 (diff)
Add latest changes from gitlab-org/gitlab@master
Diffstat (limited to 'doc/development')
-rw-r--r--doc/development/adding_database_indexes.md107
-rw-r--r--doc/development/cicd/pipeline_wizard.md18
-rw-r--r--doc/development/database/add_foreign_key_to_existing_column.md21
-rw-r--r--doc/development/foreign_keys.md73
-rw-r--r--doc/development/gitlab_flavored_markdown/specification_guide/index.md6
-rw-r--r--doc/development/import_project.md35
6 files changed, 188 insertions, 72 deletions
diff --git a/doc/development/adding_database_indexes.md b/doc/development/adding_database_indexes.md
index f524b04c6eb..050153963a9 100644
--- a/doc/development/adding_database_indexes.md
+++ b/doc/development/adding_database_indexes.md
@@ -20,27 +20,24 @@ WHERE user_id = 2;
Here we are filtering by the `user_id` column and as such a developer may decide
to index this column.
-While in certain cases indexing columns using the above approach may make sense
-it can actually have a negative impact. Whenever you write data to a table any
-existing indexes need to be updated. The more indexes there are the slower this
-can potentially become. Indexes can also take up quite some disk space depending
+While in certain cases indexing columns using the above approach may make sense,
+it can actually have a negative impact. Whenever you write data to a table, any
+existing indexes must also be updated. The more indexes there are, the slower this
+can potentially become. Indexes can also take up significant disk space, depending
on the amount of data indexed and the index type. For example, PostgreSQL offers
-"GIN" indexes which can be used to index certain data types that can not be
-indexed by regular B-tree indexes. These indexes however generally take up more
+`GIN` indexes which can be used to index certain data types that cannot be
+indexed by regular B-tree indexes. These indexes, however, generally take up more
data and are slower to update compared to B-tree indexes.
-Because of all this one should not blindly add a new index for every column used
-to filter data by. Instead one should ask themselves the following questions:
+Because of all this, it's important make the following considerations
+when adding a new index:
-1. Can you write your query in such a way that it re-uses as many existing indexes
- as possible?
-1. Is the data large enough that using an index is actually
- faster than just iterating over the rows in the table?
+1. Do the new queries re-use as many existing indexes as possible?
+1. Is there enough data that using an index is faster than iterating over
+ rows in the table?
1. Is the overhead of maintaining the index worth the reduction in query
timings?
-We explore every question in detail below.
-
## Re-using Queries
The first step is to make sure your query re-uses as many existing indexes as
@@ -59,10 +56,8 @@ unindexed. In reality the query may perform just fine given the index on
`user_id` can filter out enough rows.
The best way to determine if indexes are re-used is to run your query using
-`EXPLAIN ANALYZE`. Depending on any extra tables that may be joined and
-other columns being used for filtering you may find an extra index is not going
-to make much (if any) difference. On the other hand you may determine that the
-index _may_ make a difference.
+`EXPLAIN ANALYZE`. Depending on the joined tables and the columns being used for filtering,
+you may find an extra index doesn't make much, if any, difference.
In short:
@@ -73,28 +68,24 @@ In short:
## Data Size
-A database may decide not to use an index despite it existing in case a regular
-sequence scan (= simply iterating over all existing rows) is faster. This is
-especially the case for small tables.
+A database may not use an index even when a regular sequence scan
+(iterating over all rows) is faster, especially for small tables.
-If a table is expected to grow in size and you expect your query has to filter
-out a lot of rows you may want to consider adding an index. If the table size is
-very small (for example, fewer than `1,000` records) or any existing indexes filter out
-enough rows you may _not_ want to add a new index.
+Consider adding an index if a table is expected to grow, and your query has to filter a lot of rows.
+You may _not_ want to add an index if the table size is small (<`1,000` records),
+or if existing indexes already filter out enough rows.
## Maintenance Overhead
-Indexes have to be updated on every table write. In case of PostgreSQL _all_
+Indexes have to be updated on every table write. In the case of PostgreSQL, _all_
existing indexes are updated whenever data is written to a table. As a
-result of this having many indexes on the same table slows down writes.
-
-Because of this one should ask themselves: is the reduction in query performance
-worth the overhead of maintaining an extra index?
+result, having many indexes on the same table slows down writes. It's therefore important
+to balance query performance with the overhead of maintaining an extra index.
-If adding an index reduces SELECT timings by 5 milliseconds but increases
-INSERT/UPDATE/DELETE timings by 10 milliseconds then the index may not be worth
-it. On the other hand, if SELECT timings are reduced but INSERT/UPDATE/DELETE
-timings are not affected you may want to add the index after all.
+Let's say that adding an index reduces SELECT timings by 5 milliseconds but increases
+INSERT/UPDATE/DELETE timings by 10 milliseconds. In this case, the new index may not be worth
+it. A new index is more valuable when SELECT timings are reduced and INSERT/UPDATE/DELETE
+timings are unaffected.
## Finding Unused Indexes
@@ -111,26 +102,25 @@ ORDER BY pg_relation_size(indexrelname::regclass) desc;
```
This query outputs a list containing all indexes that are never used and sorts
-them by indexes sizes in descending order. This query can be useful to
-determine if any previously indexes are useful after all. More information on
+them by indexes sizes in descending order. This query helps in
+determining whether existing indexes are still required. More information on
the meaning of the various columns can be found at
<https://www.postgresql.org/docs/current/monitoring-stats.html>.
-Because the output of this query relies on the actual usage of your database it
-may be affected by factors such as (but not limited to):
+Because the query output relies on the actual usage of your database, it
+may be affected by factors such as:
- Certain queries never being executed, thus not being able to use certain
indexes.
- Certain tables having little data, resulting in PostgreSQL using sequence
scans instead of index scans.
-In other words, this data is only reliable for a frequently used database with
-plenty of data and with as many GitLab features enabled (and being used) as
-possible.
+This data is only reliable for a frequently used database with
+plenty of data, and using as many GitLab features as possible.
## Requirements for naming indexes
-Indexes with complex definitions need to be explicitly named rather than
+Indexes with complex definitions must be explicitly named rather than
relying on the implicit naming behavior of migration methods. In short,
that means you **must** provide an explicit name argument for an index
created with one or more of the following options:
@@ -172,7 +162,7 @@ end
Creation of the second index would fail, because Rails would generate
the same name for both indexes.
-This is further complicated by the behavior of the `index_exists?` method.
+This naming issue is further complicated by the behavior of the `index_exists?` method.
It considers only the table name, column names, and uniqueness specification
of the index when making a comparison. Consider:
@@ -188,7 +178,7 @@ The call to `index_exists?` returns true if **any** index exists on
`:my_table` and `:my_column`, and index creation is bypassed.
The `add_concurrent_index` helper is a requirement for creating indexes
-on populated tables. Since it cannot be used inside a transactional
+on populated tables. Because it cannot be used inside a transactional
migration, it has a built-in check that detects if the index already
exists. In the event a match is found, index creation is skipped.
Without an explicit name argument, Rails can return a false positive
@@ -201,14 +191,15 @@ chance of error is greatly reduced.
There may be times when an index is only needed temporarily.
For example, in a migration, a column of a table might be conditionally
-updated. To query which columns need to be updated within the
-[query performance guidelines](query_performance.md), an index is needed that would otherwise
-not be used.
+updated. To query which columns must be updated in the
+[query performance guidelines](query_performance.md), an index is needed
+that would otherwise not be used.
-In these cases, a temporary index should be considered. To specify a
+In these cases, consider a temporary index. To specify a
temporary index:
-1. Prefix the index name with `tmp_` and follow the [naming conventions](database/constraint_naming_convention.md) and [requirements for naming indexes](#requirements-for-naming-indexes) for the rest of the name.
+1. Prefix the index name with `tmp_` and follow the [naming conventions](database/constraint_naming_convention.md)
+ and [requirements for naming indexes](#requirements-for-naming-indexes) for the rest of the name.
1. Create a follow-up issue to remove the index in the next (or future) milestone.
1. Add a comment in the migration mentioning the removal issue.
@@ -237,10 +228,10 @@ on GitLab.com, the deployment process is blocked waiting for index
creation to finish.
To limit impact on GitLab.com, a process exists to create indexes
-asynchronously during weekend hours. Due to generally lower levels of
-traffic and lack of regular deployments, this process allows the
-creation of indexes to proceed with a lower level of risk. The below
-sections describe the steps required to use these features:
+asynchronously during weekend hours. Due to generally lower traffic and fewer deployments,
+index creation can proceed at a lower level of risk.
+
+### Schedule index creation for a low-impact time
1. [Schedule the index to be created](#schedule-the-index-to-be-created).
1. [Verify the MR was deployed and the index exists in production](#verify-the-mr-was-deployed-and-the-index-exists-in-production).
@@ -291,12 +282,10 @@ migration as expected for other installations. The below block
demonstrates how to create the second migration for the previous
asynchronous example.
-WARNING:
-The responsibility lies on the individual writing the migrations to verify
-the index exists in production before merging a second migration that
-adds the index using `add_concurrent_index`. If the second migration is
-deployed and the index has not yet been created, the index is created
-synchronously when the second migration executes.
+**WARNING:**
+Verify that the index exists in production before merging a second migration with `add_concurrent_index`.
+If the second migration is deployed before the index has been created,
+the index is created synchronously when the second migration executes.
```ruby
# in db/post_migrate/
diff --git a/doc/development/cicd/pipeline_wizard.md b/doc/development/cicd/pipeline_wizard.md
index 608c21778c0..26b95dd3630 100644
--- a/doc/development/cicd/pipeline_wizard.md
+++ b/doc/development/cicd/pipeline_wizard.md
@@ -227,3 +227,21 @@ Use as `widget: list`. This inserts a `list` in the YAML file.
| `invalidFeedback` | **{dotted-circle}** No | string | Help text displayed when the pattern validation fails. |
| `default` | **{dotted-circle}** No | list | The default value for the list |
| `id` | **{dotted-circle}** No | string | The input field ID is usually autogenerated but can be overridden by providing this property. |
+
+#### Checklist
+
+Use as `widget: checklist`. This inserts a list of checkboxes that need to
+be checked before proceeding to the next step.
+
+| Name | Required | Type | Description |
+|---------|------------------------|--------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `title` | **{dotted-circle}** No | string | A title above the checklist items. |
+| `items` | **{dotted-circle}** No | list | A list of items that need to be checked. Each item corresponds to one checkbox, and can be a string or [checklist item](#checklist-item). |
+
+##### Checklist Item
+
+| Name | Required | Type | Description |
+|--------|------------------------|---------|-----------------------------------------|
+| `text` | **{check-circle}** Yes | string | A title above the checklist items. |
+| `help` | **{dotted-circle}** No | string | Help text explaining the item. |
+| `id` | **{dotted-circle}** No | string | The input field ID is usually autogenerated but can be overridden by providing this property. |
diff --git a/doc/development/database/add_foreign_key_to_existing_column.md b/doc/development/database/add_foreign_key_to_existing_column.md
index 9842814816f..7a18da2223f 100644
--- a/doc/development/database/add_foreign_key_to_existing_column.md
+++ b/doc/development/database/add_foreign_key_to_existing_column.md
@@ -64,18 +64,14 @@ emails = Email.where(user_id: 1) # returns emails for the deleted user
Add a `NOT VALID` foreign key constraint to the table, which enforces consistency on the record changes.
-[Using the `with_lock_retries` helper method is advised when performing operations on high-traffic tables](../migration_style_guide.md#when-to-use-the-helper-method),
-in this case, if the table or the foreign table is a high-traffic table, we should use the helper method.
-
In the example above, you'd be still able to update records in the `emails` table. However, when you'd try to update the `user_id` with non-existent value, the constraint causes a database error.
Migration file for adding `NOT VALID` foreign key:
```ruby
-class AddNotValidForeignKeyToEmailsUser < Gitlab::Database::Migration[1.0]
+class AddNotValidForeignKeyToEmailsUser < Gitlab::Database::Migration[2.0]
def up
- # safe to use: it requires short lock on the table since we don't validate the foreign key
- add_foreign_key :emails, :users, on_delete: :cascade, validate: false
+ add_concurrent_foreign_key :emails, :users, on_delete: :cascade, validate: false
end
def down
@@ -84,8 +80,14 @@ class AddNotValidForeignKeyToEmailsUser < Gitlab::Database::Migration[1.0]
end
```
+Adding a foreign key without validating it is a fast operation. It only requires a
+short lock on the table before being able to enforce the constraint on new data.
+We do still want to enable lock retries for high traffic and large tables.
+`add_concurrent_foreign_key` does this for us, and also checks if the foreign key already exists.
+
WARNING:
-Avoid using the `add_foreign_key` constraint more than once per migration file, unless the source and target tables are identical.
+Avoid using `add_foreign_key` or `add_concurrent_foreign_key` constraints more than
+once per migration file, unless the source and target tables are identical.
#### Data migration to fix existing records
@@ -98,7 +100,7 @@ In case the data volume is higher (>1000 records), it's better to create a backg
Example for cleaning up records in the `emails` table in a database migration:
```ruby
-class RemoveRecordsWithoutUserFromEmailsTable < Gitlab::Database::Migration[1.0]
+class RemoveRecordsWithoutUserFromEmailsTable < Gitlab::Database::Migration[2.0]
disable_ddl_transaction!
class Email < ActiveRecord::Base
@@ -121,6 +123,7 @@ end
### Validate the foreign key
Validating the foreign key scans the whole table and makes sure that each relation is correct.
+Fortunately, this does not lock the source table (`users`) while running.
NOTE:
When using [background migrations](background_migrations.md), foreign key validation should happen in the next GitLab release.
@@ -130,7 +133,7 @@ Migration file for validating the foreign key:
```ruby
# frozen_string_literal: true
-class ValidateForeignKeyOnEmailUsers < Gitlab::Database::Migration[1.0]
+class ValidateForeignKeyOnEmailUsers < Gitlab::Database::Migration[2.0]
def up
validate_foreign_key :emails, :user_id
end
diff --git a/doc/development/foreign_keys.md b/doc/development/foreign_keys.md
index 77df6fbfb0d..e0dd0fe8e7c 100644
--- a/doc/development/foreign_keys.md
+++ b/doc/development/foreign_keys.md
@@ -28,9 +28,80 @@ Guide](migration_style_guide.md) for more information.
Keep in mind that you can only safely add foreign keys to existing tables after
you have removed any orphaned rows. The method `add_concurrent_foreign_key`
-does not take care of this so you need to do so manually. See
+does not take care of this so you must do so manually. See
[adding foreign key constraint to an existing column](database/add_foreign_key_to_existing_column.md).
+## Updating Foreign Keys In Migrations
+
+Sometimes a foreign key constraint must be changed, preserving the column
+but updating the constraint condition. For example, moving from
+`ON DELETE CASCADE` to `ON DELETE SET NULL` or vice-versa.
+
+PostgreSQL does not prevent you from adding overlapping foreign keys. It
+honors the most recently added constraint. This allows us to replace foreign keys without
+ever losing foreign key protection on a column.
+
+To replace a foreign key:
+
+1. [Add the new foreign key without validation](database/add_foreign_key_to_existing_column.md#prevent-invalid-records)
+
+ The name of the foreign key constraint must be changed to add a new
+ foreign key before removing the old one.
+
+ ```ruby
+ class ReplaceFkOnPackagesPackagesProjectId < Gitlab::Database::Migration[2.0]
+ disable_ddl_transaction!
+
+ NEW_CONSTRAINT_NAME = 'fk_new'
+
+ def up
+ add_concurrent_foreign_key(:packages_packages, :projects, column: :project_id, on_delete: :nullify, validate: false, name: NEW_CONSTRAINT_NAME)
+ end
+
+ def down
+ with_lock_retries do
+ remove_foreign_key_if_exists(:packages_packages, column: :project_id, on_delete: :nullify, name: NEW_CONSTRAINT_NAME)
+ end
+ end
+ end
+ ```
+
+1. [Validate the new foreign key](database/add_foreign_key_to_existing_column.md#validate-the-foreign-key)
+
+ ```ruby
+ class ValidateFkNew < Gitlab::Database::Migration[2.0]
+ NEW_CONSTRAINT_NAME = 'fk_new'
+
+ # foreign key added in <link to MR or path to migration adding new FK>
+ def up
+ validate_foreign_key(:packages_packages, name: NEW_CONSTRAINT_NAME)
+ end
+
+ def down
+ # no-op
+ end
+ end
+ ```
+
+1. Remove the old foreign key:
+
+ ```ruby
+ class RemoveFkOld < Gitlab::Database::Migration[2.0]
+ OLD_CONSTRAINT_NAME = 'fk_old'
+
+ # new foreign key added in <link to MR or path to migration adding new FK>
+ # and validated in <link to MR or path to migration validating new FK>
+ def up
+ remove_foreign_key_if_exists(:packages_packages, column: :project_id, on_delete: :cascade, name: OLD_CONSTRAINT_NAME)
+ end
+
+ def down
+ # Validation is skipped here, so if rolled back, this will need to be revalidated in a separate migration
+ add_concurrent_foreign_key(:packages_packages, :projects, column: :project_id, on_delete: :cascade, validate: false, name: OLD_CONSTRAINT_NAME)
+ end
+ end
+ ```
+
## Cascading Deletes
Every foreign key must define an `ON DELETE` clause, and in 99% of the cases
diff --git a/doc/development/gitlab_flavored_markdown/specification_guide/index.md b/doc/development/gitlab_flavored_markdown/specification_guide/index.md
index cedf44cf1fc..94bad86080b 100644
--- a/doc/development/gitlab_flavored_markdown/specification_guide/index.md
+++ b/doc/development/gitlab_flavored_markdown/specification_guide/index.md
@@ -69,11 +69,11 @@ serve as input to automated conformance tests. It is
> This document attempts to specify Markdown syntax unambiguously. It contains many
> examples with side-by-side Markdown and HTML. These examples are intended to double as conformance tests.
-The HTML-rendered versions of the specifications:
+Here are the HTML-rendered versions of the specifications:
- [GitLab Flavored Markdown (GLFM) specification](https://gitlab.com/gitlab-org/gitlab/-/blob/master/glfm_specification/output/spec.html), which extends the:
-- [GitHub Flavored Markdown (GFM) specification](https://github.github.com/gfm/), which extends the:
-- [CommonMark specification](https://spec.commonmark.org/0.30/)
+- [GitHub Flavored Markdown (GFM) specification](https://github.github.com/gfm/) (rendered from the [source `spec.txt` for GFM specification](https://github.com/github/cmark-gfm/blob/master/test/spec.txt)), which extends the:
+- [CommonMark specification](https://spec.commonmark.org/0.30/) (rendered from the [source `spec.txt` for CommonMark specification](https://github.com/commonmark/commonmark-spec/blob/master/spec.txt))
NOTE:
The creation of the
diff --git a/doc/development/import_project.md b/doc/development/import_project.md
index e910983997c..aa767cd5b4c 100644
--- a/doc/development/import_project.md
+++ b/doc/development/import_project.md
@@ -133,6 +133,41 @@ During import, the tarball is cached in your configured `shared_path` directory.
disk has enough free space to accommodate both the cached tarball and the unpacked
project files on disk.
+##### Import is successful, but with a `Total number of not imported relations: XX` message, and issues are not created during the import
+
+If you receive a `Total number of not imported relations: XX` message, and issues
+aren't created during the import, check [exceptions_json.log](../administration/logs.md#exceptions_jsonlog).
+You might see an error like `N is out of range for ActiveModel::Type::Integer with limit 4 bytes`,
+where `N` is the integer exceeding the 4-byte integer limit. If that's the case, you
+are likely hitting the issue with rebalancing of `relative_position` field of the issues.
+
+The feature flag to enable the rebalance automatically was enabled on GitLab.com.
+We intend to enable it by default on self-managed instances when the issue
+[Rebalance issues FF rollout](https://gitlab.com/gitlab-org/gitlab/-/issues/343368)
+is implemented.
+
+If the feature is not enabled by default on your GitLab version, run the following
+commands in the [Rails console](../administration/operations/rails_console.md) as
+a workaround. Replace the ID with the ID of your project you were trying to import:
+
+```ruby
+# Check if the feature is enabled on your instance. If it is, rebalance should work automatically on your instance
+Feature.enabled?(:rebalance_issues,Project.find(ID).root_namespace)
+
+# Check the current maximum value of relative_position
+Issue.where(project_id: Project.find(ID).root_namespace.all_projects).maximum(:relative_position)
+
+# Enable `rebalance_issues` feauture and check that it was successfully enabled
+Feature.enable(:rebalance_issues,Project.find(ID).root_namespace)
+Feature.enabled?(:rebalance_issues,Project.find(ID).root_namespace)
+
+# Run the rebalancing process and check if the maximum value of relative_position has changed
+Issues::RelativePositionRebalancingService.new(Project.find(ID).root_namespace.all_projects).execute
+Issue.where(project_id: Project.find(ID).root_namespace.all_projects).maximum(:relative_position)
+```
+
+Repeat the import attempt after that and check if the issues are imported successfully.
+
### Importing via the Rails console
The last option is to import a project using a Rails console: