Add latest changes from gitlab-org/gitlab@master

author: GitLab Bot <gitlab-bot@gitlab.com> 2022-06-21 03:08:43 +0300
committer: GitLab Bot <gitlab-bot@gitlab.com> 2022-06-21 03:08:43 +0300
commit: 92ea86691a2a6b3df4b36c7ff00001410303a701 (patch)
tree: 9b3764b56303c9b65e17007c589a297775834b28 /doc/development
parent: 991c66333dc7bdb0fd6f7a0b7f7bdf8383285975 (diff)
6 files changed, 188 insertions, 72 deletions
diff --git a/doc/development/adding_database_indexes.md b/doc/development/adding_database_indexes.md
index f524b04c6eb..050153963a9 100644
--- a/doc/development/adding_database_indexes.md
+++ b/doc/development/adding_database_indexes.md
@@ -20,27 +20,24 @@ WHERE user_id = 2;
 Here we are filtering by the `user_id` column and as such a developer may decide
 to index this column.
 
-While in certain cases indexing columns using the above approach may make sense
-it can actually have a negative impact. Whenever you write data to a table any
-existing indexes need to be updated. The more indexes there are the slower this
-can potentially become. Indexes can also take up quite some disk space depending
+While in certain cases indexing columns using the above approach may make sense,
+it can actually have a negative impact. Whenever you write data to a table, any
+existing indexes must also be updated. The more indexes there are, the slower this
+can potentially become. Indexes can also take up significant disk space, depending
 on the amount of data indexed and the index type. For example, PostgreSQL offers
-"GIN" indexes which can be used to index certain data types that can not be
-indexed by regular B-tree indexes. These indexes however generally take up more
+`GIN` indexes which can be used to index certain data types that cannot be
+indexed by regular B-tree indexes. These indexes, however, generally take up more
 space and are slower to update compared to B-tree indexes.
 
-Because of all this one should not blindly add a new index for every column used
-to filter data by. Instead one should ask themselves the following questions:
+Because of all this, it's important to make the following considerations
+when adding a new index:
 
-1. Can you write your query in such a way that it re-uses as many existing indexes
-   as possible?
-1. Is the data large enough that using an index is actually
-   faster than just iterating over the rows in the table?
+1. Do the new queries re-use as many existing indexes as possible?
+1. Is there enough data that using an index is faster than iterating over
+   rows in the table?
 1. Is the overhead of maintaining the index worth the reduction in query
    timings?
 
-We explore every question in detail below.
-
 ## Re-using Queries
 
 The first step is to make sure your query re-uses as many existing indexes as
@@ -59,10 +56,8 @@ unindexed. In reality the query may perform just fine given the index on
 `user_id` can filter out enough rows.
 
 The best way to determine if indexes are re-used is to run your query using
-`EXPLAIN ANALYZE`. Depending on any extra tables that may be joined and
-other columns being used for filtering you may find an extra index is not going
-to make much (if any) difference. On the other hand you may determine that the
-index _may_ make a difference.
+`EXPLAIN ANALYZE`. Depending on the joined tables and the columns being used for filtering,
+you may find an extra index doesn't make much, if any, difference.
 
 In short:
 
@@ -73,28 +68,24 @@ In short:
 
 ## Data Size
 
-A database may decide not to use an index despite it existing in case a regular
-sequence scan (= simply iterating over all existing rows) is faster. This is
-especially the case for small tables.
+A database may not use an index, even when one exists, if a regular sequence scan
+(iterating over all rows) is faster. This is especially the case for small tables.
 
-If a table is expected to grow in size and you expect your query has to filter
-out a lot of rows you may want to consider adding an index. If the table size is
-very small (for example, fewer than `1,000` records) or any existing indexes filter out
-enough rows you may _not_ want to add a new index.
+Consider adding an index if a table is expected to grow, and your query has to filter a lot of rows.
+You may _not_ want to add an index if the table size is small (<`1,000` records),
+or if existing indexes already filter out enough rows.
 
 ## Maintenance Overhead
 
-Indexes have to be updated on every table write. In case of PostgreSQL _all_
+Indexes have to be updated on every table write. In the case of PostgreSQL, _all_
 existing indexes are updated whenever data is written to a table. As a
-result of this having many indexes on the same table slows down writes.
-
-Because of this one should ask themselves: is the reduction in query performance
-worth the overhead of maintaining an extra index?
+result, having many indexes on the same table slows down writes. It's therefore important
+to balance query performance with the overhead of maintaining an extra index.
 
-If adding an index reduces SELECT timings by 5 milliseconds but increases
-INSERT/UPDATE/DELETE timings by 10 milliseconds then the index may not be worth
-it. On the other hand, if SELECT timings are reduced but INSERT/UPDATE/DELETE
-timings are not affected you may want to add the index after all.
+Let's say that adding an index reduces SELECT timings by 5 milliseconds but increases
+INSERT/UPDATE/DELETE timings by 10 milliseconds. In this case, the new index may not be worth
+it. A new index is more valuable when SELECT timings are reduced and INSERT/UPDATE/DELETE
+timings are unaffected.
 
 ## Finding Unused Indexes
 
@@ -111,26 +102,25 @@ ORDER BY pg_relation_size(indexrelname::regclass) desc;
 ```
 
 This query outputs a list containing all indexes that are never used and sorts
-them by indexes sizes in descending order. This query can be useful to
-determine if any previously indexes are useful after all. More information on
+them by index size in descending order. This query helps in
+determining whether existing indexes are still required. More information on
 the meaning of the various columns can be found at
 <https://www.postgresql.org/docs/current/monitoring-stats.html>.
 
-Because the output of this query relies on the actual usage of your database it
-may be affected by factors such as (but not limited to):
+Because the query output relies on the actual usage of your database, it
+may be affected by factors such as:
 
 - Certain queries never being executed, thus not being able to use certain
   indexes.
 - Certain tables having little data, resulting in PostgreSQL using sequence
   scans instead of index scans.
 
-In other words, this data is only reliable for a frequently used database with
-plenty of data and with as many GitLab features enabled (and being used) as
-possible.
+This data is only reliable for a frequently used database with
+plenty of data, and with as many GitLab features in use as possible.
 
 ## Requirements for naming indexes
 
-Indexes with complex definitions need to be explicitly named rather than
+Indexes with complex definitions must be explicitly named rather than
 relying on the implicit naming behavior of migration methods. In short,
 that means you **must** provide an explicit name argument for an index
 created with one or more of the following options:
@@ -172,7 +162,7 @@ end
 Creation of the second index would fail, because Rails would generate
 the same name for both indexes.
 
-This is further complicated by the behavior of the `index_exists?` method.
+This naming issue is further complicated by the behavior of the `index_exists?` method.
 It considers only the table name, column names, and uniqueness specification
 of the index when making a comparison. Consider:
 
@@ -188,7 +178,7 @@ The call to `index_exists?` returns true if **any** index exists on
 `:my_table` and `:my_column`, and index creation is bypassed.
 
 The `add_concurrent_index` helper is a requirement for creating indexes
-on populated tables. Since it cannot be used inside a transactional
+on populated tables. Because it cannot be used inside a transactional
 migration, it has a built-in check that detects if the index already
 exists. In the event a match is found, index creation is skipped.
 Without an explicit name argument, Rails can return a false positive
@@ -201,14 +191,15 @@ chance of error is greatly reduced.
 There may be times when an index is only needed temporarily.
 
 For example, in a migration, a column of a table might be conditionally
-updated. To query which columns need to be updated within the
-[query performance guidelines](query_performance.md), an index is needed that would otherwise
-not be used.
+updated. To query which columns must be updated within the
+[query performance guidelines](query_performance.md), an index is needed
+that would otherwise not be used.
 
-In these cases, a temporary index should be considered. To specify a
+In these cases, consider a temporary index. To specify a
 temporary index:
 
-1. Prefix the index name with `tmp_` and follow the [naming conventions](database/constraint_naming_convention.md) and [requirements for naming indexes](#requirements-for-naming-indexes) for the rest of the name.
+1. Prefix the index name with `tmp_` and follow the [naming conventions](database/constraint_naming_convention.md)
+   and [requirements for naming indexes](#requirements-for-naming-indexes) for the rest of the name.
 1. Create a follow-up issue to remove the index in the next (or future) milestone.
 1. Add a comment in the migration mentioning the removal issue.
 
@@ -237,10 +228,10 @@ on GitLab.com, the deployment process is blocked waiting for index
 creation to finish.
 
 To limit impact on GitLab.com, a process exists to create indexes
-asynchronously during weekend hours. Due to generally lower levels of
-traffic and lack of regular deployments, this process allows the
-creation of indexes to proceed with a lower level of risk. The below
-sections describe the steps required to use these features:
+asynchronously during weekend hours. Due to generally lower traffic and fewer deployments,
+index creation can proceed at a lower level of risk.
+
+### Schedule index creation for a low-impact time
 
 1. [Schedule the index to be created](#schedule-the-index-to-be-created).
 1. [Verify the MR was deployed and the index exists in production](#verify-the-mr-was-deployed-and-the-index-exists-in-production).
@@ -291,12 +282,10 @@ migration as expected for other installations. The below block
 demonstrates how to create the second migration for the previous
 asynchronous example.
 
-WARNING:
-The responsibility lies on the individual writing the migrations to verify
-the index exists in production before merging a second migration that
-adds the index using `add_concurrent_index`. If the second migration is
-deployed and the index has not yet been created, the index is created
-synchronously when the second migration executes.
+**WARNING:**
+Verify that the index exists in production before merging a second migration with `add_concurrent_index`.
+If the second migration is deployed before the index has been created,
+the index is created synchronously when the second migration executes.
 
 ```ruby
 # in db/post_migrate/
diff --git a/doc/development/cicd/pipeline_wizard.md b/doc/development/cicd/pipeline_wizard.md
index 608c21778c0..26b95dd3630 100644
--- a/doc/development/cicd/pipeline_wizard.md
+++ b/doc/development/cicd/pipeline_wizard.md
@@ -227,3 +227,21 @@ Use as `widget: list`. This inserts a `list` in the YAML file.
 | `invalidFeedback` | **{dotted-circle}** No | string  | Help text displayed when the pattern validation fails. |
 | `default`         | **{dotted-circle}** No | list    | The default value for the list |
 | `id`              | **{dotted-circle}** No | string  | The input field ID is usually autogenerated but can be overridden by providing this property. |
+
+#### Checklist
+
+Use as `widget: checklist`. This inserts a list of checkboxes that need to 
+be checked before proceeding to the next step.
+
+| Name    | Required               | Type   | Description                                                                                                                                                                        |
+|---------|------------------------|--------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `title` | **{dotted-circle}** No | string | A title above the checklist items.                                                                                                                                                  |
+| `items` | **{dotted-circle}** No | list   | A list of items that need to be checked. Each item corresponds to one checkbox, and can be a string or [checklist item](#checklist-item). |
+
+##### Checklist Item
+
+| Name   | Required               | Type    | Description                             |
+|--------|------------------------|---------|-----------------------------------------|
+| `text` | **{check-circle}** Yes | string  | The text displayed for the item.        |
+| `help` | **{dotted-circle}** No | string  | Help text explaining the item.          |
+| `id`   | **{dotted-circle}** No | string  | The input field ID is usually autogenerated but can be overridden by providing this property. |
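
The checklist properties in the tables above could be combined as in the following sketch. It assumes the entry sits in a step's list of inputs like the other widgets. The title, item text, and `id` value are hypothetical, and the rest of the template is omitted.

```yaml
# Hypothetical input entry; only properties from the tables above are used.
- widget: checklist
  title: Before you continue
  items:
    # An item can be a plain string...
    - You have access to the target project
    # ...or a checklist item with `text` plus optional `help` and `id`.
    - text: A runner is available for this project
      help: Jobs cannot start without an available runner.
      id: runner-available
```

Mixing both item forms in one list is only meant to show that each entry maps to a single checkbox.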
diff --git a/doc/development/database/add_foreign_key_to_existing_column.md b/doc/development/database/add_foreign_key_to_existing_column.md
index 9842814816f..7a18da2223f 100644
--- a/doc/development/database/add_foreign_key_to_existing_column.md
+++ b/doc/development/database/add_foreign_key_to_existing_column.md
@@ -64,18 +64,14 @@ emails = Email.where(user_id: 1) # returns emails for the deleted user
 
 Add a `NOT VALID` foreign key constraint to the table, which enforces consistency on the record changes.
 
-[Using the `with_lock_retries` helper method is advised when performing operations on high-traffic tables](../migration_style_guide.md#when-to-use-the-helper-method),
-in this case, if the table or the foreign table is a high-traffic table, we should use the helper method.
-
 In the example above, you'd still be able to update records in the `emails` table. However, when you try to update the `user_id` with a non-existent value, the constraint causes a database error.
 
 Migration file for adding `NOT VALID` foreign key:
 
 ```ruby
-class AddNotValidForeignKeyToEmailsUser < Gitlab::Database::Migration[1.0]
+class AddNotValidForeignKeyToEmailsUser < Gitlab::Database::Migration[2.0]
   def up
-    # safe to use: it requires short lock on the table since we don't validate the foreign key
-    add_foreign_key :emails, :users, on_delete: :cascade, validate: false
+    add_concurrent_foreign_key :emails, :users, on_delete: :cascade, validate: false
   end
 
   def down
@@ -84,8 +80,14 @@ class AddNotValidForeignKeyToEmailsUser < Gitlab::Database::Migration[1.0]
 end
 ```
 
+Adding a foreign key without validating it is a fast operation. It only requires a
+short lock on the table before being able to enforce the constraint on new data.
+We still want to enable lock retries for high-traffic and large tables.
+`add_concurrent_foreign_key` does this for us, and also checks if the foreign key already exists.
+
 WARNING:
-Avoid using the `add_foreign_key` constraint more than once per migration file, unless the source and target tables are identical.
+Avoid using `add_foreign_key` or `add_concurrent_foreign_key` constraints more than
+once per migration file, unless the source and target tables are identical.
 
 #### Data migration to fix existing records
 
@@ -98,7 +100,7 @@ In case the data volume is higher (>1000 records), it's better to create a backg
 Example for cleaning up records in the `emails` table in a database migration:
 
 ```ruby
-class RemoveRecordsWithoutUserFromEmailsTable < Gitlab::Database::Migration[1.0]
+class RemoveRecordsWithoutUserFromEmailsTable < Gitlab::Database::Migration[2.0]
   disable_ddl_transaction!
 
   class Email < ActiveRecord::Base
@@ -121,6 +123,7 @@ end
 ### Validate the foreign key
 
 Validating the foreign key scans the whole table and makes sure that each relation is correct.
+Fortunately, this does not lock the source table (`users`) while running.
 
 NOTE:
 When using [background migrations](background_migrations.md), foreign key validation should happen in the next GitLab release.
@@ -130,7 +133,7 @@ Migration file for validating the foreign key:
 ```ruby
 # frozen_string_literal: true
 
-class ValidateForeignKeyOnEmailUsers < Gitlab::Database::Migration[1.0]
+class ValidateForeignKeyOnEmailUsers < Gitlab::Database::Migration[2.0]
   def up
     validate_foreign_key :emails, :user_id
   end
diff --git a/doc/development/foreign_keys.md b/doc/development/foreign_keys.md
index 77df6fbfb0d..e0dd0fe8e7c 100644
--- a/doc/development/foreign_keys.md
+++ b/doc/development/foreign_keys.md
@@ -28,9 +28,80 @@ Guide](migration_style_guide.md) for more information.
 
 Keep in mind that you can only safely add foreign keys to existing tables after
 you have removed any orphaned rows. The method `add_concurrent_foreign_key`
-does not take care of this so you need to do so manually. See
+does not take care of this so you must do so manually. See
 [adding foreign key constraint to an existing column](database/add_foreign_key_to_existing_column.md).
 
+## Updating Foreign Keys In Migrations
+
+Sometimes a foreign key constraint must be changed, preserving the column
+but updating the constraint condition. For example, moving from
+`ON DELETE CASCADE` to `ON DELETE SET NULL` or vice versa.
+
+PostgreSQL does not prevent you from adding overlapping foreign keys. It
+honors the most recently added constraint. This allows us to replace foreign keys without
+ever losing foreign key protection on a column.
+
+To replace a foreign key:
+
+1. [Add the new foreign key without validation](database/add_foreign_key_to_existing_column.md#prevent-invalid-records)
+
+  The name of the foreign key constraint must be changed to add a new
+  foreign key before removing the old one.
+
+  ```ruby
+  class ReplaceFkOnPackagesPackagesProjectId < Gitlab::Database::Migration[2.0]
+    disable_ddl_transaction!
+
+    NEW_CONSTRAINT_NAME = 'fk_new'
+
+    def up
+      add_concurrent_foreign_key(:packages_packages, :projects, column: :project_id, on_delete: :nullify, validate: false, name: NEW_CONSTRAINT_NAME)
+    end
+
+    def down
+      with_lock_retries do
+        remove_foreign_key_if_exists(:packages_packages, column: :project_id, on_delete: :nullify, name: NEW_CONSTRAINT_NAME)
+      end
+    end
+  end
+  ```
+
+1. [Validate the new foreign key](database/add_foreign_key_to_existing_column.md#validate-the-foreign-key)
+
+  ```ruby
+  class ValidateFkNew < Gitlab::Database::Migration[2.0]
+    NEW_CONSTRAINT_NAME = 'fk_new'
+
+    # foreign key added in <link to MR or path to migration adding new FK>
+    def up
+      validate_foreign_key(:packages_packages, name: NEW_CONSTRAINT_NAME)
+    end
+
+    def down
+      # no-op
+    end
+  end
+  ```
+
+1. Remove the old foreign key:
+
+  ```ruby
+  class RemoveFkOld < Gitlab::Database::Migration[2.0]
+    OLD_CONSTRAINT_NAME = 'fk_old'
+
+    # new foreign key added in <link to MR or path to migration adding new FK>
+    # and validated in <link to MR or path to migration validating new FK>
+    def up
+      remove_foreign_key_if_exists(:packages_packages, column: :project_id, on_delete: :cascade, name: OLD_CONSTRAINT_NAME)
+    end
+
+    def down
+      # Validation is skipped here, so if rolled back, this will need to be revalidated in a separate migration
+      add_concurrent_foreign_key(:packages_packages, :projects, column: :project_id, on_delete: :cascade, validate: false, name: OLD_CONSTRAINT_NAME)
+    end
+  end
+  ```
+
 ## Cascading Deletes
 
 Every foreign key must define an `ON DELETE` clause, and in 99% of the cases
diff --git a/doc/development/gitlab_flavored_markdown/specification_guide/index.md b/doc/development/gitlab_flavored_markdown/specification_guide/index.md
index cedf44cf1fc..94bad86080b 100644
--- a/doc/development/gitlab_flavored_markdown/specification_guide/index.md
+++ b/doc/development/gitlab_flavored_markdown/specification_guide/index.md
@@ -69,11 +69,11 @@ serve as input to automated conformance tests. It is
 > This document attempts to specify Markdown syntax unambiguously. It contains many
 > examples with side-by-side Markdown and HTML. These examples are intended to double as conformance tests.
 
-The HTML-rendered versions of the specifications:
+Here are the HTML-rendered versions of the specifications:
 
 - [GitLab Flavored Markdown (GLFM) specification](https://gitlab.com/gitlab-org/gitlab/-/blob/master/glfm_specification/output/spec.html), which extends the:
-- [GitHub Flavored Markdown (GFM) specification](https://github.github.com/gfm/), which extends the:
-- [CommonMark specification](https://spec.commonmark.org/0.30/)
+- [GitHub Flavored Markdown (GFM) specification](https://github.github.com/gfm/) (rendered from the [source `spec.txt` for GFM specification](https://github.com/github/cmark-gfm/blob/master/test/spec.txt)), which extends the:
+- [CommonMark specification](https://spec.commonmark.org/0.30/) (rendered from the [source `spec.txt` for CommonMark specification](https://github.com/commonmark/commonmark-spec/blob/master/spec.txt))
 
 NOTE:
 The creation of the
diff --git a/doc/development/import_project.md b/doc/development/import_project.md
index e910983997c..aa767cd5b4c 100644
--- a/doc/development/import_project.md
+++ b/doc/development/import_project.md
@@ -133,6 +133,41 @@ During import, the tarball is cached in your configured `shared_path` directory.
 disk has enough free space to accommodate both the cached tarball and the unpacked
 project files on disk.
 
+##### Import is successful, but with a `Total number of not imported relations: XX` message, and issues are not created during the import
+
+If you receive a `Total number of not imported relations: XX` message, and issues
+aren't created during the import, check [exceptions_json.log](../administration/logs.md#exceptions_jsonlog).
+You might see an error like `N is out of range for ActiveModel::Type::Integer with limit 4 bytes`,
+where `N` is the integer exceeding the 4-byte integer limit. If that's the case, you
+are likely hitting the issue with rebalancing the `relative_position` field of issues.
+
+The feature flag to enable the rebalance automatically was enabled on GitLab.com.
+We intend to enable it by default on self-managed instances when the issue
+[Rebalance issues FF rollout](https://gitlab.com/gitlab-org/gitlab/-/issues/343368)
+is implemented.
+
+If the feature is not enabled by default on your GitLab version, run the following
+commands in the [Rails console](../administration/operations/rails_console.md) as
+a workaround. Replace `ID` with the ID of the project you were trying to import:
+
+```ruby
+# Check if the feature is enabled on your instance. If it is, rebalance should work automatically on your instance
+Feature.enabled?(:rebalance_issues, Project.find(ID).root_namespace)
+
+# Check the current maximum value of relative_position
+Issue.where(project_id: Project.find(ID).root_namespace.all_projects).maximum(:relative_position)
+
+# Enable the `rebalance_issues` feature and check that it was successfully enabled
+Feature.enable(:rebalance_issues, Project.find(ID).root_namespace)
+Feature.enabled?(:rebalance_issues, Project.find(ID).root_namespace)
+
+# Run the rebalancing process and check if the maximum value of relative_position has changed
+Issues::RelativePositionRebalancingService.new(Project.find(ID).root_namespace.all_projects).execute
+Issue.where(project_id: Project.find(ID).root_namespace.all_projects).maximum(:relative_position)
+```
+
+Repeat the import attempt after that and check if the issues are imported successfully.
+
 ### Importing via the Rails console
 
 The last option is to import a project using a Rails console: