Add latest changes from gitlab-org/gitlab@14-7-stable-eev14.7.0-rc42

author: GitLab Bot <gitlab-bot@gitlab.com> 2022-01-20 12:16:11 +0300
committer: GitLab Bot <gitlab-bot@gitlab.com> 2022-01-20 12:16:11 +0300
commit: edaa33dee2ff2f7ea3fac488d41558eb5f86d68c (patch)
tree: 11f143effbfeba52329fb7afbd05e6e2a3790241 /doc/development/database
parent: d8a5691316400a0f7ec4f83832698f1988eb27c1 (diff)
2 files changed, 106 insertions, 8 deletions
diff --git a/doc/development/database/efficient_in_operator_queries.md b/doc/development/database/efficient_in_operator_queries.md
index 1e706890f64..76518a6f5e2 100644
--- a/doc/development/database/efficient_in_operator_queries.md
+++ b/doc/development/database/efficient_in_operator_queries.md
@@ -589,6 +589,87 @@ LIMIT 20
 NOTE:
 To make the query efficient, the following columns need to be covered with an index: `project_id`, `issue_type`, `created_at`, and `id`.
 
+#### Using calculated ORDER BY expression
+
+The following example orders epic records by the duration between the creation time and closed
+time. It is calculated with the following formula:
+
+```sql
+SELECT EXTRACT('epoch' FROM epics.closed_at - epics.created_at) FROM epics
+```
+
+The query above returns the duration in seconds (`double precision`) between the two timestamp
+columns in seconds. To order the records by this expression, you must reference it
+in the `ORDER BY` clause:
+
+```sql
+SELECT EXTRACT('epoch' FROM epics.closed_at - epics.created_at)
+FROM epics
+ORDER BY EXTRACT('epoch' FROM epics.closed_at - epics.created_at) DESC
+```
+
+To make this ordering efficient on the group-level with the in-operator optimization, use a
+custom `ORDER BY` configuration. Since the duration is not a distinct value (no unique index
+present), you must add a tie-breaker column (`id`).
+
+The following example shows the final `ORDER BY` clause:
+
+```sql
+ORDER BY extract('epoch' FROM epics.closed_at - epics.created_at) DESC, epics.id DESC
+```
+
+Snippet for loading records ordered by the calcualted duration:
+
+```ruby
+arel_table =  Epic.arel_table
+order = Gitlab::Pagination::Keyset::Order.build([
+  Gitlab::Pagination::Keyset::ColumnOrderDefinition.new(
+    attribute_name: 'duration_in_seconds',
+    order_expression: Arel.sql('EXTRACT(EPOCH FROM epics.closed_at - epics.created_at)').desc,
+    distinct: false,
+    sql_type: 'double precision' # important for calculated SQL expressions
+  ),
+  Gitlab::Pagination::Keyset::ColumnOrderDefinition.new(
+    attribute_name: 'id',
+    order_expression: arel_table[:id].desc
+  )
+])
+
+records = Gitlab::Pagination::Keyset::InOperatorOptimization::QueryBuilder.new(
+  scope: Epic.where.not(closed_at: nil).reorder(order), # filter out NULL values
+  array_scope: Group.find(9970).self_and_descendants.select(:id),
+  array_mapping_scope: -> (id_expression) { Epic.where(Epic.arel_table[:group_id].eq(id_expression)) }
+).execute.limit(20)
+
+puts records.pluck(:duration_in_seconds, :id) # other columnns are not available
+```
+
+Building the query requires quite a bit of configuration. For the order configuration you
+can find more information within the
+[complex order configuration](keyset_pagination.md#complex-order-configuration)
+section for keyset paginated database queries.
+
+The query requires a specialized database index:
+
+```sql
+CREATE INDEX index_epics_on_duration ON epics USING btree (group_id, EXTRACT(EPOCH FROM epics.closed_at - epics.created_at) DESC, id DESC) WHERE (closed_at IS NOT NULL);
+```
+
+Notice that the `finder_query` parameter is not used. The query only returns the `ORDER BY` columns
+which are the `duration_in_seconds` (calculated column) and the `id` columns. This is a limitation
+of the feature, defining the `finder_query` with calculated `ORDER BY` expressions is not supported.
+To get the complete database records, an extra query can be invoked by the returned `id` column:
+
+```ruby
+records_by_id = records.index_by(&:id)
+complete_records = Epic.where(id: records_by_id.keys).index_by(&:id)
+
+# Printing the complete records according to the `ORDER BY` clause
+records_by_id.each do |id, _|
+  puts complete_records[id].attributes
+end
+```
+
 #### Batch iteration
 
 Batch iteration over the records is possible via the keyset `Iterator` class.
diff --git a/doc/development/database/loose_foreign_keys.md b/doc/development/database/loose_foreign_keys.md
index c60989f225d..97d150b1a18 100644
--- a/doc/development/database/loose_foreign_keys.md
+++ b/doc/development/database/loose_foreign_keys.md
@@ -43,7 +43,7 @@ we can:
 
 1. Create a `DELETE` trigger on the `projects` table.
    Record the deletions in a separate table (`deleted_records`).
-1. A job checks the `deleted_records` table every 5 minutes.
+1. A job checks the `deleted_records` table every minute or two.
 1. For each record in the table, delete the associated `ci_pipelines` records
    using the `project_id` column.
 
@@ -89,9 +89,10 @@ ci_pipelines:
 
 ### Track record changes
 
-To know about deletions in the `projects` table, configure a `DELETE` trigger using a database
-migration (post-migration). The trigger needs to be configured only once. If the model already has
-at least one `loose_foreign_key` definition, then this step can be skipped:
+To know about deletions in the `projects` table, configure a `DELETE` trigger
+using a [post-deployment migration](../post_deployment_migrations.md). The
+trigger needs to be configured only once. If the model already has at least one
+`loose_foreign_key` definition, then this step can be skipped:
 
 ```ruby
 class TrackProjectRecordChanges < Gitlab::Database::Migration[1.0]
@@ -122,15 +123,20 @@ REFERENCES projects(id)
 ON DELETE CASCADE;
 ```
 
-The migration should run after the `DELETE` trigger is installed. If the foreign key is deleted
-earlier, there is a good chance of introducing data inconsistency which needs manual cleanup:
+The migration must run after the `DELETE` trigger is installed and the loose
+foreign key definition is deployed. As such, it must be a [post-deployment
+migration](../post_deployment_migrations.md) dated after the migration for the
+trigger. If the foreign key is deleted earlier, there is a good chance of
+introducing data inconsistency which needs manual cleanup:
 
 ```ruby
 class RemoveProjectsCiPipelineFk < Gitlab::Database::Migration[1.0]
-  enable_lock_retries!
+  disable_ddl_transaction!
 
   def up
-    remove_foreign_key_if_exists(:ci_pipelines, :projects, name: "fk_86635dbd80")
+    with_lock_retries do
+      remove_foreign_key_if_exists(:ci_pipelines, :projects, name: "fk_86635dbd80")
+    end
   end
 
   def down
@@ -155,6 +161,17 @@ it_behaves_like 'it has loose foreign keys' do
 end
 ```
 
+**After** [removing a foreign key](#remove-the-foreign-key),
+use the "`cleanup by a loose foreign key`" shared example to test a child record's deletion or nullification
+via the added loose foreign key:
+
+```ruby
+it_behaves_like 'cleanup by a loose foreign key' do
+  let!(:model) { create(:ci_pipeline, user: create(:user)) }
+  let!(:parent) { model.user }
+end
+```
+
 ## Caveats of loose foreign keys
 
 ### Record creation
author	GitLab Bot <gitlab-bot@gitlab.com>	2022-01-20 12:16:11 +0300
committer	GitLab Bot <gitlab-bot@gitlab.com>	2022-01-20 12:16:11 +0300
commit	edaa33dee2ff2f7ea3fac488d41558eb5f86d68c (patch)
tree	11f143effbfeba52329fb7afbd05e6e2a3790241 /doc/development/database
parent	d8a5691316400a0f7ec4f83832698f1988eb27c1 (diff)