Add latest changes from gitlab-org/gitlab@master

author: GitLab Bot <gitlab-bot@gitlab.com> 2021-10-07 12:12:01 +0300
committer: GitLab Bot <gitlab-bot@gitlab.com> 2021-10-07 12:12:01 +0300
commit: bc935f05bc8d7dd89c3e7c88f90264e90b636e07 (patch)
tree: 8f2085390922fdf604e6bee88b4d4e08000fe154 /doc/development/database
parent: c2f9cac32e8141a9cd909ee654580d7472c531a0 (diff)
1 files changed, 32 insertions, 1 deletions
diff --git a/doc/development/database/efficient_in_operator_queries.md b/doc/development/database/efficient_in_operator_queries.md
index bc72bce30bf..0e979534acd 100644
--- a/doc/development/database/efficient_in_operator_queries.md
+++ b/doc/development/database/efficient_in_operator_queries.md
@@ -226,7 +226,12 @@ Gitlab::Pagination::Keyset::InOperatorOptimization::QueryBuilder.new(
 - `finder_query` loads the actual record row from the database. It must also be a lambda, where
   the order by column expressions is available for locating the record. In this example, the
   yielded values are `created_at` and `id` SQL expressions. Finding a record is very fast via the
-  primary key, so we don't use the `created_at` value.
+  primary key, so we don't use the `created_at` value. Providing the `finder_query` lambda is optional.
+  If it's not given, the IN operator optimization will only make the ORDER BY columns available to
+  the end-user and not the full database row.
+
+  If it's not given, the IN operator optimization will only make the ORDER BY columns available to
+  the end-user and not the full database row.
 
 The following database index on the `issues` table must be present
 to make the query execute efficiently:
@@ -611,6 +616,32 @@ Gitlab::Pagination::Keyset::Iterator.new(scope: scope, **opts).each_batch(of: 10
 end
 ```
 
+NOTE:
+The query loads complete database rows from the disk. This may cause increased I/O and slower
+database queries. Depending on the use case, the batch query often needs only the primary key
+to invoke additional statements, for example `UPDATE` or `DELETE`. The `id` column is
+included in the `ORDER BY` columns (`created_at` and `id`) and is already
+loaded. In this case, you can omit the `finder_query` parameter.
+
+Example for loading the `ORDER BY` columns only:
+
+```ruby
+scope = Issue.order(:created_at, :id)
+array_scope = Group.find(9970).all_projects.select(:id)
+array_mapping_scope = -> (id_expression) { Issue.where(Issue.arel_table[:project_id].eq(id_expression)) }
+
+opts = {
+  in_operator_optimization_options: {
+    array_scope: array_scope,
+    array_mapping_scope: array_mapping_scope
+  }
+}
+
+Gitlab::Pagination::Keyset::Iterator.new(scope: scope, **opts).each_batch(of: 100) do |records|
+  puts records.select(:id).map { |r| [r.id] } # only id and created_at are available
+end
+```
+
 #### Keyset pagination
 
 The optimization works out of the box with GraphQL and the `keyset_paginate` helper method.
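
The NOTE added in this patch says that a batch job often needs only the `ORDER BY` columns (`created_at` and `id`) and not the full row. The idea can be sketched in plain Ruby without any GitLab classes. Everything below is a hypothetical stand-in: `Row`, `ROWS`, and `each_keyset_batch` are illustrative names, and the in-memory array only mimics a table. This is a minimal sketch of keyset-ordered batching that yields key tuples only, not the actual `Gitlab::Pagination::Keyset::Iterator` implementation.

```ruby
# Hypothetical in-memory stand-in for the `issues` table (not GitLab code).
Row = Struct.new(:id, :created_at, :title, keyword_init: true)

ROWS = [
  Row.new(id: 3, created_at: 100, title: 'c'),
  Row.new(id: 1, created_at: 200, title: 'a'),
  Row.new(id: 2, created_at: 200, title: 'b'),
  Row.new(id: 5, created_at: 150, title: 'e'),
  Row.new(id: 4, created_at: 300, title: 'd')
].freeze

# Yields batches of [created_at, id] pairs in keyset order.
# The full row (for example `title`) is never handed to the caller,
# which mirrors omitting `finder_query`.
def each_keyset_batch(rows, of:)
  cursor = nil # last [created_at, id] pair seen so far

  loop do
    batch = rows
      .map { |row| [row.created_at, row.id] }                    # ORDER BY columns only
      .select { |key| cursor.nil? || (key <=> cursor).positive? } # rows after the cursor
      .sort                                                       # created_at, then id
      .first(of)

    break if batch.empty?

    yield batch
    cursor = batch.last
  end
end

batches = []
ids = []

each_keyset_batch(ROWS, of: 2) do |keys|
  batches << keys
  # The id is already part of the key, so a follow-up statement such as
  # a DELETE or UPDATE on these ids needs no extra row lookup.
  ids.concat(keys.map { |(_created_at, id)| id })
end
```

The cursor comparison uses Ruby's array ordering, which compares `created_at` first and falls back to `id` on ties. This corresponds to the row-wise comparison a keyset query performs on the same two columns.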