Welcome to mirror list, hosted at ThFree Co, Russian Federation.

0003-connected-do-not-sort-input-revisions.patch « git-patches « _support - gitlab.com/gitlab-org/gitaly.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
blob: 6f4f9d477dcf1a61595072aadd51e996da1f6b93 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
From f45022dc2fd692fd024f2eb41a86a66f19013d43 Mon Sep 17 00:00:00 2001
Message-Id: <f45022dc2fd692fd024f2eb41a86a66f19013d43.1630319075.git.ps@pks.im>
In-Reply-To: <29ef1f27fed21b5b7d3c996a01f1364e7e841917.1630319075.git.ps@pks.im>
References: <29ef1f27fed21b5b7d3c996a01f1364e7e841917.1630319075.git.ps@pks.im>
From: Patrick Steinhardt <ps@pks.im>
Date: Mon, 9 Aug 2021 10:11:50 +0200
Subject: [PATCH 3/6] connected: do not sort input revisions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

In order to compute whether objects reachable from a set of tips are all
connected, we do a revision walk with these tips as positive references
and `--not --all`. `--not --all` will cause the revision walk to load
all preexisting references as uninteresting, which can be very expensive
in repositories with many references.

Benchmarking the git-rev-list(1) command highlights that by far the most
expensive single phase is initial sorting of the input revisions: after
all references have been loaded, we first sort commits by author date.
In a real-world repository with about 2.2 million references, it makes
up about 40% of the total runtime of git-rev-list(1).

Ultimately, the connectivity check shouldn't really bother about the
order of input revisions at all. We only care whether we can actually
walk all objects until we hit the cut-off point. So sorting the input is
a complete waste of time.

Introduce a new "--unsorted-input" flag to git-rev-list(1) which will
cause it to not sort the commits and adjust the connectivity check to
always pass the flag. This results in the following speedups, executed
in a clone of gitlab-org/gitlab [1]:

    Benchmark #1: git rev-list  --objects --quiet --not --all --not $(cat newrev)
      Time (mean ± σ):      7.639 s ±  0.065 s    [User: 7.304 s, System: 0.335 s]
      Range (min … max):    7.543 s …  7.742 s    10 runs

    Benchmark #2: git rev-list --unsorted-input --objects --quiet --not --all --not $newrev
      Time (mean ± σ):      4.995 s ±  0.044 s    [User: 4.657 s, System: 0.337 s]
      Range (min … max):    4.909 s …  5.048 s    10 runs

    Summary
      'git rev-list --unsorted-input --objects --quiet --not --all --not $(cat newrev)' ran
        1.53 ± 0.02 times faster than 'git rev-list  --objects --quiet --not --all --not $newrev'

[1]: https://gitlab.com/gitlab-org/gitlab.git. Note that not all refs
     are visible to clients.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/rev-list-options.txt |  8 +++++++-
 connected.c                        |  1 +
 revision.c                         |  9 +++++++++
 t/t6000-rev-list-misc.sh           | 31 ++++++++++++++++++++++++++++++
 4 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 24569b06d1..b7bd27e171 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -968,6 +968,11 @@ list of the missing objects.  Object IDs are prefixed with a ``?'' character.
 	objects.
 endif::git-rev-list[]
 
+--unsorted-input::
+	Show commits in the order they were given on the command line instead
+	of sorting them in reverse chronological order by commit time. Cannot
+	be combined with `--no-walk` or `--no-walk=sorted`.
+
 --no-walk[=(sorted|unsorted)]::
 	Only show the given commits, but do not traverse their ancestors.
 	This has no effect if a range is specified. If the argument
@@ -975,7 +980,8 @@ endif::git-rev-list[]
 	given on the command line. Otherwise (if `sorted` or no argument
 	was given), the commits are shown in reverse chronological order
 	by commit time.
-	Cannot be combined with `--graph`.
+	Cannot be combined with `--graph`. Cannot be combined with
+	`--unsorted-input` if `sorted` or no argument was given.
 
 --do-walk::
 	Overrides a previous `--no-walk`.
diff --git a/connected.c b/connected.c
index b18299fdf0..b5f9523a5f 100644
--- a/connected.c
+++ b/connected.c
@@ -106,6 +106,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
 	if (opt->progress)
 		strvec_pushf(&rev_list.args, "--progress=%s",
 			     _("Checking connectivity"));
+	strvec_push(&rev_list.args, "--unsorted-input");
 
 	rev_list.git_cmd = 1;
 	rev_list.env = opt->env;
diff --git a/revision.c b/revision.c
index 86bbcd10d2..47541407d2 100644
--- a/revision.c
+++ b/revision.c
@@ -2256,6 +2256,10 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--author-date-order")) {
 		revs->sort_order = REV_SORT_BY_AUTHOR_DATE;
 		revs->topo_order = 1;
+	} else if (!strcmp(arg, "--unsorted-input")) {
+		if (revs->no_walk)
+			die(_("--unsorted-input is incompatible with --no-walk"));
+		revs->unsorted_input = 1;
 	} else if (!strcmp(arg, "--early-output")) {
 		revs->early_output = 100;
 		revs->topo_order = 1;
@@ -2651,8 +2655,13 @@ static int handle_revision_pseudo_opt(const char *submodule,
 	} else if (!strcmp(arg, "--not")) {
 		*flags ^= UNINTERESTING | BOTTOM;
 	} else if (!strcmp(arg, "--no-walk")) {
+		if (!revs->no_walk && revs->unsorted_input)
+			die(_("--no-walk is incompatible with --unsorted-input"));
 		revs->no_walk = 1;
 	} else if (skip_prefix(arg, "--no-walk=", &optarg)) {
+		if (!revs->no_walk && revs->unsorted_input)
+			die(_("--no-walk is incompatible with --unsorted-input"));
+
 		/*
 		 * Detached form ("--no-walk X" as opposed to "--no-walk=X")
 		 * not allowed, since the argument is optional.
diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
index 12def7bcbf..ef849e5bc8 100755
--- a/t/t6000-rev-list-misc.sh
+++ b/t/t6000-rev-list-misc.sh
@@ -169,4 +169,35 @@ test_expect_success 'rev-list --count --objects' '
 	test_line_count = $count actual
 '
 
+test_expect_success 'rev-list --unsorted-input results in different sorting' '
+	git rev-list --unsorted-input HEAD HEAD~ >first &&
+	git rev-list --unsorted-input HEAD~ HEAD >second &&
+	! test_cmp first second &&
+	sort first >first.sorted &&
+	sort second >second.sorted &&
+	test_cmp first.sorted second.sorted
+'
+
+test_expect_success 'rev-list --unsorted-input incompatible with --no-walk' '
+	cat >expect <<-EOF &&
+		fatal: --no-walk is incompatible with --unsorted-input
+	EOF
+	test_must_fail git rev-list --unsorted-input --no-walk HEAD 2>error &&
+	test_cmp expect error &&
+	test_must_fail git rev-list --unsorted-input --no-walk=sorted HEAD 2>error &&
+	test_cmp expect error &&
+	test_must_fail git rev-list --unsorted-input --no-walk=unsorted HEAD 2>error &&
+	test_cmp expect error &&
+
+	cat >expect <<-EOF &&
+		fatal: --unsorted-input is incompatible with --no-walk
+	EOF
+	test_must_fail git rev-list --no-walk --unsorted-input HEAD 2>error &&
+	test_cmp expect error &&
+	test_must_fail git rev-list --no-walk=sorted --unsorted-input HEAD 2>error &&
+	test_cmp expect error &&
+	test_must_fail git rev-list --no-walk=unsorted --unsorted-input HEAD 2>error &&
+	test_cmp expect error
+'
+
 test_done
-- 
2.33.0