author | Patrick Steinhardt <psteinhardt@gitlab.com> | 2022-02-23 15:05:30 +0300 |
---|---|---|
committer | Patrick Steinhardt <psteinhardt@gitlab.com> | 2022-02-28 11:05:24 +0300 |
commit | 769d999acbc6e3d2362ed68f04099108590cb800 (patch) | |
tree | c2e7542514fa57c0092bc8695f578222922b79b4 | |
parent | b6e1f3ce3799d61cb7cdbd67952d82c126f44c4f (diff) |
git: Skip checks whether a fetch is updating submodules
When fetching into a repository, Git will by default check whether the
fetch brings in any commits which update a submodule. If so, and if that
submodule is populated, then we'll recurse into that submodule and also
perform a fetch in there. This is useless though in our context because
we never populate submodules, so computing whether there are commits
which update any submodule is a complete waste of time.
The following mirror-fetch has been executed in www-gitlab-com:
Benchmark 1: git fetch --recurse-submodules=on-demand +refs/*:refs/*
Time (mean ± σ): 66.595 s ± 1.396 s [User: 63.019 s, System: 8.729 s]
Range (min … max): 65.377 s … 68.118 s 3 runs
Benchmark 2: git fetch --recurse-submodules=no +refs/*:refs/*
Time (mean ± σ): 62.789 s ± 1.202 s [User: 61.434 s, System: 7.774 s]
Range (min … max): 61.621 s … 64.022 s 3 runs
Summary
'git fetch --recurse-submodules=no +refs/*:refs/*' ran
1.06 ± 0.03 times faster than 'git fetch --recurse-submodules=on-demand +refs/*:refs/*'
This demonstrates that explicitly disabling the check gives a small
speedup of roughly 6% for this fetch.
-rw-r--r-- | internal/git/command_description.go | 9 |
1 files changed, 9 insertions, 0 deletions
diff --git a/internal/git/command_description.go b/internal/git/command_description.go
index 3efad9c5c..cb26d08d5 100644
--- a/internal/git/command_description.go
+++ b/internal/git/command_description.go
@@ -98,6 +98,15 @@ var commandDescriptions = map[string]commandDescription{
 			// us and unreachable from the outside, this is dangerous. We thus have to
 			// disable redirects in all cases.
 			ConfigPair{Key: "http.followRedirects", Value: "false"},
+
+			// By default, Git will try to recurse into submodules on demand: if a fetch
+			// retrieves a commit that updates a populated submodule, then it recurses
+			// into that submodule and also updates it. Computing this condition takes
+			// some resources though given that we need to check all fetched commits to
+			// find out if any submodule was in fact updated. This is a complete waste
+			// of time though because we never populate submodules at all. We thus
+			// disable recursion into submodules.
+			ConfigPair{Key: "fetch.recurseSubmodules", Value: "no"},
 		}, fsckConfiguration("fetch")...),
 	},
 	"for-each-ref": {