Welcome to mirror list, hosted at ThFree Co, Russian Federation.

git.kernel.org/pub/scm/git/git.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorPatrick Steinhardt <ps@pks.im>2023-10-31 10:16:13 +0300
committerJunio C Hamano <gitster@pobox.com>2023-11-01 06:04:06 +0300
commite04838ea828651cc122de505320e5ea85b43f1b1 (patch)
treecc821cf2ab68dd91e07c9ee4dc4d3238ba540f66 /commit-graph.c
parent43c8a30d150ecede9709c1f2527c8fba92c65f40 (diff)
commit-graph: introduce envvar to disable commit existence checks
Our `lookup_commit_in_graph()` helper tries to look up commits from the commit graph and, if it doesn't exist there, falls back to parsing it from the object database instead. This is intended to speed up the lookup of any such commit that exists in the database. There is an edge case though where the commit exists in the graph, but not in the object database. To avoid returning such stale commits the helper function thus double checks that any such commit parsed from the graph also exists in the object database. This makes the function safe to use even when commit graphs aren't updated regularly. We're about to introduce the same pattern into other parts of our code base though, namely `repo_parse_commit_internal()`. Here the extra sanity check is a bit of a tougher sell: `lookup_commit_in_graph()` was a newly introduced helper, and as such there was no performance hit by adding this sanity check. If we added `repo_parse_commit_internal()` with that sanity check right from the beginning as well, this would probably never have been an issue to begin with. But by retrofitting it with this sanity check now we do add a performance regression to preexisting code, and thus there is a desire to avoid this or at least give an escape hatch. In practice, there is no inherent reason why either of those functions should have the sanity check whereas the other one does not: either both of them are able to detect this issue or none of them should be. This also means that the default of whether we do the check should likely be the same for both. To err on the side of caution, we thus rather want to make `repo_parse_commit_internal()` stricter than to loosen the checks that we already have in `lookup_commit_in_graph()`. The escape hatch is added in the form of a new GIT_COMMIT_GRAPH_PARANOIA environment variable that mirrors GIT_REF_PARANOIA. If enabled, which is the default, we will double check that commits looked up in the commit graph via `lookup_commit_in_graph()` also exist in the object database. This same check will also be added in `repo_parse_commit_internal()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'commit-graph.c')
-rw-r--r--commit-graph.c6
1 files changed, 5 insertions, 1 deletions
diff --git a/commit-graph.c b/commit-graph.c
index 0aa1640d15..b37fdcb214 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -907,14 +907,18 @@ int repo_find_commit_pos_in_graph(struct repository *r, struct commit *c,
struct commit *lookup_commit_in_graph(struct repository *repo, const struct object_id *id)
{
+ static int commit_graph_paranoia = -1;
struct commit *commit;
uint32_t pos;
+ if (commit_graph_paranoia == -1)
+ commit_graph_paranoia = git_env_bool(GIT_COMMIT_GRAPH_PARANOIA, 1);
+
if (!prepare_commit_graph(repo))
return NULL;
if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
return NULL;
- if (!has_object(repo, id, 0))
+ if (commit_graph_paranoia && !has_object(repo, id, 0))
return NULL;
commit = lookup_commit(repo, id);