Welcome to mirror list, hosted at ThFree Co, Russian Federation.

git.kernel.org/pub/scm/git/git.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-04-25Merge branch 'en/header-split-cache-h'Junio C Hamano
Header clean-up. * en/header-split-cache-h: (24 commits) protocol.h: move definition of DEFAULT_GIT_PORT from cache.h mailmap, quote: move declarations of global vars to correct unit treewide: reduce includes of cache.h in other headers treewide: remove double forward declaration of read_in_full cache.h: remove unnecessary includes treewide: remove cache.h inclusion due to pager.h changes pager.h: move declarations for pager.c functions from cache.h treewide: remove cache.h inclusion due to editor.h changes editor: move editor-related functions and declarations into common file treewide: remove cache.h inclusion due to object.h changes object.h: move some inline functions and defines from cache.h treewide: remove cache.h inclusion due to object-file.h changes object-file.h: move declarations for object-file.c functions from cache.h treewide: remove cache.h inclusion due to git-zlib changes git-zlib: move declarations for git-zlib functions from cache.h treewide: remove cache.h inclusion due to object-name.h changes object-name.h: move declarations for object-name.c functions from cache.h treewide: remove unnecessary cache.h inclusion treewide: be explicit about dependence on mem-pool.h treewide: be explicit about dependence on oid-array.h ...
2023-04-11treewide: remove cache.h inclusion due to object-file.h changesElijah Newren
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11object-file.h: move declarations for object-file.c functions from cache.hElijah Newren
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11treewide: be explicit about dependence on oid-array.hElijah Newren
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-06Merge branch 'en/header-split-cleanup'Junio C Hamano
Split key function and data structure definitions out of cache.h to new header files and adjust the users. * en/header-split-cleanup: csum-file.h: remove unnecessary inclusion of cache.h write-or-die.h: move declarations for write-or-die.c functions from cache.h treewide: remove cache.h inclusion due to setup.h changes setup.h: move declarations for setup.c functions from cache.h treewide: remove cache.h inclusion due to environment.h changes environment.h: move declarations for environment.c functions from cache.h treewide: remove unnecessary includes of cache.h wrapper.h: move declarations for wrapper.c functions from cache.h path.h: move function declarations for path.c functions from cache.h cache.h: remove expand_user_path() abspath.h: move absolute path functions from cache.h environment: move comment_line_char from cache.h treewide: remove unnecessary cache.h inclusion from several sources treewide: remove unnecessary inclusion of gettext.h treewide: be explicit about dependence on gettext.h treewide: remove unnecessary cache.h inclusion from a few headers
2023-04-06Merge branch 'ab/remove-implicit-use-of-the-repository'Junio C Hamano
Code clean-up around the use of the_repository. * ab/remove-implicit-use-of-the-repository: libs: use "struct repository *" argument, not "the_repository" post-cocci: adjust comments for recent repo_* migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-06Merge branch 'ps/ahead-behind-truncation-fix'Junio C Hamano
Fix unnecessary truncation of generation numbers used in-core. * ps/ahead-behind-truncation-fix: commit-graph: fix truncated generation numbers
2023-04-06Merge branch 'ds/ahead-behind'Junio C Hamano
"git for-each-ref" learns '%(ahead-behind:<base>)' that computes the distances from a single reference point in the history with bunch of commits in bulk. * ds/ahead-behind: commit-reach: add tips_reachable_from_bases() for-each-ref: add ahead-behind format atom commit-reach: implement ahead_behind() logic commit-graph: introduce `ensure_generations_valid()` commit-graph: return generation from memory commit-graph: simplify compute_generation_numbers() commit-graph: refactor compute_topological_levels() for-each-ref: explicitly test no matches for-each-ref: add --stdin option
2023-04-04Merge branch 'ab/remove-implicit-use-of-the-repository' into ↵Junio C Hamano
en/header-split-cache-h * ab/remove-implicit-use-of-the-repository: libs: use "struct repository *" argument, not "the_repository" post-cocci: adjust comments for recent repo_* migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"
2023-03-28libs: use "struct repository *" argument, not "the_repository"Ævar Arnfjörð Bjarmason
As can easily be seen from grepping in our sources, we had these uses of "the_repository" in various library code in cases where the function in question was already getting a "struct repository *" argument. Let's use that argument instead. Out of these changes only the changes to "cache-tree.c", "commit-reach.c", "shallow.c" and "upload-pack.c" would have cleanly applied before the migration away from the "repo_*()" wrapper macros in the preceding commits. The rest aren't new, as we'd previously implicitly refer to "the_repository", but it's now more obvious that we were doing the wrong thing all along, and should have used the parameter instead. The change to change "get_index_format_default(the_repository)" in "read-cache.c" to use the "r" variable instead should arguably have been part of [1], or in the subsequent cleanup in [2]. Let's do it here, as can be seen from the initial code in [3] it's not important that we use "the_repository" there, but would prefer to always use the current repository. This change excludes the "the_repository" use in "upload-pack.c"'s upload_pack_advertise(), as the in-flight [4] makes that change. 1. ee1f0c242ef (read-cache: add index.skipHash config option, 2023-01-06) 2. 6269f8eaad0 (treewide: always have a valid "index_state.repo" member, 2023-01-17) 3. 7211b9e7534 (repo-settings: consolidate some config settings, 2019-08-13) 4. <Y/hbUsGPVNAxTdmS@coredump.intra.peff.net> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28cocci: apply the "packfile.h" part of "the_repository.pending"Ævar Arnfjörð Bjarmason
Apply the part of "the_repository.pending.cocci" pertaining to "packfile.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28cocci: apply the "commit.h" part of "the_repository.pending"Ævar Arnfjörð Bjarmason
Apply the part of "the_repository.pending.cocci" pertaining to "commit.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27commit-graph: fix truncated generation numbersPatrick Steinhardt
In 80c928d947 (commit-graph: simplify compute_generation_numbers(), 2023-03-20), the code to compute generation numbers was simplified to use the same infrastructure as is used to compute topological levels. This refactoring introduced a bug where the generation numbers are truncated when they exceed UINT32_MAX because we explicitly cast the computed generation number to `uint32_t`. This is not required though: both the computed value and the field of `struct commit_graph_data` are of the same type `timestamp_t` already, so casting to `uint32_t` will cause truncation. This cast can cause us to miscompute generation data overflows: 1. Given a commit with no parents and committer date `UINT32_MAX + 1`. 2. We compute its generation number as `UINT32_MAX + 1`, but truncate it to `1`. 3. We calculate the generation offset via `$generation - $date`, which is thus `1 - (UINT32_MAX + 1)`. The computation underflows and we thus end up with an offset that is bigger than the maximum allowed offset. As a result, we'd be writing generation data overflow information into the commit-graph that is bogus and ultimately not even required. Fix this bug by removing the needless cast. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21csum-file.h: remove unnecessary inclusion of cache.hElijah Newren
With the change in the last commit to move several functions to write-or-die.h, csum-file.h no longer needs to include cache.h. However, removing that include forces several other C files, which directly or indirectly dependend upon csum-file.h's inclusion of cache.h, to now be more explicit about their dependencies. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21wrapper.h: move declarations for wrapper.c functions from cache.hElijah Newren
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21treewide: be explicit about dependence on gettext.hElijah Newren
Dozens of files made use of gettext functions, without explicitly including gettext.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include gettext.h if they are using it. However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an include of gettext.h, it was left out to avoid conflicting with an in-flight topic. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20commit-graph: introduce `ensure_generations_valid()`Taylor Blau
Use the just-introduced compute_reachable_generation_numbers_1() to implement a function which dynamically computes topological levels (or corrected commit dates) for out-of-graph commits. This will be useful for the ahead-behind algorithm we are about to introduce, which needs accurate topological levels on _all_ commits reachable from the tips in order to avoid over-counting. Co-authored-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20commit-graph: return generation from memoryDerrick Stolee
The commit_graph_generation() method used to report a value of GENERATION_NUMBER_INFINITY if the commit_graph_data_slab had an instance for the given commit but the graph_pos indicated the commit was not in the commit-graph file. However, an upcoming change will introduce the ability to set generation values in-memory without writing the commit-graph file. Thus, we can no longer trust 'graph_pos' to indicate whether or not the generation member can be trusted. Instead, trust the 'generation' member if the commit has a value in the slab _and_ the 'generation' member is non-zero. Otherwise, treat it as GENERATION_NUMBER_INFINITY. This only makes a difference for a very old case for the commit-graph: the very first Git release to write commit-graph files wrote zeroes in the topological level positions. If we are parsing a commit-graph with all zeroes, those commits will now appear to have GENERATION_NUMBER_INFINITY (as if they were not parsed from the commit-graph). I attempted several variations to work around the need for providing an uninitialized 'generation' member, but this was the best one I found. It does require a change to a verification test in t5318 because it reports a different error than the one about non-zero generation numbers. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20commit-graph: simplify compute_generation_numbers()Derrick Stolee
The previous change introduced the generic algorithm compute_reachable_generation_numbers() and used it as the core functionality of compute_topological_levels(). Now, use it as the core functionality of compute_generation_numbers(). The main difference here is that we use generation version 2, which is used in to toggle the logic in compute_generation_from_max() for computing the corrected commit date based on the corrected commit dates of the parent commits (and the commit date of the current commit). It also uses different methods for (get|set)_generation in the vtable in order to store and access the value in the correct places. Co-authored-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20commit-graph: refactor compute_topological_levels()Derrick Stolee
This patch extracts the common code used to compute topological levels and corrected committer dates into a common routine, compute_reachable_generation_numbers(). For ease of reading, it only modifies compute_topological_levels() to use this new routine, leaving compute_generation_numbers() to be modified in the next change. This new routine dispatches to call the necessary functions to get and set the generation number for a given commit through a vtable (the compute_generation_info struct). Computing the generation number itself is done in compute_generation_from_max(), which dispatches its implementation based on the generation version requested, or issuing a BUG() for unrecognized generation versions. This does not use a vtable because the logic depends only on the generation number version, not where the data is being loaded from or being stored to. This is a subtle point that will make more sense in a future change that modifies the in-memory generation values instead of just preparing values for writing to a commit-graph file. This change looks like it adds a lot of new code. However, two upcoming changes will be quite small due to the work being done in this change. Co-authored-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24cache.h: remove dependence on hex.h; make other files include it explicitlyElijah Newren
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-09use DUP_ARRAYRené Scharfe
Add a semantic patch for replace ALLOC_ARRAY+COPY_ARRAY with DUP_ARRAY to reduce code duplication and apply its results. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-22Merge branch 'ml/commit-graph-expire-dir-leak-fix'Junio C Hamano
A result from opendir() was leaking in the commit-graph expiration codepath, which has been plugged. * ml/commit-graph-expire-dir-leak-fix: commit-graph: Fix missing closedir in expire_commit_graphs
2022-09-19commit-graph: Fix missing closedir in expire_commit_graphsMiaoqian Lin
The function calls opendir() but missing the corresponding closedir() before exit the function. Add missing closedir() to fix it. Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-14Merge branch 'ab/unused-annotation'Junio C Hamano
Undoes 'jk/unused-annotation' topic and redoes it to work around Coccinelle rules misfiring false positives in unrelated codepaths. * ab/unused-annotation: git-compat-util.h: use "deprecated" for UNUSED variables git-compat-util.h: use "UNUSED", not "UNUSED(var)"
2022-09-14Merge branch 'jk/unused-annotation'Junio C Hamano
Annotate function parameters that are not used (but cannot be removed for structural reasons), to prepare us to later compile with -Wunused warning turned on. * jk/unused-annotation: is_path_owned_by_current_uid(): mark "report" parameter as unused run-command: mark unused async callback parameters mark unused read_tree_recursive() callback parameters hashmap: mark unused callback parameters config: mark unused callback parameters streaming: mark unused virtual method parameters transport: mark bundle transport_options as unused refs: mark unused virtual method parameters refs: mark unused reflog callback parameters refs: mark unused each_ref_fn parameters git-compat-util: add UNUSED macro
2022-09-13Merge branch 'jk/rev-list-verify-objects-fix'Junio C Hamano
"git rev-list --verify-objects" ought to inspect the contents of objects and notice corrupted ones, but it didn't when the commit graph is in use, which has been corrected. * jk/rev-list-verify-objects-fix: rev-list: disable commit graph with --verify-objects lookup_commit_in_graph(): use prepare_commit_graph() to check for graph
2022-09-07lookup_commit_in_graph(): use prepare_commit_graph() to check for graphJeff King
We exit early from lookup_commit_in_graph() if the commit_graph pointer is NULL, under the assumption that we don't have a graph to look at. But the graph pointer is lazy-loaded; if no other code happens to have called prepare_commit_graph(), we'll incorrectly assume that one isn't available at all. This has a pretty small performance impact in practice, because the fallback will generally be to call parse_object() instead. That ends up in parse_commit_buffer(), which loads the graph data itself. So the first commit we see won't use the graph, but subsequent ones will. Since using the graph is just an optimization there's generally no user-visible difference, but if you instrument rev-list like so: diff --git a/revision.c b/revision.c index ee702e498a..63c488ffb6 100644 --- a/revision.c +++ b/revision.c @@ -381,6 +381,9 @@ static struct object *get_reference(struct rev_info *revs, const char *name, * parsing commit data from disk. */ commit = lookup_commit_in_graph(revs->repo, oid); + warning("%s %s in commit graph", + commit ? "found" : "did not find", + name); if (commit) object = &commit->object; else and run (in git.git): git commit-graph write --reachable git rev-list origin/master origin/next >/dev/null you'll see that we fail to find the first one: warning: did not find origin/master in commit graph warning: found origin/next in commit graph After this patch, you'll see that we find both: warning: found origin/master in commit graph warning: found origin/next in commit graph Even though the performance implication is small here, there are two important reasons to do this: - it's downright confusing if you are hunting a bug triggered by the use of the commit graph. It may or may not trigger depending on the number and ordering of tips you ask for. - prepare_commit_graph() has other policy logic, too. In particular, if we've loaded a commit graph and then disabled the graph via disable_commit_graph(), that should take precedence. I'm not sure if this can trigger bad behavior in practice. The only caller there is upload-pack's deepen_by_rev_list(), which should be avoiding the commit graph for its traversal tips, but probably wasn't before this patch. Whether you could come up with a case where that mattered is unclear. Still, this is obviously the right thing to be doing. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01git-compat-util.h: use "UNUSED", not "UNUSED(var)"Ævar Arnfjörð Bjarmason
As reported in [1] the "UNUSED(var)" macro introduced in 2174b8c75de (Merge branch 'jk/unused-annotation' into next, 2022-08-24) breaks coccinelle's parsing of our sources in files where it occurs. Let's instead partially go with the approach suggested in [2] of making this not take an argument. As noted in [1] "coccinelle" will ignore such tokens in argument lists that it doesn't know about, and it's less of a surprise to syntax highlighters. This undoes the "help us notice when a parameter marked as unused is actually use" part of 9b240347543 (git-compat-util: add UNUSED macro, 2022-08-19), a subsequent commit will further tweak the macro to implement a replacement for that functionality. 1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ 2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19refs: mark unused each_ref_fn parametersJeff King
Functions used with for_each_ref(), etc, need to conform to the each_ref_fn interface. But most of them don't need every parameter; let's annotate the unused ones to quiet -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-11Merge branch 'tb/commit-graph-genv2-upgrade-fix' into maintJunio C Hamano
There was a bug in the codepath to upgrade generation information in commit-graph from v1 to v2 format, which has been corrected. source: <cover.1657667404.git.me@ttaylorr.com> * tb/commit-graph-genv2-upgrade-fix: commit-graph: fix corrupt upgrade from generation v1 to v2 commit-graph: introduce `repo_find_commit_pos_in_graph()` t5318: demonstrate commit-graph generation v2 corruption
2022-08-11Merge branch 'hx/lookup-commit-in-graph-fix' into maintJunio C Hamano
A corner case bug where lazily fetching objects from a promisor remote resulted in infinite recursion has been corrected. source: <cover.1656593279.git.hanxin.hx@bytedance.com> * hx/lookup-commit-in-graph-fix: t5330: remove run_with_limited_processses() commit-graph.c: no lazy fetch in lookup_commit_in_graph()
2022-08-03Merge branch 'tb/commit-graph-genv2-upgrade-fix'Junio C Hamano
There was a bug in the codepath to upgrade generation information in commit-graph from v1 to v2 format, which has been corrected. * tb/commit-graph-genv2-upgrade-fix: commit-graph: fix corrupt upgrade from generation v1 to v2 commit-graph: introduce `repo_find_commit_pos_in_graph()` t5318: demonstrate commit-graph generation v2 corruption
2022-07-27Merge branch 'js/commit-graph-parsing-without-repo-settings'Junio C Hamano
API tweak to make it easier to run fuzz testing on commit-graph parser. * js/commit-graph-parsing-without-repo-settings: commit-graph: pass repo_settings instead of repository
2022-07-20Merge branch 'hx/lookup-commit-in-graph-fix'Junio C Hamano
A corner case bug where lazily fetching objects from a promisor remote resulted in infinite recursion has been corrected. * hx/lookup-commit-in-graph-fix: t5330: remove run_with_limited_processses() commit-graph.c: no lazy fetch in lookup_commit_in_graph()
2022-07-16commit-graph: introduce `repo_find_commit_pos_in_graph()`Taylor Blau
Low-level callers in systems that are adjacent to the commit-graph (like the changed-path Bloom filter code) could benefit from being able to call a function like `parse_commit_in_graph()` without modifying the corresponding commit slab data. This is useful in contexts where that slab data is being used to prepare for an upcoming commit-graph write, where Git must be careful to avoid clobbering any of that data during a read operation. Introduce a low-level variant of `parse_commit_in_graph()` which returns the graph position of a given commit only, without modifying any of the slab data. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-15commit-graph: pass repo_settings instead of repositoryTaylor Blau
The parse_commit_graph() function takes a 'struct repository *' pointer, but it only ever accesses config settings (either directly or through the .settings field of the repo struct). Move all relevant config settings into the repo_settings struct, and update parse_commit_graph() and its existing callers so that it takes 'struct repo_settings *' instead. Callers of parse_commit_graph() will now need to call prepare_repo_settings() themselves, or initialize a 'struct repo_settings' directly. Prior to ab14d0676c (commit-graph: pass a 'struct repository *' in more places, 2020-09-09), parsing a commit-graph was a pure function depending only on the contents of the commit-graph itself. Commit ab14d0676c introduced a dependency on a `struct repository` pointer, and later commits such as b66d84756f (commit-graph: respect 'commitGraph.readChangedPaths', 2020-09-09) added dependencies on config settings, which were accessed through the `settings` field of the repository pointer. This field was initialized via a call to `prepare_repo_settings()`. Additionally, this fixes an issue in fuzz-commit-graph: In 44c7e62 (2021-12-06, repo-settings:prepare_repo_settings only in git repos), prepare_repo_settings was changed to issue a BUG() if it is called by a process whose CWD is not a Git repository. The combination of commits mentioned above broke fuzz-commit-graph, which attempts to parse arbitrary fuzzing-engine-provided bytes as a commit graph file. Prior to this change, parse_commit_graph() called prepare_repo_settings(), but since we run the fuzz tests without a valid repository, we are hitting the BUG() from 44c7e62 for every test case. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01commit-graph.c: no lazy fetch in lookup_commit_in_graph()Han Xin
The commit-graph is used to opportunistically optimize accesses to certain pieces of information on commit objects, and lookup_commit_in_graph() tries to say "no" when the requested commit does not locally exist by returning NULL, in which case the caller can ask for (which may result in on-demand fetching from a promisor remote) and parse the commit object itself. However, it uses a wrong helper, repo_has_object_file(), to do so. This helper not only checks if an object is mmediately available in the local object store, but also tries to fetch from a promisor remote. But the fetch machinery calls lookup_commit_in_graph(), thus causing an infinite loop. We should make lookup_commit_in_graph() expect that a commit given to it can be legitimately missing from the local object store, by using the has_object_file() helper instead. Signed-off-by: Han Xin <hanxin.hx@bytedance.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-04Merge branch 'tb/cruft-packs'Junio C Hamano
A mechanism to pack unreachable objects into a "cruft pack", instead of ejecting them into loose form to be reclaimed later, has been introduced. * tb/cruft-packs: sha1-file.c: don't freshen cruft packs builtin/gc.c: conditionally avoid pruning objects via loose builtin/repack.c: add cruft packs to MIDX during geometric repack builtin/repack.c: use named flags for existing_packs builtin/repack.c: allow configuring cruft pack generation builtin/repack.c: support generating a cruft pack builtin/pack-objects.c: --cruft with expiration reachable: report precise timestamps from objects in cruft packs reachable: add options to add_unseen_recent_objects_to_traversal builtin/pack-objects.c: --cruft without expiration builtin/pack-objects.c: return from create_object_entry() t/helper: add 'pack-mtimes' test-tool pack-mtimes: support writing pack .mtimes files chunk-format.h: extract oid_version() pack-write: pass 'struct packing_data' to 'stage_tmp_packfiles' pack-mtimes: support reading .mtimes files Documentation/technical: add cruft-packs.txt
2022-05-27chunk-format.h: extract oid_version()Taylor Blau
There are three definitions of an identical function which converts `the_hash_algo` into either 1 (for SHA-1) or 2 (for SHA-256). There is a copy of this function for writing both the commit-graph and multi-pack-index file, and another inline definition used to write the .rev header. Consolidate these into a single definition in chunk-format.h. It's not clear that this is the best header to define this function in, but it should do for now. (Worth noting, the .rev caller expects a 4-byte unsigned, but the other two callers work with a single unsigned byte. The consolidated version uses the latter type, and lets the compiler widen it when required). Another caller will be added in a subsequent patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-24Merge branch 'ab/valgrind-fixes'Junio C Hamano
A bit of test framework fixes with a few fixes to issues found by valgrind. * ab/valgrind-fixes: commit-graph.c: don't assume that stat() succeeds object-file: fix a unpack_loose_header() regression in 3b6a8db3b03 log test: skip a failing mkstemp() test under valgrind tests: using custom GIT_EXEC_PATH breaks --valgrind tests
2022-05-21Merge branch 'ep/maint-equals-null-cocci'Junio C Hamano
Introduce and apply coccinelle rule to discourage an explicit comparison between a pointer and NULL, and applies the clean-up to the maintenance track. * ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci
2022-05-13commit-graph.c: don't assume that stat() succeedsÆvar Arnfjörð Bjarmason
Fix code added in 8d84097f965 (commit-graph: expire commit-graph files, 2019-06-18) to check the return value of the stat() system call. Not doing so caused us to use uninitialized memory in the "Bloom generation is limited by --max-new-filters" test in t4216-log-bloom.sh: + rm -f trace.event + pwd + GIT_TRACE2_EVENT=[...]/t/trash directory.t4216-log-bloom/limits/trace.event git commit-graph write --reachable --split=replace --changed-paths --max-new-filters=2 ==24835== Syscall param utimensat(times[0].tv_sec) points to uninitialised byte(s) ==24835== at 0x499E65A: __utimensat64_helper (utimensat.c:34) ==24835== by 0x4999142: utime (utime.c:36) ==24835== by 0x552BE0: mark_commit_graphs (commit-graph.c:2213) ==24835== by 0x550822: write_commit_graph (commit-graph.c:2424) ==24835== by 0x54E3A0: write_commit_graph_reachable (commit-graph.c:1681) ==24835== by 0x4374BB: graph_write (commit-graph.c:269) ==24835== by 0x436F7D: cmd_commit_graph (commit-graph.c:326) ==24835== by 0x407B9A: run_builtin (git.c:465) ==24835== by 0x406651: handle_builtin (git.c:719) ==24835== by 0x407575: run_argv (git.c:786) ==24835== by 0x406410: cmd_main (git.c:917) ==24835== by 0x511F09: main (common-main.c:56) ==24835== Address 0x1ffeffde70 is on thread 1's stack ==24835== in frame #1, created by utime (utime.c:25) ==24835== Uninitialised value was created by a stack allocation ==24835== at 0x552B50: mark_commit_graphs (commit-graph.c:2201) ==24835== [...] error: last command exited with $?=126 not ok 137 - Bloom generation is limited by --max-new-filters This would happen as we stat'd the non-existing ".git/objects/info/commit-graph" file. Let's fix mark_commit_graphs() to check the stat()'s return value, and while we're at it fix another case added in the same commit to do the same. The caller in expire_commit_graphs() would have been less likely to run into this, as it's operating on files it just got from readdir(), but it could still happen due to a race with e.g. a concurrent "rm -rf" of the commit-graph files. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-02Merge branch 'ep/maint-equals-null-cocci' for maint-2.35Junio C Hamano
* ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci
2022-05-02tree-wide: apply equals-null.cocciJunio C Hamano
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-20commit-graph: close file before returning NULLKleber Tarcísio
There are two reasons that we could return NULL early within load_commit_graph_chain(): 1. The file does not exist, so the file pointer is NULL. 2. The file exists, but is too small to contain a single hash. These were grouped together when the function was first written in 5c84b3396 (commit-graph: load commit-graph chains, 2019-06-18) in order to simplify how the 'chain_name' string is freed. However, the current code leaves a narrow window where the file pointer is not closed when the file exists, but is rejected for being too small. Split out these cases separately to ensure we close the file in this case. Signed-off-by: Kleber Tarcísio <klebertarcisio@yahoo.com.br> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-26Merge branch 'ns/core-fsyncmethod'Junio C Hamano
Replace core.fsyncObjectFiles with two new configuration variables, core.fsync and core.fsyncMethod. * ns/core-fsyncmethod: core.fsync: documentation and user-friendly aggregate options core.fsync: new option to harden the index core.fsync: add configuration parsing core.fsync: introduce granular fsync control infrastructure core.fsyncmethod: add writeout-only mode wrapper: make inclusion of Windows csprng header tightly scoped
2022-03-17Merge branch 'ab/string-list-count-in-size-t'Junio C Hamano
Count string_list items in size_t, not "unsigned int". * ab/string-list-count-in-size-t: string-list API: change "nr" and "alloc" to "size_t" gettext API users: don't explicitly cast ngettext()'s "n"
2022-03-17Merge branch 'ds/commit-graph-gen-v2-fixes'Junio C Hamano
Fixes to the way generation number v2 in the commit-graph files are (not) handled. * ds/commit-graph-gen-v2-fixes: commit-graph: declare bankruptcy on GDAT chunks commit-graph: fix generation number v2 overflow values commit-graph: start parsing generation v2 (again) commit-graph: fix ordering bug in generation numbers t5318: extract helpers to lib-commit-graph.sh test-read-graph: include extra post-parse info
2022-03-11core.fsync: introduce granular fsync control infrastructureNeeraj Singh
This commit introduces the infrastructure for the core.fsync configuration knob. The repository components we want to sync are identified by flags so that we can turn on or off syncing for specific components. If core.fsyncObjectFiles is set and the core.fsync configuration also includes FSYNC_COMPONENT_LOOSE_OBJECT, we will fsync any loose objects. This picks the strictest data integrity behavior if core.fsync and core.fsyncObjectFiles are set to conflicting values. This change introduces the currently unused fsync_component helper, which will be used by a later patch that adds fsyncing to the refs backend. Actual configuration and documentation of the fsync components list are in other patches in the series to separate review of the underlying mechanism from the policy of how it's configured. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>