git.kernel.org/pub/scm/git/git.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2023-05-10	Merge branch 'en/header-split-cache-h-part-2'	Junio C Hamano
	More header clean-up. * en/header-split-cache-h-part-2: (22 commits) reftable: ensure git-compat-util.h is the first (indirect) include diff.h: reduce unnecessary includes object-store.h: reduce unnecessary includes commit.h: reduce unnecessary includes fsmonitor: reduce includes of cache.h cache.h: remove unnecessary headers treewide: remove cache.h inclusion due to previous changes cache,tree: move basic name compare functions from read-cache to tree cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c hash-ll.h: split out of hash.h to remove dependency on repository.h tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h dir.h: move DTYPE defines from cache.h versioncmp.h: move declarations for versioncmp.c functions from cache.h ws.h: move declarations for ws.c functions from cache.h match-trees.h: move declarations for match-trees.c functions from cache.h pkt-line.h: move declarations for pkt-line.c functions from cache.h base85.h: move declarations for base85.c functions from cache.h copy.h: move declarations for copy.c functions from cache.h server-info.h: move declarations for server-info.c functions from cache.h packfile.h: move pack_window and pack_entry from cache.h ...
2023-05-10	Merge branch 'jk/parse-commit-with-malformed-ident'	Junio C Hamano
	The commit object parser has been taught to be a bit more lenient to parse timestamps on the author/committer line with a malformed author/committer ident. * jk/parse-commit-with-malformed-ident: parse_commit(): describe more date-parsing failure modes parse_commit(): handle broken whitespace-only timestamp parse_commit(): parse timestamp from end of line t4212: avoid putting git on left-hand side of pipe
2023-05-02	Merge branch 'tb/ban-strtok'	Junio C Hamano
	Mark strtok() and strtok_r() to be banned. * tb/ban-strtok: banned.h: mark `strtok()` and `strtok_r()` as banned t/helper/test-json-writer.c: avoid using `strtok()` t/helper/test-oidmap.c: avoid using `strtok()` t/helper/test-hashmap.c: avoid using `strtok()` string-list: introduce `string_list_setlen()` string-list: multi-delimiter `string_list_split_in_place()`
2023-05-02	Merge branch 'jk/blame-fake-commit-label'	Junio C Hamano
	The output given by "git blame" that attributes a line to contents taken from the file specified by the "--contents" option shows it differently from a line attributed to the working tree file. * jk/blame-fake-commit-label: blame: use different author name for fake commit generated by --contents
2023-04-29	Merge branch 'jk/gpg-trust-level-fix'	Junio C Hamano
	The "%GT" placeholder for the "--format" option of "git log" and friends caused BUG() to trigger on a commit signed with an unknown key, which has been corrected. * jk/gpg-trust-level-fix: gpg-interface: set trust level of missing key to "undefined"
2023-04-29	Merge branch 'tb/enable-cruft-packs-by-default'	Junio C Hamano
	When "gc" needs to retain unreachable objects, packing them into cruft packs (instead of exploding them into loose object files) has been offered as a more efficient option for some time. Now the use of cruft packs has been made the default and no longer considered an experimental feature. * tb/enable-cruft-packs-by-default: repository.h: drop unused `gc_cruft_packs` builtin/gc.c: make `gc.cruftPacks` enabled by default t/t9300-fast-import.sh: prepare for `gc --cruft` by default t/t6500-gc.sh: add additional test cases t/t6500-gc.sh: refactor cruft pack tests t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default t/t5304-prune.sh: prepare for `gc --cruft` by default builtin/gc.c: ignore cruft packs with `--keep-largest-pack` builtin/repack.c: fix incorrect reference to '-C' pack-write.c: plug a leak in stage_tmp_packfiles()
2023-04-28	Merge branch 'ds/fsck-pack-revindex'	Junio C Hamano
	"git fsck" learned to validate the on-disk pack reverse index files. * ds/fsck-pack-revindex: fsck: validate .rev file header fsck: check rev-index position values fsck: check rev-index checksums fsck: create scaffolding for rev-index checks
2023-04-28	Merge branch 'tb/pack-revindex-on-disk'	Junio C Hamano
	The on-disk reverse index that allows mapping from the pack offset to the object name for the object stored at the offset has been enabled by default. * tb/pack-revindex-on-disk: t: invert `GIT_TEST_WRITE_REV_INDEX` config: enable `pack.writeReverseIndex` by default pack-revindex: introduce `pack.readReverseIndex` pack-revindex: introduce GIT_TEST_REV_INDEX_DIE_ON_DISK pack-revindex: make `load_pack_revindex` take a repository t5325: mark as leak-free pack-write.c: plug a leak in stage_tmp_packfiles()
2023-04-27	parse_commit(): handle broken whitespace-only timestamp	Jeff King
	The comment in parse_commit_date() claims that parse_timestamp() will not walk past the end of the buffer we've been given, since it will hit the newline at "eol" and stop. This is usually true, when dateptr contains actual numbers to parse. But with a line like: committer name <email> \n with just whitespace, and no numbers, parse_timestamp() will consume that newline as part of the leading whitespace, and we may walk past our "tail" pointer (which itself is set from the "size" parameter passed in to parse_commit_buffer()). In practice this can't cause us to walk off the end of an array, because we always add an extra NUL byte to the end of objects we load from disk (as a defense against exactly this kind of bug). However, you can see the behavior in action when "committer" is the final header (which it usually is, unless there's an encoding) and the subject line can be parsed as an integer. We walk right past the newline on the committer line, as well as the "\n\n" separator, and mistake the subject for the timestamp. We can solve this by trimming the whitespace ourselves, making sure that it has some non-whitespace to parse. Note that we need to be a bit careful about the definition of "whitespace" here, as our isspace() doesn't match exotic characters like vertical tab or formfeed. We can work around that by checking for an actual number (see the in-code comment). This is slightly more restrictive than the current code, but in practice the results are either the same (we reject "foo" as "0", but so would parse_timestamp()) or extremely unlikely even for broken commits (parse_timestamp() would allow "\v123" as "123", but we'll now make it "0"). I did also allow "-" here, which may be controversial, as we don't currently support negative timestamps. My reasoning was two-fold. One, the design of parse_timestamp() is such that we should be able to easily switch it to handling signed values, and this otherwise creates a hard-to-find gotcha that anybody doing that work would get tripped up on. And two, the status quo is that we currently parse them, though the result of course ends up as a very large unsigned value (which is likely to just get clamped to "0" for display anyway, since our date routines can't handle it). The new test checks the commit parser (via "--until") for both vanilla spaces and the vertical-tab case. I also added a test to check these against the pretty-print formatter, which uses split_ident_line(). It's not subject to the same bug, because it already insists that there be one or more digits in the timestamp. Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-27	parse_commit(): parse timestamp from end of line	Jeff King
	To find the committer timestamp, we parse left-to-right looking for the closing ">" of the email, and then expect the timestamp right after that. But we've seen some broken cases in the wild where this fails, but we _could_ find the timestamp with a little extra work. E.g.: Name <Name<email>> 123456789 -0500 This means that features that rely on the committer timestamp, like --since or --until, will treat the commit as happening at time 0 (i.e., 1970). This is doubly confusing because the pretty-print parser learned to handle these in 03818a4a94 (split_ident: parse timestamp from end of line, 2013-10-14). So printing them via "git show", etc, makes everything look normal, but --until, etc are still broken (despite the fact that that commit explicitly mentioned --until!). So let's use the same trick as 03818a4a94: find the end of the line, and parse back to the final ">". In theory we could use split_ident_line() here, but it's actually a bit more strict. In particular, it requires a valid time-zone token, too. That should be present, of course, but we wouldn't want to break --until for cases that are working currently. We might want to teach split_ident_line() to become more lenient there, but it would require checking its many callers (since right now they can assume that if date_start is non-NULL, so is tz_start). So for now we'll just reimplement the same trick in the commit parser. The test is in t4212, which already covers similar cases, courtesy of 03818a4a94. We'll just adjust the broken commit to munge both the author and committer timestamps. Note that we could match (author\|committer) here, but alternation can't be used portably in sed. Since we wouldn't expect to see ">" except as part of an ident line, we can just match that character on any line. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-27	t4212: avoid putting git on left-hand side of pipe	Jeff King
	We wouldn't expect cat-file to fail here, but it's good practice to avoid putting git on the upstream of a pipe, as we otherwise ignore its exit code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-25	Merge branch 'ps/fix-geom-repack-with-alternates'	Junio C Hamano
	Geometric repacking ("git repack --geometric=<n>") in a repository that borrows from an alternate object database had various corner case bugs, which have been corrected. * ps/fix-geom-repack-with-alternates: repack: disable writing bitmaps when doing a local repack repack: honor `-l` when calculating pack geometry t/helper: allow chmtime to print verbosely without modifying mtime pack-objects: extend test coverage of `--stdin-packs` with alternates pack-objects: fix error when same packfile is included and excluded pack-objects: fix error when packing same pack twice pack-objects: split out `--stdin-packs` tests into separate file repack: fix generating multi-pack-index with only non-local packs repack: fix trying to use preferred pack in alternates midx: fix segfault with no packs and invalid preferred pack
2023-04-25	Merge branch 'rj/send-email-validate-hook-count-messages'	Junio C Hamano
	The sendemail-validate validate hook learned to pass the total number of input files and where in the sequence each invocation is via environment variables. * rj/send-email-validate-hook-count-messages: send-email: export patch counters in validate environment
2023-04-25	Merge branch 'jk/protocol-cap-parse-fix'	Junio C Hamano
	The code to parse capability list for v0 on-wire protocol fell into an infinite loop when a capability appears multiple times, which has been corrected. * jk/protocol-cap-parse-fix: v0 protocol: use size_t for capability length/offset t5512: test "ls-remote --heads --symref" filtering with v0 and v2 t5512: allow any protocol version for filtered symref test t5512: add v2 support for "ls-remote --symref" test v0 protocol: fix sha1/sha256 confusion for capabilities^{} t5512: stop referring to "v1" protocol v0 protocol: fix infinite loop when parsing multi-valued capabilities
2023-04-25	Merge branch 'en/header-split-cache-h'	Junio C Hamano
	Header clean-up. * en/header-split-cache-h: (24 commits) protocol.h: move definition of DEFAULT_GIT_PORT from cache.h mailmap, quote: move declarations of global vars to correct unit treewide: reduce includes of cache.h in other headers treewide: remove double forward declaration of read_in_full cache.h: remove unnecessary includes treewide: remove cache.h inclusion due to pager.h changes pager.h: move declarations for pager.c functions from cache.h treewide: remove cache.h inclusion due to editor.h changes editor: move editor-related functions and declarations into common file treewide: remove cache.h inclusion due to object.h changes object.h: move some inline functions and defines from cache.h treewide: remove cache.h inclusion due to object-file.h changes object-file.h: move declarations for object-file.c functions from cache.h treewide: remove cache.h inclusion due to git-zlib changes git-zlib: move declarations for git-zlib functions from cache.h treewide: remove cache.h inclusion due to object-name.h changes object-name.h: move declarations for object-name.c functions from cache.h treewide: remove unnecessary cache.h inclusion treewide: be explicit about dependence on mem-pool.h treewide: be explicit about dependence on oid-array.h ...
2023-04-25	Sync with Git 2.40.1	Junio C Hamano

2023-04-25	t/helper/test-json-writer.c: avoid using `strtok()`	Taylor Blau
	Apply similar treatment as in the previous commit to remove usage of `strtok()` from the "oidmap" test helper. Each of the different commands that the "json-writer" helper accepts pops the next space-delimited token from the current line and interprets it as a string, integer, or double (with the exception of the very first token, which is the command itself). To accommodate this, split the line in place by the space character, and pass the corresponding string_list to each of the specialized `get_s()`, `get_i()`, and `get_d()` functions. `get_i()` and `get_d()` are thin wrappers around `get_s()` that convert their result into the appropriate type by either calling `strtol()` or `strtod()`, respectively. In `get_s()`, we mark the token as "consumed" by incrementing the `consumed_nr` counter, indicating how many tokens we have read up to that point. Because each of these functions needs the string-list parts, the number of tokens consumed, and the line number, these three are wrapped up in to a struct representing the line state. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-25	t/helper/test-oidmap.c: avoid using `strtok()`	Taylor Blau
	Apply similar treatment as in the previous commit to remove usage of `strtok()` from the "oidmap" test helper. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-25	t/helper/test-hashmap.c: avoid using `strtok()`	Taylor Blau
	Avoid using the non-reentrant `strtok()` to separate the parts of each incoming command. Instead of replacing it with `strtok_r()`, let's instead use the more friendly pair of `string_list_split_in_place()` and `string_list_remove_empty_items()`. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-25	string-list: multi-delimiter `string_list_split_in_place()`	Taylor Blau
	Enhance `string_list_split_in_place()` to accept multiple characters as delimiters instead of a single character. Instead of using `strchr(2)` to locate the first occurrence of the given delimiter character, `string_list_split_in_place_multi()` uses `strcspn(2)` to move past the initial segment of characters comprised of any characters in the delimiting set. When only a single delimiting character is provided, `strpbrk(2)` (which is implemented with `strcspn(2)`) has equivalent performance to `strchr(2)`. Modern `strcspn(2)` implementations treat an empty delimiter or the singleton delimiter as a special case and fall back to calling strchrnul(). Both glibc[1] and musl[2] implement `strcspn(2)` this way. This change is one step to removing `strtok(2)` from the tree. Note that `string_list_split_in_place()` is not a strict replacement for `strtok()`, since it will happily turn sequential delimiter characters into empty entries in the resulting string_list. For example: string_list_split_in_place(&xs, "foo:;:bar:;:baz", ":;", -1) would yield a string list of: ["foo", "", "", "bar", "", "", "baz"] Callers that wish to emulate the behavior of strtok(2) more directly should call `string_list_remove_empty_items()` after splitting. To avoid regressions for the new multi-character delimter cases, update t0063 in this patch as well. [1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strcspn.c;hb=glibc-2.37#l35 [2]: https://git.musl-libc.org/cgit/musl/tree/src/string/strcspn.c?h=v1.2.3#n11 Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-25	blame: use different author name for fake commit generated by --contents	Jacob Keller
	When the --contents option is used with git blame, and the contents of the file have lines which can't be annotated by the history being blamed, the user will see an author of "Not Committed Yet". This is similar to the way blame handles working tree contents when blaming without a revision. This is slightly confusing since this data isn't the working copy and while it is technically "not committed yet", its also coming from an external file. Replace this author name with "External file (--contents)" to better differentiate such lines from actual working copy lines. Suggested-by: Junio C Hamano <gitster@pobox.com> Suggested-by: Glen Choo <chooglen@google.com> Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24	reftable: ensure git-compat-util.h is the first (indirect) include	Elijah Newren
	Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24	treewide: remove cache.h inclusion due to previous changes	Elijah Newren
	Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24	hash-ll.h: split out of hash.h to remove dependency on repository.h	Elijah Newren
	hash.h depends upon and includes repository.h, due to the definition and use of the_hash_algo (defined as the_repository->hash_algo). However, most headers trying to include hash.h are only interested in the layout of the structs like object_id. Move the parts of hash.h that do not depend upon repository.h into a new file hash-ll.h (the "low level" parts of hash.h), and adjust other files to use this new header where the convenience inline functions aren't needed. This allows hash.h and object.h to be fairly small, minimal headers. It also exposes a lot of hidden dependencies on both path.h (which was brought in by repository.h) and repository.h (which was previously implicitly brought in by object.h), so also adjust other files to be more explicit about what they depend upon. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24	match-trees.h: move declarations for match-trees.c functions from cache.h	Elijah Newren
	Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24	packfile.h: move pack_window and pack_entry from cache.h	Elijah Newren
	Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24	treewide: be explicit about dependence on strbuf.h	Elijah Newren
	Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-22	Merge branch 'ow/ref-filter-omit-empty'	Junio C Hamano
	"git branch --format=..." and "git format-patch --format=..." learns "--omit-empty" to hide refs that whose formatting result becomes an empty string from the output. * ow/ref-filter-omit-empty: branch, for-each-ref, tag: add option to omit empty lines
2023-04-22	Merge branch 'ah/format-patch-thread-doc'	Junio C Hamano
	Doc update. * ah/format-patch-thread-doc: format-patch: correct documentation of --thread without an argument
2023-04-22	Merge branch 'rn/sparse-describe'	Junio C Hamano
	"git describe --dirty" learns to work better with sparse-index. * rn/sparse-describe: describe: enable sparse index for describe
2023-04-22	Merge branch 'rs/archive-from-subdirectory-fixes'	Junio C Hamano
	"git archive" run from a subdirectory mishandled attributes and paths outside the current directory. * rs/archive-from-subdirectory-fixes: archive: improve support for running in subdirectory
2023-04-21	Merge branch 'gc/better-error-when-local-clone-fails-with-symlink'	Junio C Hamano
	"git clone --local" stops copying from an original repository that has symbolic links inside its $GIT_DIR; an error message when that happens has been updated. * gc/better-error-when-local-clone-fails-with-symlink: clone: error specifically with --local and symlinked objects
2023-04-21	Merge branch 'ar/t2024-checkout-output-fix'	Junio C Hamano
	Test fix. * ar/t2024-checkout-output-fix: t2024: fix loose/strict local base branch DWIM test
2023-04-21	Merge branch 'rs/remove-approxidate-relative'	Junio C Hamano
	The approxidate() API has been simplified by losing an extra function that did the same thing as another one. * rs/remove-approxidate-relative: date: remove approxidate_relative()
2023-04-21	Merge branch 'rs/userdiff-multibyte-regex'	Junio C Hamano
	The userdiff regexp patterns for various filetypes that are built into the system have been updated to avoid triggering regexp errors from UTF-8 aware regex engines. * rs/userdiff-multibyte-regex: userdiff: support regexec(3) with multi-byte support
2023-04-19	gpg-interface: set trust level of missing key to "undefined"	Jeff King
	In check_signature(), we initialize the trust_level field to "-1", with the idea that if gpg does not return a trust level at all (if there is no signature, or if the signature is made by an unknown key), we'll use that value. But this has two problems: 1. Since the field is an enum, it's up to the compiler to decide what underlying storage to use, and it only has to fit the values we've declared. So we may not be able to store "-1" at all. And indeed, on my system (linux with gcc), the resulting enum is an unsigned 32-bit value, and -1 becomes 4294967295. The difference may seem academic (and you even get "-1" if you pass it to printf("%d")), but it means that code like this: status \|= sigc->trust_level < configured_min_trust_level; does not necessarily behave as expected. This turns out not to be a bug in practice, though, because we keep the "-1" only when gpg did not report a signature from a known key, in which case the line above: status \|= sigc->result != 'G'; would always set status to non-zero anyway. So only a 'G' signature with no parsed trust level would cause a problem, which doesn't seem likely to trigger (outside of unexpected gpg behavior). 2. When using the "%GT" format placeholder, we pass the value to gpg_trust_level_to_str(), which complains that the value is out of range with a BUG(). This behavior was introduced by 803978da49 (gpg-interface: add function for converting trust level to string, 2022-07-11). Before that, we just did a switch() on the enum, and anything that wasn't matched would end up as the empty string. Curiously, solving this by naively doing: if (level < 0) return ""; in that function isn't sufficient. Because of (1) above, the compiler can (and does in my case) actually remove that conditional as dead code! We can solve both by representing this state as an enum value. We could do this by adding a new "unknown" value. But this really seems to match the existing "undefined" level well. GPG describes this as "Not enough information for calculation". We have tests in t7510 that trigger this case (verifying a signature from a key that we don't have, and then checking various %G placeholders), but they didn't notice the BUG() because we didn't look at %GT for that case! Let's make sure we check all %G placeholders for each case in the formatting tests. The interesting ones here are "show unknown signature with custom format" and "show lack of signature with custom format", both of which would BUG() before, and now turn %GT into "undefined". Prior to 803978da49 they would have turned it into the empty string, but I think saying "undefined" consistently is a reasonable outcome, and probably makes life easier for anyone parsing the output (and any such parser had to be ready to see "undefined" already). The other modified tests produce the same output before and after this patch, but now we're consistently checking both %G? and %GT in all of them. Signed-off-by: Jeff King <peff@peff.net> Reported-by: Rolf Eike Beer <eb@emlix.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-19	builtin/gc.c: make `gc.cruftPacks` enabled by default	Taylor Blau
	Back in 5b92477f89 (builtin/gc.c: conditionally avoid pruning objects via loose, 2022-05-20), `git gc` learned the `--cruft` option and `gc.cruftPacks` configuration to opt-in to writing cruft packs when collecting or pruning unreachable objects. Cruft packs were introduced with the merge in a50036da1a (Merge branch 'tb/cruft-packs', 2022-06-03). They address the problem of "loose object explosions", where Git will write out many individual loose objects when there is a large number of unreachable objects that have not yet aged past `--prune=<date>`. Instead of keeping track of those unreachable yet recent objects via their loose object file's mtime, cruft packs collect all unreachable objects into a single pack with a corresponding `*.mtimes` file that acts as a table to store the mtimes of all unreachable objects. This prevents the need to store unreachable objects as loose as they age out of the repository, and avoids the problem of loose object explosions. Beyond avoiding loose object explosions, cruft packs also act as a more efficient mechanism to store unreachable objects as they age out of a repository. This is because pairs of similar unreachable objects serve as delta bases for one another. In 5b92477f89, the feature was introduced as experimental. Since then, GitHub has been running these patches in every repository generating hundreds of millions of cruft packs along the way. The feature is battle-tested, and avoids many pathological cases such as above. Users who either run `git gc` manually, or via `git maintenance` can benefit from having cruft packs. As such, enable cruft pack generation to take place by default (by making `gc.cruftPacks` have the default of "true" rather than "false). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-19	t/t9300-fast-import.sh: prepare for `gc --cruft` by default	Taylor Blau
	In a similar fashion as previous commits, adjust the fast-import tests to prepare for "git gc" generating a cruft pack by default. This adjustment is slightly different, however. Instead of relying on us writing out the objects loose, and then calling `git prune` to remove them, t9300 needs to be prepared to drop objects that would be moved into cruft packs. To do this, we can combine the `git gc` invocation with `git prune` into one `git gc --prune`, which handles pruning both loose objects, and objects that would otherwise be written to a cruft pack. Likely this pattern of "git gc && git prune" started all the way back in 03db4525d3 (Support gitlinks in fast-import., 2008-07-19), which happened after deprecating `git gc --prune` in 9e7d501990 (builtin-gc.c: deprecate --prune, it now really has no effect, 2008-05-09). After `--prune` was un-deprecated in 58e9d9d472 (gc: make --prune useful again by accepting an optional parameter, 2009-02-14), this script got a handful of new "git gc && git prune" instances via via 4cedb78cb5 (fast-import: add input format tests, 2011-08-11). These could have been `git gc --prune`, but weren't (likely taking after 03db4525d3). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-19	t/t6500-gc.sh: add additional test cases	Taylor Blau
	In the last commit, we refactored some of the tests in t6500 to make clearer when cruft packs will and won't be generated by `git gc`. Add the remaining cases not covered by the previous patch into this one, which enumerates all possible combinations of arguments that will produce (or not produce) a cruft pack. This prepares us for a future commit which will change the default value of `gc.cruftPacks` by ensuring that we understand which invocations do and do not change as a result. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-19	t/t6500-gc.sh: refactor cruft pack tests	Taylor Blau
	In 12253ab6d0 (gc: add tests for --cruft and friends, 2022-10-26), we added a handful of tests to t6500 to ensure that `git gc` respected the value of `--cruft` and `gc.cruftPacks`. Then, in c695592850 (config: let feature.experimental imply gc.cruftPacks=true, 2022-10-26), another set of similar tests was added to ensure that `feature.experimental` correctly implied enabling cruft pack generation (or not). These tests are similar and could be consolidated. Do so in this patch to prepare for expanding the set of command-line invocations that enable or disable writing cruft packs. This makes it possible to easily test more combinations of arguments without being overly repetitive. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-19	t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default	Taylor Blau
	In a similar spirit as previous commits, prepare for `gc --cruft` becoming the default by ensuring that the tests in t6501 explicitly cover the case of freshening loose objects not using cruft packs. We could run this test twice, once with `--cruft` and once with `--no-cruft`, but doing so is unnecessary, since we already test object rescuing, freshening, and dealing with corrupt parts of the unreachable object graph extensively via t5329. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-19	t/t5304-prune.sh: prepare for `gc --cruft` by default	Taylor Blau
	Many of the tests in t5304 run `git gc`, and rely on its behavior that unreachable-but-recent objects are written out loose. This is sensible, since t5304 deals specifically with this kind of pruning. If left unattended, however, this test would break when the default behavior of a bare "git gc" is adjusted to generate a cruft pack by default. Ensure that these tests continue to work as-is (and continue to provide coverage of loose object pruning) by passing `--no-cruft` explicitly. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-19	builtin/gc.c: ignore cruft packs with `--keep-largest-pack`	Taylor Blau
	When cruft packs were implemented, we never adjusted the code for `git gc`'s `--keep-largest-pack` and `gc.bigPackThreshold` to ignore cruft packs. This option and configuration option share a common implementation, but including cruft packs is wrong in both cases: - Running `git gc --keep-largest-pack` in a repository where the largest pack is the cruft pack itself will make it impossible for `git gc` to prune objects, since the cruft pack itself is kept. - The same is true for `gc.bigPackThreshold`, if the size of the cruft pack exceeds the limit set by the caller. In the future, it is possible that `gc.bigPackThreshold` could be used to write a separate cruft pack containing any new unreachable objects that entered the repository since the last time a cruft pack was written. There are some complexities to doing so, mainly around handling pruning objects that are in an existing cruft pack that is above the threshold (which would either need to be rewritten, or else delay pruning). Rewriting a substantially similar cruft pack isn't ideal, but it is significantly better than the status-quo. If users have large cruft packs that they don't want to rewrite, they can mark them as `*.keep` packs. But in general, if a repository has a cruft pack that is so large it is slowing down GC's, it should probably be pruned anyway. In the meantime, ignore cruft packs in the common implementation for both of these options, and add a pair of tests to prevent any future regressions here. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18	Merge branch 'pw/rebase-cleanup-merge-strategy-option-handling'	Junio C Hamano
	Clean-up of the code path that deals with merge strategy option handling in "git rebase". * pw/rebase-cleanup-merge-strategy-option-handling: rebase: remove a couple of redundant strategy tests rebase -m: fix serialization of strategy options rebase -m: cleanup --strategy-option handling sequencer: use struct strvec to store merge strategy options rebase: stop reading and writing unnecessary strategy state
2023-04-18	Merge branch 'tk/mergetool-gui-default-config'	Junio C Hamano
	"git mergetool" and "git difftool" learns a new configuration guiDefault to optionally favor configured guitool over non-gui-tool automatically when $DISPLAY is set. * tk/mergetool-gui-default-config: mergetool: new config guiDefault supports auto-toggling gui by DISPLAY
2023-04-18	Merge branch 'sl/sparse-write-tree'	Junio C Hamano
	"git write-tree" learns to work better with sparse-index. * sl/sparse-write-tree: write-tree: integrate with sparse index
2023-04-18	fsck: validate .rev file header	Derrick Stolee
	While parsing a .rev file, we check the header information to be sure it makes sense. This happens before doing any additional validation such as a checksum or value check. In order to differentiate between a bad header and a non-existent file, we need to update the API for loading a reverse index. Make load_pack_revindex_from_disk() non-static and specify that a positive value means "the file does not exist" while other errors during parsing are negative values. Since an invalid header prevents setting up the structures we would use for further validations, we can stop at that point. The place where we can distinguish between a missing file and a corrupt file is inside load_revindex_from_disk(), which is used both by pack rev-indexes and multi-pack-index rev-indexes. Some tests in t5326 demonstrate that it is critical to take some conditions to allow positive error signals. Add tests that check the three header values. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18	fsck: check rev-index position values	Derrick Stolee
	When checking a rev-index file, it may be helpful to identify exactly which positions are incorrect. Compare the rev-index to a freshly-computed in-memory rev-index and report the comparison failures. This additional check (on top of the checksum validation) can help find files that were corrupt by a single bit flip on-disk or perhaps were written incorrectly due to a bug in Git. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18	fsck: check rev-index checksums	Derrick Stolee
	The previous change added calls to verify_pack_revindex() in builtin/fsck.c, but the implementation of the method was left empty. Add the first and most-obvious check to this method: checksum verification. While here, create a helper method in the test script that makes it easy to adjust the .rev file and check that 'git fsck' reports the correct error message. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18	fsck: create scaffolding for rev-index checks	Derrick Stolee
	The 'fsck' builtin checks many of Git's on-disk data structures, but does not currently validate the pack rev-index files (a .rev file to pair with a .pack and .idx file). Before doing a more-involved check process, create the scaffolding within builtin/fsck.c to have a new error type and add that error type when the API method verify_pack_revindex() returns an error. That method does nothing currently, but we will add checks to it in later changes. For now, check that 'git fsck' succeeds without any errors in the normal case. Future checks will be paired with tests that corrupt the .rev file appropriately. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>