git.kernel.org/pub/scm/git/git.git
Age  Commit message  Author
2023-05-13 pack-refs: teach pack-refs --include option (John Cai)
Allow users to be more selective over which refs to pack by adding an --include option to git-pack-refs. The existing options allow some measure of selectivity. By default git-pack-refs packs all tags. --all can be used to include all refs, and the previous commit added the ability to exclude certain refs with --exclude. While these knobs give the user some selection over which refs to pack, it could be useful to give more control. For instance, a repository may have a set of branches that are rarely updated and would benefit from being packed. --include would allow the user to easily include a set of branches to be packed while leaving everything else unpacked. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
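The selection rules described for --all, --include and --exclude can be sketched roughly as follows. This is only an illustrative model in Python, not the pack-refs implementation: the function name `should_pack` is hypothetical, the precedence of --exclude over --include is an assumption, the "tags only" default is replaced by the --include patterns on the assumption that "leaving everything else unpacked" applies to tags too, and `fnmatch` only approximates Git's own glob matching (for instance, `*` here also matches `/`).

```python
from fnmatch import fnmatchcase


def should_pack(ref, include=(), exclude=(), pack_all=False):
    """Hypothetical model of which refs pack-refs would select."""
    # Assumption: an excluded ref is never packed, whatever else is given.
    if any(fnmatchcase(ref, pattern) for pattern in exclude):
        return False
    # --all selects every remaining ref.
    if pack_all:
        return True
    # Assumption: without --include, only tags are packed (the default
    # described in the commit message); with --include, only the listed
    # patterns are packed.
    patterns = include or ("refs/tags/*",)
    return any(fnmatchcase(ref, pattern) for pattern in patterns)


if __name__ == "__main__":
    refs = ["refs/tags/v1.0", "refs/heads/main", "refs/heads/release/1"]
    for ref in refs:
        print(ref, should_pack(ref, include=["refs/heads/release/*"]))
```

Treating exclusion as the first check keeps the ephemeral-ref use case simple: a short-lived ref that matches an --exclude pattern stays loose regardless of the other options.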
2023-05-13 pack-refs: teach --exclude option to exclude refs from being packed (John Cai)
At GitLab, we have a system that creates ephemeral internal refs that don't live long before getting deleted. Having an option to exclude certain refs from a packed-refs file allows these internal references to be deleted much more efficiently. Add an --exclude option to the pack-refs builtin, and use the ref exclusions API to exclude certain refs from being packed into the final packed-refs file. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 merge-tree: load default git config (Derrick Stolee)
The 'git merge-tree' command handles creating root trees for merges without using the worktree. This is a critical operation in many Git hosts, as they typically store bare repositories. This builtin does not load the default Git config, which can have several important ramifications. In particular, one config that is loaded by default is core.useReplaceRefs. This is typically disabled in Git hosts due to the ability to spoof commits in strange ways. Since this config is not loaded specifically during merge-tree, users were previously able to use refs/replace/ references to make pull requests that looked valid but introduced malicious content. The resulting merge commit would have the correct commit history, but the malicious content would exist in the root tree of the merge. The fix is simple: load the default Git config in cmd_merge_tree(). This may also fix other behaviors that are affected by reading default config. The only possible downside is a little extra computation time spent reading config. The config parsing is placed after basic argument parsing so it does not slow down usage errors. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 fetch: introduce machine-parseable "porcelain" output format (Patrick Steinhardt)
The output of git-fetch(1) is obviously designed for consumption by users only: we neatly columnize data, we abbreviate reference names, we print neat arrows and we don't provide information about actual object IDs that have changed. This makes the output format basically unusable in the context of scripted invocations of git-fetch(1) that want to learn about the exact changes that the command performs.
Introduce a new machine-parseable "porcelain" output format that is supposed to fix this shortcoming. This output format is intended to provide information about every reference that is about to be updated, the old object ID that the reference has been pointing to and the new object ID it will be updated to. Furthermore, the output format provides the same flags as the human-readable format to indicate basic conditions for each reference update like whether it was a fast-forward update, a branch deletion, a rejected update or others.
The output format is quite simple:

```
<flag> <old-object-id> <new-object-id> <local-reference>\n
```

We assume two conditions which are generally true:
- The old and new object IDs have fixed known widths and cannot contain spaces.
- References cannot contain newlines.
With these assumptions, the output format becomes unambiguously parseable. Furthermore, given that this output is designed to be consumed by scripts, the machine-readable data is printed to stdout instead of stderr like the human-readable output is. This is mostly done so that other data printed to stderr, like error messages or progress meters, doesn't interfere with the parseable data.
A notable omission here is that the output format does not include the remote from which a reference was fetched, which might be important information especially in the context of multi-remote fetches. But as such a format would require us to print the remote for every single reference update due to parallelizable fetches, it feels wasteful for the most likely use case, which is when fetching from a single remote.
In a similar spirit, a second restriction is that this cannot be used with `--recurse-submodules`. This is because any reference updates would be ambiguous without also printing the repository in which the update happens.
Considering that both multi-remote and submodule fetches are user-facing features, using them in conjunction with `--porcelain`, which is intended for scripting purposes, is likely not going to be useful in the majority of cases. With that in mind these restrictions feel acceptable. If use cases for either of these come up in the future though, it is easy enough to add a new "porcelain-v2" format that adds this information.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
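A script consuming this format might parse each line roughly as sketched below. This is a minimal Python sketch under stated assumptions: the flag is taken to be exactly one character (possibly a space, as in the human-readable output), the names `RefUpdate` and `parse_porcelain_line` are hypothetical, and the object IDs in the usage lines are made up for illustration.

```python
from typing import NamedTuple


class RefUpdate(NamedTuple):
    flag: str
    old_oid: str
    new_oid: str
    ref: str


def parse_porcelain_line(line: str) -> RefUpdate:
    """Parse one '<flag> <old-object-id> <new-object-id> <local-reference>' line."""
    line = line.rstrip("\n")
    # Assumption: the flag is a single character followed by one space.
    # Reading it by position keeps a flag that is itself a space intact.
    if len(line) < 2 or line[1] != " ":
        raise ValueError(f"malformed porcelain line: {line!r}")
    flag = line[0]
    # Object IDs cannot contain spaces, so two splits are enough; whatever
    # remains is the reference name.
    parts = line[2:].split(" ", 2)
    if len(parts) != 3:
        raise ValueError(f"malformed porcelain line: {line!r}")
    old_oid, new_oid, ref = parts
    return RefUpdate(flag, old_oid, new_oid, ref)


if __name__ == "__main__":
    # Made-up object IDs, for illustration only.
    sample = "* " + "0" * 40 + " " + "1" * 40 + " refs/heads/main\n"
    print(parse_porcelain_line(sample))
```

Splitting at most twice after the flag relies only on the first stated assumption (object IDs contain no spaces), so the reference name is taken verbatim from the rest of the line.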
2023-05-10 fetch: move option related variables into main function (Patrick Steinhardt)
The options of git-fetch(1) which we pass to `parse_options()` are declared globally in `builtin/fetch.c`. This means we're forced to use global variables for all the options, which is more likely to cause confusion than explicitly passing state around. Refactor the code to move the options into `cmd_fetch()`. Move variables that were previously forced to be declared globally and which are only used by `cmd_fetch()` into function-local scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 fetch: lift up parsing of "fetch.output" config variable (Patrick Steinhardt)
Parsing the display format happens inside of `display_state_init()`. As we only need to check for a simple config entry, this is a natural location to put this code as it means that display-state logic is neatly contained in a single location. We're about to introduce a new "porcelain" output format though that is intended to be parseable by machines, for example inside of a script. This format can be enabled by passing the `--porcelain` switch to git-fetch(1). As a consequence, we'll have to add a second callsite that influences the output format, which will become awkward to handle. Refactor the code such that callers are expected to pass the display format that is to be used into `display_state_init()`. This allows us to lift up the code into the main function, where we can then hook it into the command line options parser in a follow-up commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 fetch: introduce `display_format` enum (Patrick Steinhardt)
We currently have two different display formats in git-fetch(1) with the "full" and "compact" formats. This is tracked with a boolean value that simply denotes whether the display format is supposed to be compacted or not. This works reasonably well while there are only two formats, but we're about to introduce another format that will make this a bit more awkward to use. Introduce an `enum display_format` that is more readily extensible. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 fetch: refactor calculation of the display table width (Patrick Steinhardt)
When displaying reference updates, we try to print the references in a neat table. As the table's width is determined by its contents, we need to precalculate the overall width before we can start printing updated references. The calculation is driven by `display_state_init()`, which invokes `refcol_width()` for every reference that is to be printed. This split is somewhat confusing. For one, we filter references that shall be attributed to the overall width in both places. And second, we needlessly recalculate the maximum line length based on the terminal columns and display format for every reference. Refactor the code so that the complete width calculations are neatly contained in `refcol_width()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 fetch: print left-hand side when fetching HEAD:foo (Patrick Steinhardt)
`store_updated_refs()` parses the remote reference for two purposes:
- It gets used as a note when writing FETCH_HEAD.
- It is passed through to `display_ref_update()` to display updated references in the following format:

```
 * branch               master     -> master
```

In most cases, the parsed remote reference is the prettified reference name and can thus be used for both cases. But if the remote reference is HEAD, the parsed remote reference becomes empty. This is intended when we write the FETCH_HEAD, where we skip writing the note in that case. But when displaying the updated references this leads to inconsistent output where the left-hand side of reference updates is missing in some cases:

```
$ git fetch origin HEAD HEAD:explicit-head :implicit-head main
From https://github.com/git/git
 * branch               HEAD       -> FETCH_HEAD
 * [new ref]                       -> explicit-head
 * [new ref]                       -> implicit-head
 * branch               main       -> FETCH_HEAD
```

This behaviour has existed ever since the table-based output was introduced for git-fetch(1) via 165f390250 (git-fetch: more terse fetch output, 2007-11-03) and was never explicitly documented either in the commit message or in any of our tests. So while it may not be a bug per se, it feels like a weird inconsistency and not like it was a conscious design decision.
The logic of how we compute the remote reference name that we ultimately pass to `display_ref_update()` is not easy to follow. There are three different cases here:
- When the remote reference name is "HEAD" we set the remote reference name to the empty string. This is the case that causes the left-hand side to go missing, where we would indeed want to print "HEAD" instead of the empty string. This is what `prettify_refname()` would return.
- When the remote reference name has a well-known prefix then we strip this prefix. This matches what `prettify_refname()` does.
- Otherwise, we keep the fully qualified reference name. This also matches what `prettify_refname()` does.
As the return value of `prettify_refname()` would do the correct thing for us in all three cases, we can thus fix the inconsistency by passing through the full remote reference name to `display_ref_update()`, which learns to call `prettify_refname()`. At the same time, this also simplifies the code a bit. Note that this patch also changes formatting of the block that computes the "kind" (which is the category like "branch" or "tag") and "what" (which is the prettified reference name like "master" or "v1.0") variables. This is done on purpose so that it is part of the diff, hopefully making the change easier to comprehend. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 fetch: fix `--no-recurse-submodules` with multi-remote fetches (Patrick Steinhardt)
When running `git fetch --no-recurse-submodules`, the expectation is that we don't fetch any submodules. And while this works for fetches of a single remote, it doesn't when fetching multiple remotes at once. The result is that we do recurse into submodules even though the user has explicitly asked us not to. This is because while we pass on `--recurse-submodules={yes,on-demand}` if specified by the user, we don't pass on `--no-recurse-submodules` to the subprocess spawned to perform the submodule fetch. Fix this by also forwarding this flag as expected. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 Merge branch 'mh/credential-oauth-refresh-token' (Junio C Hamano)
The credential subsystem learns to help OAuth framework. * mh/credential-oauth-refresh-token: credential: new attribute oauth_refresh_token
2023-05-10 Merge branch 'ob/messages-capitalize-exception' (Junio C Hamano)
Message update. * ob/messages-capitalize-exception: messages: capitalization and punctuation exceptions
2023-05-10 Merge branch 'en/header-split-cache-h-part-2' (Junio C Hamano)
More header clean-up. * en/header-split-cache-h-part-2: (22 commits) reftable: ensure git-compat-util.h is the first (indirect) include diff.h: reduce unnecessary includes object-store.h: reduce unnecessary includes commit.h: reduce unnecessary includes fsmonitor: reduce includes of cache.h cache.h: remove unnecessary headers treewide: remove cache.h inclusion due to previous changes cache,tree: move basic name compare functions from read-cache to tree cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c hash-ll.h: split out of hash.h to remove dependency on repository.h tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h dir.h: move DTYPE defines from cache.h versioncmp.h: move declarations for versioncmp.c functions from cache.h ws.h: move declarations for ws.c functions from cache.h match-trees.h: move declarations for match-trees.c functions from cache.h pkt-line.h: move declarations for pkt-line.c functions from cache.h base85.h: move declarations for base85.c functions from cache.h copy.h: move declarations for copy.c functions from cache.h server-info.h: move declarations for server-info.c functions from cache.h packfile.h: move pack_window and pack_entry from cache.h ...
2023-05-10 diff-files: integrate with sparse index (Shuqi Liang)
Remove full index requirement for `git diff-files`. Refactor the ensure_expanded and ensure_not_expanded functions by introducing a common helper function, ensure_index_state. Add test to ensure the index is not expanded in `git diff-files`.
The `p2000` tests demonstrate a ~96% execution time reduction for 'git diff-files' and a ~97% execution time reduction for 'git diff-files' on a single file (f2/f4/a) when using a sparse index:
Test                                               before  after
-----------------------------------------------------------------------
2000.94: git diff-files (full-v3)                  0.09    0.08 -11.1%
2000.95: git diff-files (full-v4)                  0.09    0.09 +0.0%
2000.96: git diff-files (sparse-v3)                0.52    0.02 -96.2%
2000.97: git diff-files (sparse-v4)                0.51    0.02 -96.1%
2000.98: git diff-files -- f2/f4/a (full-v3)       0.06    0.07 +16.7%
2000.99: git diff-files -- f2/f4/a (full-v4)       0.08    0.08 +0.0%
2000.100: git diff-files -- f2/f4/a (sparse-v3)    0.46    0.01 -97.8%
2000.101: git diff-files -- f2/f4/a (sparse-v4)    0.51    0.02 -96.1%
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-07 push: introduce '--branches' option (Teng Long)
The '--all' option of git-push pushes all branches (refs under refs/heads) to the remote. This makes some scenarios easy, for example branch synchronization and batch upload. The '--all' option was introduced a long time ago; meanwhile, Git has come to support custom storage locations under "refs/". When a new Git user sees a usage like 'git push origin --all', they might think it pushes _all_ the refs instead of just branches, until they consult the documentation and find the description of it or of '--mirror'. To ensure compatibility, we cannot simply rename '--all'. One way forward is to add a new option '--branches' that is identical in functionality to '--all', so that the user understands its meaning more clearly. We have more or less named options this way already, for example '--heads' in 'git show-ref' and 'git ls-remote'. At the same time, fix a related issue with the wrong help text of the '--all' option in the code and add some test cases in t5523, t5543 and t5583. Signed-off-by: Teng Long <dyroneteng@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-07 attr: teach "--attr-source=<tree>" global option to "git" (John Cai)
Earlier, 47cfc9bd (attr: add flag `--source` to work with tree-ish, 2023-01-14) taught "git check-attr" the "--source=<tree>" option to allow it to read attribute files from a tree-ish, but did so only for the command. Just like "check-attr" users wanted a way to use attributes from a tree-ish and not from the working tree files, users of other commands (like "git diff") would benefit from the same. Undo most of the UI change the commit made, while keeping the internal logic to read attributes from a given tree-ish. Expose the internal logic via a new "--attr-source=<tree>" command line option given to "git", so that it can be used with any git command that runs as part of the main git process. Additionally, add an environment variable GIT_ATTR_SOURCE that is set when --attr-source is passed in, so that subprocesses use the same value for the attributes source tree. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-07 name-rev: make --stdin hidden (John Cai)
In 34ae3b70 (name-rev: deprecate --stdin in favor of --annotate-stdin), we renamed --stdin to --annotate-stdin for the sake of a clearer name for the option, and added text that indicates --stdin is deprecated. The next step is to hide --stdin completely. Make the option hidden. Also, update documentation to remove all mentions of --stdin. Signed-off-by: "John Cai" <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-02 Merge branch 'tb/ban-strtok' (Junio C Hamano)
Mark strtok() and strtok_r() to be banned. * tb/ban-strtok: banned.h: mark `strtok()` and `strtok_r()` as banned t/helper/test-json-writer.c: avoid using `strtok()` t/helper/test-oidmap.c: avoid using `strtok()` t/helper/test-hashmap.c: avoid using `strtok()` string-list: introduce `string_list_setlen()` string-list: multi-delimiter `string_list_split_in_place()`
2023-05-02 fsck: use local repository (Derrick Stolee)
In 0d30feef3c5 (fsck: create scaffolding for rev-index checks, 2023-04-17) and later 5a6072f631d (fsck: validate .rev file header, 2023-04-17), the check_pack_rev_indexes() method was created with a 'struct repository *r' parameter. However, this parameter was unused and instead 'the_repository' was used in its place. Fix this situation with the obvious replacement. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-02 fsck: verify checksums of all .bitmap files (Derrick Stolee)
If a filesystem-level corruption occurs in a .bitmap file, Git can react poorly. This could take the form of a run-time error due to failing to parse an EWAH bitmap or be more subtle such as returning the wrong set of objects to a fetch or clone. A natural first response to either of these kinds of errors is to run 'git fsck' to see if any files are corrupt. This currently ignores all .bitmap files. Add checks to 'git fsck' for all .bitmap files that are currently associated with a multi-pack-index or pack file. Verify their checksums using the hashfile API. We iterate through all multi-pack-indexes and pack-files to be sure to check all .bitmap files, not just the one that would be read by the process. For example, a multi-pack-index bitmap overrules a pack-bitmap. However, if the multi-pack-index is removed, the pack-bitmap may be selected instead. Be thorough to include every file that could become active in such a way. This includes checking files in alternates. There is potential that we could extend this effort to check the structure of the reachability bitmaps themselves, but it is very expensive to do so. At minimum, it's as expensive as generating the bitmaps in the first place, and that's assuming that we don't use the trivial algorithm of verifying each bitmap individually. The trivial algorithm will result in quadratic behavior (number of objects times number of bitmapped commits) while the bitmap building operation constructs a lattice of commits to build bitmaps incrementally and then generate the final bitmaps from a subset of those commits. If we were to extend 'git fsck' to check .bitmap file contents more closely like this, then we would likely want to hide it behind an option that signals the user is more willing to do expensive operations such as this. For testing, set up a repository with a pack-bitmap _and_ a multi-pack-index bitmap. 
This requires some file movement to avoid deleting the pack-bitmap during the repack that creates the multi-pack-index bitmap. We can then verify that 'git fsck' is checking all files, not just the "active" bitmap. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
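The kind of check described here, comparing a file's trailing checksum with a hash of everything before it, can be illustrated with a short Python sketch. It is not the hashfile API: it assumes the file simply ends with a digest of all preceding bytes and that the repository uses SHA-1, and `trailing_checksum_ok` is a hypothetical name.

```python
import hashlib


def trailing_checksum_ok(data: bytes, algo: str = "sha1") -> bool:
    """Return True if the last digest-sized bytes hash the rest of `data`."""
    digest_len = hashlib.new(algo).digest_size
    if len(data) < digest_len:
        # Too short to even contain a trailer.
        return False
    body, trailer = data[:-digest_len], data[-digest_len:]
    return hashlib.new(algo, body).digest() == trailer


if __name__ == "__main__":
    # Fabricated payload standing in for a .bitmap file's contents.
    payload = b"BITM" + bytes(16)
    good = payload + hashlib.sha1(payload).digest()
    corrupt = b"X" + good[1:]
    print(trailing_checksum_ok(good), trailing_checksum_ok(corrupt))
```

A check like this only detects byte-level corruption; as the message notes, validating the structure of the bitmaps themselves would be far more expensive.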
2023-04-29 Merge branch 'tb/enable-cruft-packs-by-default' (Junio C Hamano)
When "gc" needs to retain unreachable objects, packing them into cruft packs (instead of exploding them into loose object files) has been offered as a more efficient option for some time. Now the use of cruft packs has been made the default and no longer considered an experimental feature. * tb/enable-cruft-packs-by-default: repository.h: drop unused `gc_cruft_packs` builtin/gc.c: make `gc.cruftPacks` enabled by default t/t9300-fast-import.sh: prepare for `gc --cruft` by default t/t6500-gc.sh: add additional test cases t/t6500-gc.sh: refactor cruft pack tests t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default t/t5304-prune.sh: prepare for `gc --cruft` by default builtin/gc.c: ignore cruft packs with `--keep-largest-pack` builtin/repack.c: fix incorrect reference to '-C' pack-write.c: plug a leak in stage_tmp_packfiles()
2023-04-28 messages: capitalization and punctuation exceptions (Oswald Buddenhagen)
These are conscious violations of the usual rules for error messages, based on this reasoning:
- If an error message is directly followed by another sentence, it needs to be properly terminated with a period, lest the grammar looks broken and becomes hard to read.
- That second sentence isn't actually an error message any more, so it should abide by conventional language rules for good looks and legibility. Arguably, these should be converted to advice messages (which the user can squelch, too), but that's a much bigger effort to get right.
- Neither of these applies to the first hunk in do_exec(), but this two-line message looks just too much like a real sentence to not terminate it. Also, leaving it alone would make it asymmetrical to the other hunk.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-28 Merge branch 'ds/fsck-pack-revindex' (Junio C Hamano)
"git fsck" learned to validate the on-disk pack reverse index files. * ds/fsck-pack-revindex: fsck: validate .rev file header fsck: check rev-index position values fsck: check rev-index checksums fsck: create scaffolding for rev-index checks
2023-04-28 Merge branch 'tb/pack-revindex-on-disk' (Junio C Hamano)
The on-disk reverse index that allows mapping from the pack offset to the object name for the object stored at the offset has been enabled by default. * tb/pack-revindex-on-disk: t: invert `GIT_TEST_WRITE_REV_INDEX` config: enable `pack.writeReverseIndex` by default pack-revindex: introduce `pack.readReverseIndex` pack-revindex: introduce GIT_TEST_REV_INDEX_DIE_ON_DISK pack-revindex: make `load_pack_revindex` take a repository t5325: mark as leak-free pack-write.c: plug a leak in stage_tmp_packfiles()
2023-04-25 Merge branch 'ps/fix-geom-repack-with-alternates' (Junio C Hamano)
Geometric repacking ("git repack --geometric=<n>") in a repository that borrows from an alternate object database had various corner case bugs, which have been corrected. * ps/fix-geom-repack-with-alternates: repack: disable writing bitmaps when doing a local repack repack: honor `-l` when calculating pack geometry t/helper: allow chmtime to print verbosely without modifying mtime pack-objects: extend test coverage of `--stdin-packs` with alternates pack-objects: fix error when same packfile is included and excluded pack-objects: fix error when packing same pack twice pack-objects: split out `--stdin-packs` tests into separate file repack: fix generating multi-pack-index with only non-local packs repack: fix trying to use preferred pack in alternates midx: fix segfault with no packs and invalid preferred pack
2023-04-25 Merge branch 'jk/protocol-cap-parse-fix' (Junio C Hamano)
The code to parse capability list for v0 on-wire protocol fell into an infinite loop when a capability appears multiple times, which has been corrected. * jk/protocol-cap-parse-fix: v0 protocol: use size_t for capability length/offset t5512: test "ls-remote --heads --symref" filtering with v0 and v2 t5512: allow any protocol version for filtered symref test t5512: add v2 support for "ls-remote --symref" test v0 protocol: fix sha1/sha256 confusion for capabilities^{} t5512: stop referring to "v1" protocol v0 protocol: fix infinite loop when parsing multi-valued capabilities
2023-04-25 Merge branch 'en/header-split-cache-h' (Junio C Hamano)
Header clean-up. * en/header-split-cache-h: (24 commits) protocol.h: move definition of DEFAULT_GIT_PORT from cache.h mailmap, quote: move declarations of global vars to correct unit treewide: reduce includes of cache.h in other headers treewide: remove double forward declaration of read_in_full cache.h: remove unnecessary includes treewide: remove cache.h inclusion due to pager.h changes pager.h: move declarations for pager.c functions from cache.h treewide: remove cache.h inclusion due to editor.h changes editor: move editor-related functions and declarations into common file treewide: remove cache.h inclusion due to object.h changes object.h: move some inline functions and defines from cache.h treewide: remove cache.h inclusion due to object-file.h changes object-file.h: move declarations for object-file.c functions from cache.h treewide: remove cache.h inclusion due to git-zlib changes git-zlib: move declarations for git-zlib functions from cache.h treewide: remove cache.h inclusion due to object-name.h changes object-name.h: move declarations for object-name.c functions from cache.h treewide: remove unnecessary cache.h inclusion treewide: be explicit about dependence on mem-pool.h treewide: be explicit about dependence on oid-array.h ...
2023-04-25 string-list: multi-delimiter `string_list_split_in_place()` (Taylor Blau)
Enhance `string_list_split_in_place()` to accept multiple characters as delimiters instead of a single character.
Instead of using `strchr(2)` to locate the first occurrence of the given delimiter character, `string_list_split_in_place_multi()` uses `strcspn(2)` to move past the initial segment of characters comprised of any characters in the delimiting set.
When only a single delimiting character is provided, `strpbrk(2)` (which is implemented with `strcspn(2)`) has equivalent performance to `strchr(2)`. Modern `strcspn(2)` implementations treat an empty delimiter or the singleton delimiter as a special case and fall back to calling strchrnul(). Both glibc[1] and musl[2] implement `strcspn(2)` this way.
This change is one step to removing `strtok(2)` from the tree. Note that `string_list_split_in_place()` is not a strict replacement for `strtok()`, since it will happily turn sequential delimiter characters into empty entries in the resulting string_list. For example:
    string_list_split_in_place(&xs, "foo:;:bar:;:baz", ":;", -1)
would yield a string list of:
    ["foo", "", "", "bar", "", "", "baz"]
Callers that wish to emulate the behavior of strtok(2) more directly should call `string_list_remove_empty_items()` after splitting.
To avoid regressions for the new multi-character delimiter cases, update t0063 in this patch as well.
[1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strcspn.c;hb=glibc-2.37#l35
[2]: https://git.musl-libc.org/cgit/musl/tree/src/string/strcspn.c?h=v1.2.3#n11
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
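The splitting semantics described in this message can be modelled in a few lines of Python. This is an illustrative emulation only, not Git's C code: `split_on_any` and `remove_empty_items` are hypothetical stand-ins for `string_list_split_in_place()` and `string_list_remove_empty_items()`, and the treatment of the last argument as a maximum number of splits (with a negative value meaning "no limit") is an assumption.

```python
def split_on_any(text: str, delims: str, maxsplit: int = -1) -> list:
    """Split `text` at every character found in `delims`, keeping empty items."""
    items = []
    start = 0
    splits = 0
    for i, ch in enumerate(text):
        # Assumption: a negative maxsplit means "no limit".
        if ch in delims and (maxsplit < 0 or splits < maxsplit):
            items.append(text[start:i])
            start = i + 1
            splits += 1
    # Whatever follows the last delimiter is the final item.
    items.append(text[start:])
    return items


def remove_empty_items(items: list) -> list:
    """Drop empty strings, approximating a strtok()-style result."""
    return [item for item in items if item]


if __name__ == "__main__":
    parts = split_on_any("foo:;:bar:;:baz", ":;")
    print(parts)                      # ['foo', '', '', 'bar', '', '', 'baz']
    print(remove_empty_items(parts))  # ['foo', 'bar', 'baz']
```

Each delimiter character ends one item on its own, which is why three consecutive delimiters produce two empty entries between "foo" and "bar".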
2023-04-24 commit.h: reduce unnecessary includes (Elijah Newren)
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 hash-ll.h: split out of hash.h to remove dependency on repository.h (Elijah Newren)
hash.h depends upon and includes repository.h, due to the definition and use of the_hash_algo (defined as the_repository->hash_algo). However, most headers trying to include hash.h are only interested in the layout of the structs like object_id. Move the parts of hash.h that do not depend upon repository.h into a new file hash-ll.h (the "low level" parts of hash.h), and adjust other files to use this new header where the convenience inline functions aren't needed. This allows hash.h and object.h to be fairly small, minimal headers. It also exposes a lot of hidden dependencies on both path.h (which was brought in by repository.h) and repository.h (which was previously implicitly brought in by object.h), so also adjust other files to be more explicit about what they depend upon. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 pkt-line.h: move declarations for pkt-line.c functions from cache.h (Elijah Newren)
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 copy.h: move declarations for copy.c functions from cache.h (Elijah Newren)
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 server-info.h: move declarations for server-info.c functions from cache.h (Elijah Newren)
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 symlinks.h: move declarations for symlinks.c functions from cache.h (Elijah Newren)
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-22 Merge branch 'ow/ref-filter-omit-empty' (Junio C Hamano)
"git branch --format=..." and "git for-each-ref --format=..." learn "--omit-empty" to hide refs whose formatting result becomes an empty string from the output. * ow/ref-filter-omit-empty: branch, for-each-ref, tag: add option to omit empty lines
2023-04-22 Merge branch 'rn/sparse-describe' (Junio C Hamano)
"git describe --dirty" learns to work better with sparse-index. * rn/sparse-describe: describe: enable sparse index for describe
2023-04-21 credential: new attribute oauth_refresh_token (M Hickford)
Git authentication with an OAuth access token is supported by every popular Git host including GitHub, GitLab and BitBucket [1][2][3]. Credential helpers Git Credential Manager (GCM) and git-credential-oauth generate OAuth credentials [4][5]. Following RFC 6749, the application prints a link for the user to authorize access in the browser. A loopback redirect communicates the response, including the access token, to the application.

For security, RFC 6749 recommends that the OAuth response also include an expiry date and a refresh token [6]. After expiry, applications can use the refresh token to generate a new access token without user reauthorization in the browser. GitLab and BitBucket set the expiry at two hours [2][3]. (GitHub doesn't populate expiry or refresh token.)

However, the Git credential protocol has no attribute to store the OAuth refresh token (unrecognised attributes are silently discarded). This means that the user has to regularly reauthorize the helper in the browser. On a browserless system, this is particularly intrusive, requiring a second device.

Introduce a new attribute oauth_refresh_token. This is especially useful when a storage helper and a read-only OAuth helper are configured together. Recall that `credential fill` calls each helper until it has a non-expired password.

```
[credential]
	helper = storage # e.g. cache or osxkeychain
	helper = oauth
```

The OAuth helper can use the stored refresh token forwarded by `credential fill` to generate a fresh access token without opening the browser. See https://github.com/hickford/git-credential-oauth/pull/3/files for an implementation tested with this patch.

Add support for the new attribute to credential-cache. Eventually, I hope to see support in other popular storage helpers.

Alternatives considered: ask helpers to store all unrecognised attributes. This seems excessively complex for no obvious gain. Helpers would also need extra information to distinguish between confidential and non-confidential attributes.

Workarounds: GCM abuses the helper get/store/erase contract to store the refresh token during credential *get* as the password for a fictitious host [7] (I wrote this hack). This workaround is only feasible for a monolithic helper with its own storage.

[1] https://github.blog/2012-09-21-easier-builds-and-deployments-using-git-over-https-and-oauth/
[2] https://docs.gitlab.com/ee/api/oauth2.html#access-git-over-https-with-access-token
[3] https://support.atlassian.com/bitbucket-cloud/docs/use-oauth-on-bitbucket-cloud/#Cloning-a-repository-with-an-access-token
[4] https://github.com/GitCredentialManager/git-credential-manager
[5] https://github.com/hickford/git-credential-oauth
[6] https://datatracker.ietf.org/doc/html/rfc6749#section-5.1
[7] https://github.com/GitCredentialManager/git-credential-manager/blob/66b94e489ad8cc1982836355493e369770b30211/src/shared/GitLab/GitLabHostProvider.cs#L207

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
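The flow described in this commit, where a read-only OAuth helper receives a stored refresh token and answers with a fresh access token, can be sketched roughly as follows. This is an illustration under stated assumptions, not the code of git-credential-oauth: `exchange_refresh_token` is a hypothetical callback standing in for the request to the host's token endpoint, and the use of `password_expiry_utc` to report the expiry is assumed from Git's credential documentation rather than from this commit message.

```python
import time


def parse_attributes(text):
    """Parse 'key=value' lines as sent to a helper; stop at the first blank line."""
    attrs = {}
    for line in text.splitlines():
        if not line:
            break
        key, _, value = line.partition("=")
        attrs[key] = value
    return attrs


def format_attributes(attrs):
    """Serialize attributes back into 'key=value' lines."""
    return "".join(f"{key}={value}\n" for key, value in attrs.items())


def handle_get(attrs, exchange_refresh_token, now=None):
    """Answer a 'get' request using a forwarded refresh token, if any.

    exchange_refresh_token is a hypothetical callback that returns
    (access_token, expires_in_seconds, new_refresh_token).
    """
    now = int(time.time()) if now is None else now
    refresh_token = attrs.get("oauth_refresh_token")
    if not refresh_token:
        # Nothing stored: a real helper would fall back to the browser flow.
        return {}
    access_token, expires_in, new_refresh_token = exchange_refresh_token(refresh_token)
    return {
        "password": access_token,
        # Assumed attribute name for the expiry timestamp.
        "password_expiry_utc": str(now + expires_in),
        "oauth_refresh_token": new_refresh_token,
    }
```

A storage helper that understands the new attribute would then save `oauth_refresh_token` alongside the password, so that the next `credential fill` can forward it again.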
2023-04-21Merge branch 'gc/better-error-when-local-clone-fails-with-symlink'Junio C Hamano
"git clone --local" stops copying from an original repository that has symbolic links inside its $GIT_DIR; an error message when that happens has been updated. * gc/better-error-when-local-clone-fails-with-symlink: clone: error specifically with --local and symlinked objects
2023-04-21Merge branch 'rs/get-tar-commit-id-use-defined-const'Junio C Hamano
Code clean-up to replace a hardcoded constant with a CPP macro. * rs/get-tar-commit-id-use-defined-const: get-tar-commit-id: use TYPEFLAG_GLOBAL_HEADER instead of magic value
2023-04-19builtin/gc.c: make `gc.cruftPacks` enabled by defaultTaylor Blau
Back in 5b92477f89 (builtin/gc.c: conditionally avoid pruning objects via loose, 2022-05-20), `git gc` learned the `--cruft` option and `gc.cruftPacks` configuration to opt in to writing cruft packs when collecting or pruning unreachable objects.

Cruft packs were introduced with the merge in a50036da1a (Merge branch 'tb/cruft-packs', 2022-06-03). They address the problem of "loose object explosions", where Git will write out many individual loose objects when there is a large number of unreachable objects that have not yet aged past `--prune=<date>`.

Instead of keeping track of those unreachable yet recent objects via their loose object file's mtime, cruft packs collect all unreachable objects into a single pack with a corresponding `*.mtimes` file that acts as a table to store the mtimes of all unreachable objects. This removes the need to store unreachable objects as loose as they age out of the repository, and avoids the problem of loose object explosions.

Beyond avoiding loose object explosions, cruft packs also act as a more efficient mechanism to store unreachable objects as they age out of a repository. This is because pairs of similar unreachable objects serve as delta bases for one another.

In 5b92477f89, the feature was introduced as experimental. Since then, GitHub has been running these patches in every repository, generating hundreds of millions of cruft packs along the way. The feature is battle-tested, and avoids many pathological cases such as the one above. Users who run `git gc` either manually or via `git maintenance` can benefit from having cruft packs.

As such, enable cruft pack generation to take place by default (by making `gc.cruftPacks` have the default of "true" rather than "false").

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
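The idea of replacing per-file mtimes with a single table can be illustrated with a small model. This is only a conceptual sketch, not Git's implementation: the function name, the dictionary representation of the `*.mtimes` table, and the choice to prune objects strictly older than the cutoff are all assumptions made for illustration.

```python
def split_unreachable(mtimes, prune_before):
    """Model one gc pass over unreachable objects.

    mtimes maps an object ID to its last-modified time in seconds.
    Objects older than prune_before are pruned; the rest stay in one
    cruft pack, with their mtimes recorded in a single table.
    """
    cruft_table = {oid: mtime for oid, mtime in mtimes.items() if mtime >= prune_before}
    pruned = sorted(oid for oid, mtime in mtimes.items() if mtime < prune_before)
    return cruft_table, pruned
```

However many recent unreachable objects there are, the model keeps them in one pack plus one table, rather than in one loose file per object.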
2023-04-19builtin/gc.c: ignore cruft packs with `--keep-largest-pack`Taylor Blau
When cruft packs were implemented, we never adjusted the code for `git gc`'s `--keep-largest-pack` and `gc.bigPackThreshold` to ignore cruft packs. This option and configuration option share a common implementation, but including cruft packs is wrong in both cases:

  - Running `git gc --keep-largest-pack` in a repository where the largest pack is the cruft pack itself will make it impossible for `git gc` to prune objects, since the cruft pack itself is kept.

  - The same is true for `gc.bigPackThreshold`, if the size of the cruft pack exceeds the limit set by the caller.

In the future, it is possible that `gc.bigPackThreshold` could be used to write a separate cruft pack containing any new unreachable objects that entered the repository since the last time a cruft pack was written.

There are some complexities to doing so, mainly around handling pruning objects that are in an existing cruft pack that is above the threshold (which would either need to be rewritten, or else delay pruning). Rewriting a substantially similar cruft pack isn't ideal, but it is significantly better than the status quo.

If users have large cruft packs that they don't want to rewrite, they can mark them as `*.keep` packs. But in general, if a repository has a cruft pack that is so large it is slowing down GCs, it should probably be pruned anyway.

In the meantime, ignore cruft packs in the common implementation for both of these options, and add a pair of tests to prevent any future regressions here.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
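The fix amounts to filtering cruft packs out before either selection rule runs. A minimal sketch of that behaviour, with a hypothetical `Pack` record and function names that do not correspond to Git's C code, and with "at or above the limit" assumed as the threshold comparison:

```python
from dataclasses import dataclass


@dataclass
class Pack:
    name: str
    size: int
    is_cruft: bool = False


def find_largest_pack(packs):
    """Pick the pack to keep for --keep-largest-pack, never a cruft pack."""
    candidates = [pack for pack in packs if not pack.is_cruft]
    return max(candidates, key=lambda pack: pack.size, default=None)


def packs_over_threshold(packs, limit):
    """Pick the packs to keep for gc.bigPackThreshold, never a cruft pack."""
    return [pack for pack in packs if not pack.is_cruft and pack.size >= limit]
```

With the cruft pack excluded from both rules, it is always rewritten during gc, so objects in it that have aged past the expiry can be pruned.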
2023-04-19builtin/repack.c: fix incorrect reference to '-C'Taylor Blau
When cruft packs were originally being developed, `-C` was designated as the short-form for `--cruft` (as in `git repack -C`). This was dropped due to confusion with Git's top-level `-C` option before submitting to the list. But the reference to it in `--cruft-expiration`'s help text was never updated. Fix that dangling reference in this patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18Merge branch 'pw/rebase-cleanup-merge-strategy-option-handling'Junio C Hamano
Clean-up of the code path that deals with merge strategy option handling in "git rebase".

* pw/rebase-cleanup-merge-strategy-option-handling:
  rebase: remove a couple of redundant strategy tests
  rebase -m: fix serialization of strategy options
  rebase -m: cleanup --strategy-option handling
  sequencer: use struct strvec to store merge strategy options
  rebase: stop reading and writing unnecessary strategy state
2023-04-18Merge branch 'cm/branch-delete-error-message-update'Junio C Hamano
"git branch -d origin/master" would say "no such branch", but it is likely a missed "-r" if refs/remotes/origin/master exists. The command has been taught to give such a hint in its error message. * cm/branch-delete-error-message-update: branch: improve error log on branch not found by checking remotes refs
2023-04-18Merge branch 'tk/mergetool-gui-default-config'Junio C Hamano
"git mergetool" and "git difftool" learns a new configuration guiDefault to optionally favor configured guitool over non-gui-tool automatically when $DISPLAY is set. * tk/mergetool-gui-default-config: mergetool: new config guiDefault supports auto-toggling gui by DISPLAY
2023-04-18Merge branch 'sl/sparse-write-tree'Junio C Hamano
"git write-tree" learns to work better with sparse-index. * sl/sparse-write-tree: write-tree: integrate with sparse index
2023-04-18fsck: validate .rev file headerDerrick Stolee
While parsing a .rev file, we check the header information to be sure it makes sense. This happens before doing any additional validation such as a checksum or value check. In order to differentiate between a bad header and a non-existent file, we need to update the API for loading a reverse index. Make load_pack_revindex_from_disk() non-static and specify that a positive value means "the file does not exist" while other errors during parsing are negative values. Since an invalid header prevents setting up the structures we would use for further validations, we can stop at that point. The place where we can distinguish between a missing file and a corrupt file is inside load_revindex_from_disk(), which is used both by pack rev-indexes and multi-pack-index rev-indexes. Some tests in t5326 demonstrate that it is critical for certain conditions to allow the positive (missing file) return value rather than treat it as an error. Add tests that check the three header values. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
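The return-value convention described here (positive for a missing file, negative for a parse error, zero for success) can be sketched as below. The function names are hypothetical and the specific negative codes are arbitrary. The header layout of three big-endian 32-bit fields (an "RIDX" signature, a version of 1, and a hash ID of 1 or 2) is assumed from Git's pack format documentation, not stated in this commit message.

```python
import struct

RIDX_SIGNATURE = 0x52494458  # the bytes "RIDX"
HEADER_SIZE = 12             # three 32-bit big-endian fields


def check_rev_header(data):
    """Return 0 for a plausible header, or a negative value for a bad one."""
    if len(data) < HEADER_SIZE:
        return -1
    signature, version, hash_id = struct.unpack(">III", data[:HEADER_SIZE])
    if signature != RIDX_SIGNATURE:
        return -2
    if version != 1:
        return -3
    if hash_id not in (1, 2):
        return -4
    return 0


def load_rev_header(path):
    """Return 1 if the file does not exist, else the header check result."""
    try:
        with open(path, "rb") as rev_file:
            data = rev_file.read(HEADER_SIZE)
    except FileNotFoundError:
        return 1  # positive: not an error, the .rev file is simply absent
    return check_rev_header(data)
```

A caller such as an fsck-style check can then treat `result > 0` as "nothing to verify" and only report `result < 0` as corruption.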
2023-04-18fsck: create scaffolding for rev-index checksDerrick Stolee
The 'fsck' builtin checks many of Git's on-disk data structures, but does not currently validate the pack rev-index files (a .rev file to pair with a .pack and .idx file). Before doing a more-involved check process, create the scaffolding within builtin/fsck.c to have a new error type and add that error type when the API method verify_pack_revindex() returns an error. That method does nothing currently, but we will add checks to it in later changes. For now, check that 'git fsck' succeeds without any errors in the normal case. Future checks will be paired with tests that corrupt the .rev file appropriately. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-15v0 protocol: use size_t for capability length/offsetJeff King
When parsing server capabilities, we use "int" to store lengths and offsets. At first glance this seems like a spot where our parser may be confused by integer overflow if somebody sent us a malicious response. In practice these strings are all bounded by the 64k limit of a pkt-line, so using "int" is OK. However, it makes the code simpler to audit if they just use size_t everywhere. Note that because we take these parameters as pointers, this also forces many callers to update their declared types. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14repack: disable writing bitmaps when doing a local repackPatrick Steinhardt
In order to write a bitmap, we need to have full coverage of all objects that are about to be packed. In the traditional non-multi-pack-index world this meant we need to do a full repack of all objects into a single packfile. But in the new multi-pack-index world we can get away with writing bitmaps when we have multiple packfiles as long as the multi-pack-index covers all objects.

This is not always the case though. When asked to perform a repack of local objects only, we cannot guarantee to have full coverage of all objects regardless of whether we do a full repack or a repack with a multi-pack-index. The end result is that writing the bitmap will fail in both worlds:

    $ git multi-pack-index write --stdin-packs --bitmap <packfiles
    warning: Failed to write bitmap index. Packfile doesn't have full closure (object 1529341d78cf45377407369acb0f4ff2b5cdae42 is missing)
    error: could not write multi-pack bitmap

Now there are two different ways to fix this. The first one would be to amend git-multi-pack-index(1) to disable writing bitmaps when we notice that we don't have full object coverage. This has two drawbacks:

  - We don't have enough information in git-multi-pack-index(1) in order to tell whether the local repository _should_ have full coverage. Because even when connected to an alternate object directory, it may be the case that we still have all objects around in the main object database.

  - git-multi-pack-index(1) is quite a low-level tool. Automatically disabling functionality that it was asked to provide does not feel like the right thing to do.

We can easily fix it at a higher level in git-repack(1) though. When asked to only include local objects via `-l` and when connected to an alternate object directory then we will override the user's ask and disable writing bitmaps with a warning. This is similar to what we do in git-pack-objects(1), where we also disable writing bitmaps in case we omit an object from the pack.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
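The rule adopted in git-repack(1) reduces to a small decision, sketched here with hypothetical names; it models only the condition stated in the commit message and is not the code in builtin/repack.c.

```python
def decide_bitmaps(bitmaps_requested, local_only, has_alternates):
    """Return (write_bitmaps, warn).

    Bitmaps need full object coverage. A local-only repack (-l) in a
    repository connected to an alternate object directory cannot
    guarantee that, so the request is overridden with a warning.
    """
    if bitmaps_requested and local_only and has_alternates:
        return False, True
    return bitmaps_requested, False
```

Keeping the check in the higher-level command leaves git-multi-pack-index(1) free to do exactly what it is asked, which is the design choice the commit message argues for.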