Welcome to mirror list, hosted at ThFree Co, Russian Federation.

git.kernel.org/pub/scm/git/git.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-11-02tests: teach callers of test_i18ngrep to use test_grepJunio C Hamano
They are equivalents and the former still exists, so as long as the only change this commit makes are to rewrite test_i18ngrep to test_grep, there won't be any new bug, even if there still are callers of test_i18ngrep remaining in the tree, or when merged to other topics that add new uses of test_i18ngrep. This patch was produced more or less with git grep -l -e 'test_i18ngrep ' 't/t[0-9][0-9][0-9][0-9]-*.sh' | xargs perl -p -i -e 's/test_i18ngrep /test_grep /' and a good way to sanity check the result yourself is to run the above in a checkout of c4603c1c (test framework: further deprecate test_i18ngrep, 2023-10-31) and compare the resulting working tree contents with the result of applying this patch to the same commit. You'll see that test_i18ngrep in a few t/lib-*.sh files corrected, in addition to the manual reproduction. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26config: don't BUG when both kvi and source are setGlen Choo
When iterating through config, we read config source metadata from global values - either a "struct config_source + enum config_scope" or a "struct key_value_info", using the current_config* functions. Prior to the series starting from 0c60285147 (config.c: create config_reader and the_reader, 2023-03-28), we weren't very picky about which values we should read in which situation; we did note that both groups of values generally shouldn't be set together, but if both were set, current_config* preferentially reads key_value_info. When that series added more structure, we enforced that either the former (when parsing a config source) can be set, or the latter (when iterating a config set), but *never* both at the same time. See 9828453ff0 (config.c: remove current_config_kvi, 2023-03-28) and 5cdf18e7cd (config.c: remove current_parsing_scope, 2023-03-28). That was a good simplifying constraint that helped us reason about the global state, but it turns out that there is at least one situation where we need both to be set at the same time: in a blobless partial clone where .gitmodules is missing. "git fetch" in such a repo will start a config parse over .gitmodules (setting the config_source), and Git will attempt to lazy-fetch it from the promisor remote. However, when we try to read the promisor configuration, we start iterating a config set (setting the key_value_info), and we BUG() out because that's not allowed any more. Teaching config_reader to gracefully handle this is somewhat complicated, but fortunately, there are proposed changes to the config.c machinery to get rid of this global state, and make the BUG() obsolete [1]. We should rely on that as the eventual solution, and avoid doing yet another refactor in the meantime. Therefore, fix the bug by removing the BUG() check. We're reverting to an older, less safe state, but that's generally okay since key_value_info is always preferentially read, so we'd always read the correct values when we iterate a config set in the middle of a config parse (like we are here). The reverse would be wrong, but extremely unlikely to happen since very few callers parse config without going through a config set. [1] https://lore.kernel.org/git/pull.1497.v3.git.git.1687290231.gitgitgadget@gmail.com Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-26read-tree: add "--super-prefix" option, eliminate globalÆvar Arnfjörð Bjarmason
The "--super-prefix" option to "git" was initially added in [1] for use with "ls-files"[2], and shortly thereafter "submodule--helper"[3] and "grep"[4]. It wasn't until [5] that "read-tree" made use of it. At the time [5] made sense, but since then we've made "ls-files" recurse in-process in [6], "grep" in [7], and finally "submodule--helper" in the preceding commits. Let's also remove it from "read-tree", which allows us to remove the option to "git" itself. We can do this because the only remaining user of it is the submodule API, which will now invoke "read-tree" with its new "--super-prefix" option. It will only do so when the "submodule_move_head()" function is called. That "submodule_move_head()" function was then only invoked by "read-tree" itself, but now rather than setting an environment variable to pass "--super-prefix" between cmd_read_tree() we: - Set a new "super_prefix" in "struct unpack_trees_options". The "super_prefixed()" function in "unpack-trees.c" added in [5] will now use this, rather than get_super_prefix() looking up the environment variable we set earlier in the same process. - Add the same field to the "struct checkout", which is only needed to ferry the "super_prefix" in the "struct unpack_trees_options" all the way down to the "entry.c" callers of "submodule_move_head()". Those calls which used the super prefix all originated in "cmd_read_tree()". The only other caller is the "unlink_entry()" caller in "builtin/checkout.c", which now passes a "NULL". 1. 74866d75793 (git: make super-prefix option, 2016-10-07) 2. e77aa336f11 (ls-files: optionally recurse into submodules, 2016-10-07) 3. 89c86265576 (submodule helper: support super prefix, 2016-12-08) 4. 0281e487fd9 (grep: optionally recurse into submodules, 2016-12-16) 5. 3d415425c7b (unpack-trees: support super-prefix option, 2017-01-17) 6. 188dce131fa (ls-files: use repository object, 2017-06-22) 7. f9ee2fcdfa0 (grep: recurse in-process using 'struct repository', 2017-08-02) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-26read-tree + fetch tests: test failing "--super-prefix" interactionGlen Choo
Ever since "git fetch --refetch" was introduced in 0f5e8851737 (Merge branch 'rc/fetch-refetch', 2022-04-04) the test being added here would fail. This is because "restore" will "read-tree .. --reset <hash>", which will in turn invoke "fetch". The "fetch" will then die with: fatal: fetch doesn't support --super-prefix This edge case and other "--super-prefix" bugs will be fixed in subsequent commits, but let's first add a "test_expect_failure" test for it. It passes until the very last command in the test. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-07Sync with 2.36.3Taylor Blau
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-07Sync with 2.35.5Taylor Blau
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-07Sync with 2.34.5Taylor Blau
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-07Sync with 2.31.5Taylor Blau
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-07Sync with 2.30.6Taylor Blau
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01t/t5NNN: allow local submodulesTaylor Blau
To prepare for the default value of `protocol.file.allow` to change to "user", ensure tests that rely on local submodules can initialize them over the file protocol. Tests that only need to interact with submodules in a limited capacity have individual Git commands annotated with the appropriate configuration via `-c`. Tests that interact with submodules a handful of times use `test_config_global` instead. Test scripts that rely on submodules throughout use a `git config --global` during a setup test towards the beginning of the script. Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-09-13Merge branch 'jk/is-promisor-object-keep-tree-in-use' into maintJunio C Hamano
An earlier optimization discarded a tree-object buffer that is still in use, which has been corrected. * jk/is-promisor-object-keep-tree-in-use: is_promisor_object(): fix use-after-free of tree buffer
2022-08-15is_promisor_object(): fix use-after-free of tree bufferJeff King
Since commit fcc07e980b (is_promisor_object(): free tree buffer after parsing, 2021-04-13), we'll always free the buffers attached to a "struct tree" after searching them for promisor links. But there's an important case where we don't want to do so: if somebody else is already using the tree! This can happen during a "rev-list --missing=allow-promisor" traversal in a partial clone that is missing one or more trees or blobs. The backtrace for the free looks like this: #1 free_tree_buffer tree.c:147 #2 add_promisor_object packfile.c:2250 #3 for_each_object_in_pack packfile.c:2190 #4 for_each_packed_object packfile.c:2215 #5 is_promisor_object packfile.c:2272 #6 finish_object__ma builtin/rev-list.c:245 #7 finish_object builtin/rev-list.c:261 #8 show_object builtin/rev-list.c:274 #9 process_blob list-objects.c:63 #10 process_tree_contents list-objects.c:145 #11 process_tree list-objects.c:201 #12 traverse_trees_and_blobs list-objects.c:344 [...] We're in the middle of walking through the entries of a tree object via process_tree_contents(). We see a blob (or it could even be another tree entry) that we don't have, so we call is_promisor_object() to check it. That function loops over all of the objects in the promisor packfile, including the tree we're currently walking. When we're done with it there, we free the tree buffer. But as we return to the walk in process_tree_contents(), it's still holding on to a pointer to that buffer, via its tree_desc iterator, and it accesses the freed memory. Even a trivial use of "--missing=allow-promisor" triggers this problem, as the included test demonstrates (it's just a vanilla --blob:none clone). We can detect this case by only freeing the tree buffer if it was allocated on our behalf. This is a little tricky since that happens inside parse_object(), and it doesn't tell us whether the object was already parsed, or whether it allocated the buffer itself. But by checking for an already-parsed tree beforehand, we can distinguish the two cases. That feels a little hacky, and does incur an extra lookup in the object-hash table. But that cost is fairly minimal compared to actually loading objects (and since we're iterating the whole pack here, we're likely to be loading most objects, rather than reusing cached results). It may also be a good direction for this function in general, as there are other possible optimizations that rely on doing some analysis before parsing: - we could detect blobs and avoid reading their contents; they can't link to other objects, but parse_object() doesn't know that we don't care about checking their hashes. - we could avoid allocating object structs entirely for most objects (since we really only need them in the oidset), which would save some memory. - promisor commits could use the commit-graph rather than loading the object from disk This commit doesn't do any of those optimizations, but I think it argues that this direction is reasonable, rather than relying on parse_object() and trying to teach it to give us more information about whether it parsed. The included test fails reliably under SANITIZE=address just when running "rev-list --missing=allow-promisor". Checking the output isn't strictly necessary to detect the bug, but it seems like a reasonable addition given the general lack of coverage for "allow-promisor" in the test suite. Reported-by: Andrew Olsen <andrew.olsen@koordinates.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-28fetch: after refetch, encourage auto gc repackingRobert Coup
After invoking `fetch --refetch`, the object db will likely contain many duplicate objects. If auto-maintenance is enabled, invoke it with appropriate settings to encourage repacking/consolidation. * gc.autoPackLimit: unless this is set to 0 (disabled), override the value to 1 to force pack consolidation. * maintenance.incremental-repack.auto: unless this is set to 0, override the value to -1 to force incremental repacking. Signed-off-by: Robert Coup <robert@coup.net.nz> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-28t5615-partial-clone: add test for fetch --refetchRobert Coup
Add a test for doing a refetch to apply a changed partial clone filter under protocol v0 and v2. Signed-off-by: Robert Coup <robert@coup.net.nz> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13t5000-t5999: detect and signal failure within loopEric Sunshine
Failures within `for` and `while` loops can go unnoticed if not detected and signaled manually since the loop itself does not abort when a contained command fails, nor will a failure necessarily be detected when the loop finishes since the loop returns the exit code of the last command it ran on the final iteration, which may not be the command which failed. Therefore, detect and signal failures manually within loops using the idiom `|| return 1` (or `|| exit 1` within subshells). Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13tests: fix broken &&-chains in compound statementsEric Sunshine
The top-level &&-chain checker built into t/test-lib.sh causes tests to magically exit with code 117 if the &&-chain is broken. However, it has the shortcoming that the magic does not work within `{...}` groups, `(...)` subshells, `$(...)` substitutions, or within bodies of compound statements, such as `if`, `for`, `while`, `case`, etc. `chainlint.sed` partly fills in the gap by catching broken &&-chains in `(...)` subshells, but bugs can still lurk behind broken &&-chains in the other cases. Fix broken &&-chains in compound statements in order to reduce the number of possible lurking bugs. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28repack: avoid loosening promisor objects in partial clonesRafael Silva
When `git repack -A -d` is run in a partial clone, `pack-objects` is invoked twice: once to repack all promisor objects, and once to repack all non-promisor objects. The latter `pack-objects` invocation is with --exclude-promisor-objects and --unpack-unreachable, which loosens all objects unused during this invocation. Unfortunately, this includes promisor objects. Because the -d argument to `git repack` subsequently deletes all loose objects also in packs, these just-loosened promisor objects will be immediately deleted. However, this extra disk churn is unnecessary in the first place. For example, in a newly-cloned partial repo that filters all blob objects (e.g. `--filter=blob:none`), `repack` ends up unpacking all trees and commits into the filesystem because every object, in this particular case, is a promisor object. Depending on the repo size, this increases the disk usage considerably: In my copy of the linux.git, the object directory peaked 26GB of more disk usage. In order to avoid this extra disk churn, pass the names of the promisor packfiles as --keep-pack arguments to the second invocation of `pack-objects`. This informs `pack-objects` that the promisor objects are already in a safe packfile and, therefore, do not need to be loosened. For testing, we need to validate whether any object was loosened. However, the "evidence" (loosened objects) is deleted during the process which prevents us from inspecting the object directory. Instead, let's teach `pack-objects` to count loosened objects and emit via trace2 thus allowing inspecting the debug events after the process is finished. This new event is used on the added regression test. Lastly, add a new perf test to evaluate the performance impact made by this changes (tested on git.git): Test HEAD^ HEAD ---------------------------------------------------------- 5600.3: gc 134.38(41.93+90.95) 7.80(6.72+1.35) -94.2% For a bigger repository, such as linux.git, the improvement is even bigger: Test HEAD^ HEAD ------------------------------------------------------------------- 5600.3: gc 6833.00(918.07+3162.74) 268.79(227.02+39.18) -96.1% These improvements are particular big because every object in the newly-cloned partial repository is a promisor object. Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Helped-by: Jeff King <peff@peff.net> Helped-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-26Merge branch 'js/default-branch-name-tests-final-stretch'Junio C Hamano
Prepare tests not to be affected by the name of the default branch "git init" creates. * js/default-branch-name-tests-final-stretch: (28 commits) tests: drop prereq `PREPARE_FOR_MAIN_BRANCH` where no longer needed t99*: adjust the references to the default branch name "main" tests(git-p4): transition to the default branch name `main` t9[5-7]*: adjust the references to the default branch name "main" t9[0-4]*: adjust the references to the default branch name "main" t8*: adjust the references to the default branch name "main" t7[5-9]*: adjust the references to the default branch name "main" t7[0-4]*: adjust the references to the default branch name "main" t6[4-9]*: adjust the references to the default branch name "main" t64*: preemptively adjust alignment to prepare for `master` -> `main` t6[0-3]*: adjust the references to the default branch name "main" t5[6-9]*: adjust the references to the default branch name "main" t55[4-9]*: adjust the references to the default branch name "main" t55[23]*: adjust the references to the default branch name "main" t551*: adjust the references to the default branch name "main" t550*: adjust the references to the default branch name "main" t5503: prepare aligned comment for replacing `master` with `main` t5[0-4]*: adjust the references to the default branch name "main" t5323: prepare centered comment for `master` -> `main` t4*: adjust the references to the default branch name "main" ...
2020-12-18Merge branch 'tb/partial-clone-filters-fix'Junio C Hamano
Fix potential server side resource deallocation issues when responding to a partial clone request. * tb/partial-clone-filters-fix: upload-pack.c: don't free allowed_filters util pointers builtin/clone.c: don't ignore transport_fetch_refs() errors
2020-12-03upload-pack.c: don't free allowed_filters util pointersTaylor Blau
To keep track of which object filters are allowed or not, 'git upload-pack' stores the name of each filter in a string_list, and sets it ->util pointer to be either 0 or 1, indicating whether it is banned or allowed. Later on, we attempt to clear that list, but we incorrectly ask for the util pointers to be free()'d, too. This behavior (introduced back in 6dd3456a8c (upload-pack.c: allow banning certain object filter(s), 2020-08-03)) leads to an invalid free, and causes us to crash. In order to trigger this, one needs to fetch from a server that (a) has at least one object filter allowed, and (b) issue a fetch that contains a subset of the allowed filters (i.e., we cannot ask for a banned filter, since this causes us to die() before we hit the bogus string_list_clear()). In that case, whatever banned filters exist will cause a noop free() (since those ->util pointers are set to 0), but the first allowed filter we try to free will crash us. We never noticed this in the tests because we didn't have an example of setting 'uploadPackFilter' configuration variables and then following up with a valid fetch. The first new 'git clone' prevents further regression here. For good measure on top, add a test which checks the same behavior at a tree depth greater than 0. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-03upload-pack: propagate return value from object filter config callbackJeff King
If we encounter an error in parse_filter_object_config(), we'll complain to stderr but won't actually propagate the return value up the stack. This is unlike most of our config callbacks, which return the error to git_config() so it can die (this includes the call just below us to parse_hide_refs_config(), which can also produce errors). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-20t5[6-9]*: adjust the references to the default branch name "main"Johannes Schindelin
This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t5[6-9]*.sh) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-20tests: mark tests relying on the current default for `init.defaultBranch`Johannes Schindelin
In addition to the manual adjustment to let the `linux-gcc` CI job run the test suite with `master` and then with `main`, this patch makes sure that GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME is set in all test scripts that currently rely on the initial branch name being `master by default. To determine which test scripts to mark up, the first step was to force-set the default branch name to `master` in - all test scripts that contain the keyword `master`, - t4211, which expects `t/t4211/history.export` with a hard-coded ref to initialize the default branch, - t5560 because it sources `t/t556x_common` which uses `master`, - t8002 and t8012 because both source `t/annotate-tests.sh` which also uses `master`) This trick was performed by this command: $ sed -i '/^ *\. \.\/\(test-lib\|lib-\(bash\|cvs\|git-svn\)\|gitweb-lib\)\.sh$/i\ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\ ' $(git grep -l master t/t[0-9]*.sh) \ t/t4211*.sh t/t5560*.sh t/t8002*.sh t/t8012*.sh After that, careful, manual inspection revealed that some of the test scripts containing the needle `master` do not actually rely on a specific default branch name: either they mention `master` only in a comment, or they initialize that branch specificially, or they do not actually refer to the current default branch. Therefore, the aforementioned modification was undone in those test scripts thusly: $ git checkout HEAD -- \ t/t0027-auto-crlf.sh t/t0060-path-utils.sh \ t/t1011-read-tree-sparse-checkout.sh \ t/t1305-config-include.sh t/t1309-early-config.sh \ t/t1402-check-ref-format.sh t/t1450-fsck.sh \ t/t2024-checkout-dwim.sh \ t/t2106-update-index-assume-unchanged.sh \ t/t3040-subprojects-basic.sh t/t3301-notes.sh \ t/t3308-notes-merge.sh t/t3423-rebase-reword.sh \ t/t3436-rebase-more-options.sh \ t/t4015-diff-whitespace.sh t/t4257-am-interactive.sh \ t/t5323-pack-redundant.sh t/t5401-update-hooks.sh \ t/t5511-refspec.sh t/t5526-fetch-submodules.sh \ t/t5529-push-errors.sh t/t5530-upload-pack-error.sh \ t/t5548-push-porcelain.sh \ t/t5552-skipping-fetch-negotiator.sh \ t/t5572-pull-submodule.sh t/t5608-clone-2gb.sh \ t/t5614-clone-submodules-shallow.sh \ t/t7508-status.sh t/t7606-merge-custom.sh \ t/t9302-fast-import-unpack-limit.sh We excluded one set of test scripts in these commands, though: the range of `git p4` tests. The reason? `git p4` stores the (foreign) remote branch in the branch called `p4/master`, which is obviously not the default branch. Manual analysis revealed that only five of these tests actually require a specific default branch name to pass; They were modified thusly: $ sed -i '/^ *\. \.\/lib-git-p4\.sh$/i\ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\ ' t/t980[0167]*.sh t/t9811*.sh Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-03Merge branch 'jt/lazy-fetch'Junio C Hamano
Updates to on-demand fetching code in lazily cloned repositories. * jt/lazy-fetch: fetch: no FETCH_HEAD display if --no-write-fetch-head fetch-pack: remove no_dependents code promisor-remote: lazy-fetch objects in subprocess fetch-pack: do not lazy-fetch during ref iteration fetch: only populate existing_refs if needed fetch: avoid reading submodule config until needed fetch: allow refspecs specified through stdin negotiator/noop: add noop fetch negotiator
2020-08-20fetch-pack: in partial clone, pass --promisorJonathan Tan
When fetching a pack from a promisor remote, the corresponding .promisor file needs to be created. "fetch-pack" originally did this by passing "--promisor" to "index-pack", but in 5374a290aa ("fetch-pack: write fetched refs to .promisor", 2019-10-16), "fetch-pack" was taught to do this itself instead, because it needed to store ref information in the .promisor file. This causes a problem with superprojects when transfer.fsckobjects is set, because in the current implementation, it is "index-pack" that calls fsck_finish() to check the objects; before 5374a290aa, fsck_finish() would see that .gitmodules is a promisor object and tolerate it being missing, but after, there is no .promisor file (at the time of the invocation of fsck_finish() by "index-pack") to tell it that .gitmodules is a promisor object, so it returns an error. Therefore, teach "fetch-pack" to pass "--promisor" to index pack once again. "fetch-pack" will subsequently overwrite this file with the ref information. An alternative is to instead move object checking to "fetch-pack", and let "index-pack" only index the files. However, since "index-pack" has to inflate objects in order to index them, it seems reasonable to also let it check the objects (which also require inflated files). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-18fetch-pack: do not lazy-fetch during ref iterationJonathan Tan
In order to determine negotiation tips, "fetch-pack" iterates over all refs and dereferences all annotated tags found. This causes the existence of targets of refs and annotated tags to be checked. Avoiding this is especially important when we use "git fetch" (which invokes "fetch-pack") to perform lazy fetches in a partial clone because a target of such a ref or annotated tag may need to be itself lazy-fetched (and otherwise causing an infinite loop). Therefore, teach "fetch-pack" not to lazy fetch whenever iterating over refs. This is done by using the raw form of ref iteration and by dereferencing tags ourselves. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-12Merge branch 'tb/upload-pack-filters'Junio C Hamano
The component to respond to "git fetch" request is made more configurable to selectively allow or reject object filtering specification used for partial cloning. * tb/upload-pack-filters: t5616: use test_i18ngrep for upload-pack errors upload-pack.c: introduce 'uploadpackfilter.tree.maxDepth' upload-pack.c: allow banning certain object filter(s) list_objects_filter_options: introduce 'list_object_filter_config_name'
2020-08-05t5616: use test_i18ngrep for upload-pack errorsJeff King
The tests added to t5616 in 6dd3456a8c (upload-pack.c: allow banning certain object filter(s), 2020-08-03) can fail racily, but only with GETTEXT_POISON enabled. The tests in question look something like this: test_must_fail ok=sigpipe git clone --filter=blob:none ... 2>err && grep "filter blob:none not supported' err The remote upload-pack process writes that error message both as an ERR packet, but also via a die() message. In theory we should see the message twice in the "err" file. The client relays the message from the packet to its stderr (with a "remote error:" prefix), and because this is a local-system clone, upload-pack's stderr goes to the same place. But because clone may be writing to the pipe when upload-pack calls die(), it may get SIGPIPE and fail to relay the message. That's why we need our "ok=sigpipe" trick. But our grep should still work reliably in that case. Either: - we got SIGPIPE on the client, which means upload-pack completed its die(), and we'll see that version of the message. - the client didn't get SIGPIPE, and so it successfully relays the message. In theory we'd see both copies of the message in the second case. But now always! As soon as the client sees ERR, it exits and we run grep. But we have no guarantee that the upload-pack process has exited at this point, or even written its die() message. We might only see the client version of the message. Normally that's OK. We only need to see one or the other to pass the test. But now consider GETTEXT_POISON. upload-pack doesn't translate the die() message nor the ERR packet. But once the client receives it, it calls: die(_("remote error: %s"), buffer + 4); That message _is_ marked for translation. Normally we'd just replace the "remote error:" portion of it, but in GETTEXT_POISON mode, we replace the whole thing with "# GETTEXT POISON #" and don't include the "%s" part at all. So the whole text from the ERR packet is dropped, and so we may racily see a test failure if upload-pack's die() call wasn't yet written. We can fix it by using test_i18ngrep, which just makes this grep a noop in the poison mode. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-04upload-pack.c: introduce 'uploadpackfilter.tree.maxDepth'Taylor Blau
In b79cf959b2 (upload-pack.c: allow banning certain object filter(s), 2020-02-26), we introduced functionality to disallow certain object filters from being chosen from within 'git upload-pack'. Traditionally, administrators use this functionality to disallow filters that are known to perform slowly, for e.g., those that do not have bitmap-level filtering. In the past, the '--filter=tree:<n>' was one such filter that does not have bitmap-level filtering support, and so was likely to be banned by administrators. However, in the previous couple of commits, we introduced bitmap-level filtering for the case when 'n' is equal to '0', i.e., as if we had a '--filter=tree:none' choice. While it would be sufficient to simply write $ git config uploadpackfilter.tree.allow true (since it would allow all values of 'n'), we would like to be able to allow this filter for certain values of 'n', i.e., those no greater than some pre-specified maximum. In order to do this, introduce a new configuration key, as follows: $ git config uploadpackfilter.tree.maxDepth <m> where '<m>' specifies the maximum allowed value of 'n' in the filter 'tree:n'. Administrators who wish to allow for only the value '0' can write: $ git config uploadpackfilter.tree.allow true $ git config uploadpackfilter.tree.maxDepth 0 which allows '--filter=tree:0', but no other values. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-04upload-pack.c: allow banning certain object filter(s)Taylor Blau
Git clients may ask the server for a partial set of objects, where the set of objects being requested is refined by one or more object filters. Server administrators can configure 'git upload-pack' to allow or ban these filters by setting the 'uploadpack.allowFilter' variable to 'true' or 'false', respectively. However, administrators using bitmaps may wish to allow certain kinds of object filters, but ban others. Specifically, they may wish to allow object filters that can be optimized by the use of bitmaps, while rejecting other object filters which aren't and represent a perceived performance degradation (as well as an increased load factor on the server). Allow configuring 'git upload-pack' to support object filters on a case-by-case basis by introducing two new configuration variables: - 'uploadpackfilter.allow' - 'uploadpackfilter.<kind>.allow' where '<kind>' may be one of 'blobNone', 'blobLimit', 'tree', and so on. Setting the second configuration variable for any valid value of '<kind>' explicitly allows or disallows restricting that kind of object filter. If a client requests the object filter <kind> and the respective configuration value is not set, 'git upload-pack' will default to the value of 'uploadpackfilter.allow', which itself defaults to 'true' to maintain backwards compatibility. Note that this differs from 'uploadpack.allowfilter', which controls whether or not the 'filter' capability is advertised. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-17upload-pack: do not lazy-fetch "have" objectsJonathan Tan
When upload-pack receives a request containing "have" hashes, it (among other things) checks if the served repository has the corresponding objects. However, it does not do so with the OBJECT_INFO_SKIP_FETCH_OBJECT flag, so if serving a partial clone, a lazy fetch will be triggered first. This was discovered at $DAYJOB when a user fetched from a partial clone (into another partial clone - although this would also happen if the repo to be fetched into is not a partial clone). Therefore, whenever "have" hashes are checked for existence, pass the OBJECT_INFO_SKIP_FETCH_OBJECT flag. Also add the OBJECT_INFO_QUICK flag to improve performance, as it is typical that such objects do not exist in the serving repo, and the consequences of a false negative are minor (usually, a slightly larger pack sent). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-13Merge branch 'cc/upload-pack-v2-fetch-fix'Junio C Hamano
Serving a "git fetch" client over "git://" and "ssh://" protocols using the on-wire protocol version 2 was buggy on the server end when the client needs to make a follow-up request to e.g. auto-follow tags. * cc/upload-pack-v2-fetch-fix: upload-pack: clear filter_options for each v2 fetch command
2020-05-08upload-pack: clear filter_options for each v2 fetch commandChristian Couder
Because of the request/response model of protocol v2, the upload_pack_v2() function is sometimes called twice in the same process, while 'struct list_objects_filter_options filter_options' was declared as static at the beginning of 'upload-pack.c'. This made the check in list_objects_filter_die_if_populated(), which is called by process_args(), fail the second time upload_pack_v2() is called, as filter_options had already been populated the first time. To fix that, filter_options is not static any more. It's now owned directly by upload_pack(). It's now also part of 'struct upload_pack_data', so that it's owned indirectly by upload_pack_v2(). In the long term, the goal is to also have upload_pack() use 'struct upload_pack_data', so adding filter_options to this struct makes more sense than to have it owned directly by upload_pack_v2(). This fixes the first of the 2 bugs documented by d0badf8797 (partial-clone: demonstrate bugs in partial fetch, 2020-02-21). Helped-by: Derrick Stolee <dstolee@microsoft.com> Helped-by: Jeff King <peff@peff.net> Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-29Merge branch 'dd/test-with-busybox'Junio C Hamano
Various tests have been updated to work around issues found with shell utilities that come with busybox etc. * dd/test-with-busybox: t5703: feed raw data into test-tool unpack-sideband t4124: tweak test so that non-compliant diff(1) can also be used t7063: drop non-POSIX argument "-ls" from find(1) t5616: use rev-parse instead to get HEAD's object_id t5003: skip conversion test if unzip -a is unavailable t5003: drop the subshell in test_lazy_prereq test-lib-functions: test_cmp: eval $GIT_TEST_CMP t4061: use POSIX compliant regex(7)
2020-04-22Merge branch 'jk/use-quick-lookup-in-clone-for-tag-following'Junio C Hamano
The logic to auto-follow tags by "git clone --single-branch" was not careful to avoid lazy-fetching unnecessary tags, which has been corrected. * jk/use-quick-lookup-in-clone-for-tag-following: clone: use "quick" lookup while following tags
2020-04-01clone: use "quick" lookup while following tagsJeff King
When cloning with --single-branch, we implement git-fetch's usual tag-following behavior, grabbing any tag objects that point to objects we have locally. When we're a partial clone, though, our has_object_file() check will actually lazy-fetch each tag. That not only defeats the purpose of --single-branch, but it does it incredibly slowly, potentially kicking off a new fetch for each tag. This is even worse for a shallow clone, which implies --single-branch, because even tags which are supersets of each other will be fetched individually. We can fix this by passing OBJECT_INFO_SKIP_FETCH_OBJECT to the call, which is what git-fetch does in this case. Likewise, let's include OBJECT_INFO_QUICK, as that's what git-fetch does. The rationale is discussed in 5827a03545 (fetch: use "quick" has_sha1_file for tag following, 2016-10-13), but here the tradeoff would apply even more so because clone is very unlikely to be racing with another process repacking our newly-created repository. This may provide a very small speedup even in the non-partial case case, as we'd avoid calling reprepare_packed_git() for each tag (though in practice, we'd only have a single packfile, so that reprepare should be quite cheap). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-25t5616: use rev-parse instead to get HEAD's object_idĐoàn Trần Công Danh
Only HEAD's object_id is necessary, rev-list is an overkill. Despite POSIX requires grep(1) treat single pattern with <newline> as multiple patterns. busybox's grep(1) (as of v1.31.1) haven't implemented it yet. Use rev-parse to simplify the test and avoid busybox unimplemented features. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-05Merge branch 'js/ci-windows-update'Junio C Hamano
Updates to the CI settings. * js/ci-windows-update: Azure Pipeline: switch to the latest agent pools ci: prevent `perforce` from being quarantined t/lib-httpd: avoid using macOS' sed
2020-03-03Merge branch 'ds/partial-clone-fixes'Junio C Hamano
Fix for a bug revealed by a recent change to make the protocol v2 the default. * ds/partial-clone-fixes: partial-clone: avoid fetching when looking for objects partial-clone: demonstrate bugs in partial fetch
2020-02-27t/lib-httpd: avoid using macOS' sedJohannes Schindelin
Among other differences relative to GNU sed, macOS' sed always ends its output with a trailing newline, even if the input did not have such a trailing newline. Surprisingly, this makes three httpd-based tests fail on macOS: t5616, t5702 and t5703. ("Surprisingly" because those tests have been around for some time, but apparently nobody runs them on macOS with a working Apache2 setup.) The reason is that we use `sed` in those tests to filter the response of the web server. Apart from the fact that we use GNU constructs (such as using a space after the `c` command instead of a backslash and a newline), we have another problem: macOS' sed LF-only newlines while webservers are supposed to use CR/LF ones. Even worse, t5616 uses `sed` to replace a binary part of the response with a new binary part (kind of hoping that the replaced binary part does not contain a 0x0a byte which would be interpreted as a newline). To that end, it calls on Perl to read the binary pack file and hex-encode it, then calls on `sed` to prefix every hex digit pair with a `\x` in order to construct the text that the `c` statement of the `sed` invocation is supposed to insert. So we call Perl and sed to construct a sed statement. The final nail in the coffin is that macOS' sed does not even interpret those `\x<hex>` constructs. Let's just replace all of that by Perl snippets. With Perl, at least, we do not have to deal with GNU vs macOS semantics, we do not have to worry about unwanted trailing newlines, and we do not have to spawn commands to construct arguments for other commands to be spawned (i.e. we can avoid a whole lot of shell scripting complexity). The upshot is that this fixes t5616, t5702 and t5703 on macOS with Apache2. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-22partial-clone: avoid fetching when looking for objectsDerrick Stolee
When using partial clone, find_non_local_tags() in builtin/fetch.c checks each remote tag to see if its object also exists locally. There is no expectation that the object exist locally, but this function nevertheless triggers a lazy fetch if the object does not exist. This can be extremely expensive when asking for a commit, as we are completely removed from the context of the non-existent object and thus supply no "haves" in the request. 6462d5eb9a (fetch: remove fetch_if_missing=0, 2019-11-05) removed a global variable that prevented these fetches in favor of a bitflag. However, some object existence checks were not updated to use this flag. Update find_non_local_tags() to use OBJECT_INFO_SKIP_FETCH_OBJECT in addition to OBJECT_INFO_QUICK. The _QUICK option only prevents repreparing the pack-file structures. We need to be extremely careful about supplying _SKIP_FETCH_OBJECT when we expect an object to not exist due to updated refs. This resolves a broken test in t5616-partial-clone.sh. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-22partial-clone: demonstrate bugs in partial fetchDerrick Stolee
While testing partial clone, I noticed some odd behavior. I was testing a way of running 'git init', followed by manually configuring the remote for partial clone, and then running 'git fetch'. Astonishingly, I saw the 'git fetch' process start asking the server for multiple rounds of pack-file downloads! When tweaking the situation a little more, I discovered that I could cause the remote to hang up with an error. Add two tests that demonstrate these two issues. In the first test, we find that when fetching with blob filters from a repository that previously did not have any tags, the 'git fetch --tags origin' command fails because the server sends "multiple filter-specs cannot be combined". This only happens when using protocol v2. In the second test, we see that a 'git fetch origin' request with several ref updates results in multiple pack-file downloads. This must be due to Git trying to fault-in the objects pointed by the refs. What makes this matter particularly nasty is that this goes through the do_oid_object_info_extended() method, so there are no "haves" in the negotiation. This leads the remote to send every reachable commit and tree from each new ref, providing a quadratic amount of data transfer! This test is fixed if we revert 6462d5eb9a (fetch: remove fetch_if_missing=0, 2019-11-05), but that revert causes other test failures. The real fix will need more care. The tests are ordered in this way because if I swap the test order the tag test will succeed instead of fail. I believe this is because somehow we need the srv.bare repo to not have any tags when we clone, but then have tags in our next fetch. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-27t5616: make robust to delta base changeJonathan Tan
Commit 6462d5eb9a ("fetch: remove fetch_if_missing=0", 2019-11-08) contains a test that relies on having to lazily fetch the delta base of a blob, but assumes that the tree being fetched (as part of the test) is sent as a non-delta object. This assumption may not hold in the future; for example, a change in the length of the object hash might result in the tree being sent as a delta instead. Make the test more robust by relying on having to lazily fetch the delta base of the tree instead, and by making no assumptions on whether the blobs are sent as delta or non-delta. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-01Merge branch 'jt/fetch-remove-lazy-fetch-plugging'Junio C Hamano
"git fetch" codepath had a big "do not lazily fetch missing objects when I ask if something exists" switch. This has been corrected by marking the "does this thing exist?" calls with "if not please do not lazily fetch it" flag. * jt/fetch-remove-lazy-fetch-plugging: promisor-remote: remove fetch_if_missing=0 clone: remove fetch_if_missing=0 fetch: remove fetch_if_missing=0
2019-11-08fetch: remove fetch_if_missing=0Jonathan Tan
In fetch_pack() (and all functions it calls), pass OBJECT_INFO_SKIP_FETCH_OBJECT whenever we query an object that could be a tree or blob that we do not want to be lazy-fetched even if it is absent. Thus, the only lazy-fetches occurring for trees and blobs are when resolving deltas. Thus, we can remove fetch_if_missing=0 from builtin/fetch.c. Remove this, and also add a test ensuring that such objects are not lazy-fetched. (We might be able to remove fetch_if_missing=0 from other places too, but I have limited myself to builtin/fetch.c in this commit because I have not written tests for the other commands yet.) Note that commits and tags may still be lazy-fetched. I limited myself to objects that could be trees or blobs here because Git does not support creating such commit- and tag-excluding clones yet, and even if such a clone were manually created, Git does not have good support for fetching a single commit (when fetching a commit, it and all its ancestors would be sent). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-16fetch-pack: write fetched refs to .promisorJonathan Tan
The specification of promisor packfiles (in partial-clone.txt) states that the .promisor files that accompany packfiles do not matter (just like .keep files), so whenever a packfile is fetched from the promisor remote, Git has been writing empty .promisor files. But these files could contain more useful information. So instead of writing empty files, write the refs fetched to these files. This makes it easier to debug issues with partial clones, as we can identify what refs (and their associated hashes) were fetched at the time the packfile was downloaded, and if necessary, compare those hashes against what the promisor remote reports now. This is implemented by teaching fetch-pack to write its own non-empty .promisor file whenever it knows the name of the pack's lockfile. This covers the case wherein the user runs "git fetch" with an internal protocol or HTTP protocol v2 (fetch_refs_via_pack() in transport.c sets lock_pack) and with HTTP protocol v0/v1 (fetch_git() in remote-curl.c passes "--lock-pack" to "fetch-pack"). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Acked-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07Merge branch 'jk/partial-clone-sparse-blob'Junio C Hamano
The name of the blob object that stores the filter specification for sparse cloning/fetching was interpreted in a wrong place in the code, causing Git to abort. * jk/partial-clone-sparse-blob: list-objects-filter: use empty string instead of NULL for sparse "base" list-objects-filter: give a more specific error sparse parsing error list-objects-filter: delay parsing of sparse oid t5616: test cloning/fetching with sparse:oid=<oid> filter
2019-09-18Merge branch 'md/list-objects-filter-combo'Junio C Hamano
The list-objects-filter API (used to create a sparse/lazy clone) learned to take a combined filter specification. * md/list-objects-filter-combo: list-objects-filter-options: make parser void list-objects-filter-options: clean up use of ALLOC_GROW list-objects-filter-options: allow mult. --filter strbuf: give URL-encoding API a char predicate fn list-objects-filter-options: make filter_spec a string_list list-objects-filter-options: move error check up list-objects-filter: implement composite filters list-objects-filter-options: always supply *errbuf list-objects-filter: put omits set in filter struct list-objects-filter: encapsulate filter components
2019-09-18Merge branch 'cc/multi-promisor'Junio C Hamano
Teach the lazy clone machinery that there can be more than one promisor remote and consult them in order when downloading missing objects on demand. * cc/multi-promisor: Move core_partial_clone_filter_default to promisor-remote.c Move repository_format_partial_clone to promisor-remote.c Remove fetch-object.{c,h} in favor of promisor-remote.{c,h} remote: add promisor and partial clone config to the doc partial-clone: add multiple remotes in the doc t0410: test fetching from many promisor remotes builtin/fetch: remove unique promisor remote limitation promisor-remote: parse remote.*.partialclonefilter Use promisor_remote_get_direct() and has_promisor_remote() promisor-remote: use repository_format_partial_clone promisor-remote: add promisor_remote_reinit() promisor-remote: implement promisor_remote_get_direct() Add initial support for many promisor remotes fetch-object: make functions return an error code t0410: remove pipes after git commands
2019-09-16list-objects-filter: give a more specific error sparse parsing errorJon Simons
The sparse:oid filter has two error modes: we might fail to resolve the name to an OID, or we might fail to parse the contents of that OID. In the latter case, let's give a less generic error message, and mention the OID we did find. While we're here, let's also mark both messages as translatable. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>