
git.kernel.org/pub/scm/git/git.git
2023-12-03  t0410: mark tests to require the reffiles backend  (Patrick Steinhardt)

Two of our tests in t0410 verify whether partial clones end up with the correct repository format version and extensions. These checks require the reffiles backend because every other backend would by necessity bump the repository format version to be at least 1. Mark the tests accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

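In git's test suite, prerequisites are given as the first argument to test_expect_success and the test is skipped when they are not satisfied. A minimal, hypothetical sketch of the marking described above; the REFFILES prerequisite name and the test body are illustrative, not the exact tests touched by this commit:

    # Skipped unless the files-based ref backend is in use.
    test_expect_success REFFILES 'partial clone keeps expected format version' '
        git -C client config core.repositoryformatversion >actual &&
        test_cmp expect actual    # "expect" prepared by earlier (illustrative) setup
    '
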
2022-10-05  promisor-remote: die upon failing fetch  (Jonathan Tan)

In a partial clone, an attempt to read a missing object results in an attempt to fetch that single object. In order to avoid multiple sequential fetches, which would occur when multiple objects are missing (the typical case), some commands have been taught to prefetch in a batch: such a command would, in a partial clone, notice that several objects it will eventually need are missing, and call promisor_remote_get_direct() with all such objects at once.

When this batch prefetch fails, these commands fall back to the sequential fetches. But at $DAYJOB we have noticed that this results in a bad user experience: a command would take unexpectedly long to finish (and possibly use up a lot of bandwidth) if the batch prefetch failed for some intermittent reason while all subsequent fetches worked. It would be a better user experience for such a command to just fail.

Therefore, make it a fatal error if the prefetch fails and at least one object being fetched is known to be a promisor object. (The latter criterion is to make sure that we are not misleading the user into thinking that such an object would be present from the promisor remote. For example, a missing object may be a result of repository corruption, not something expectedly missing because the repository is a partial clone.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2022-03-16  partial-clone: add a partial-clone test case  (Abhradeep Chakraborty)

In a blobless-cloned repo, `git log --follow -- <path>` (where `<path>` has an exact OID rename) shouldn't download the blob of the file from which the new file was renamed. Add a test case to verify this.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

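A hedged illustration of the scenario under test; the URL and file name are placeholders, not taken from the test itself:

    # Blobless clone: blobs are fetched lazily, on demand.
    git clone --filter=blob:none https://example.com/repo.git repo
    cd repo
    # With an exact-OID rename, --follow should not need to fetch the old blob.
    git log --follow --oneline -- renamed-file.txt
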
2021-12-13  t0000-t3999: detect and signal failure within loop  (Eric Sunshine)

Failures within `for` and `while` loops can go unnoticed if not detected and signaled manually, since the loop itself does not abort when a contained command fails, nor will a failure necessarily be detected when the loop finishes, because the loop returns the exit code of the last command it ran on the final iteration, which may not be the command that failed. Therefore, detect and signal failures manually within loops using the idiom `|| return 1` (or `|| exit 1` within subshells).

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

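A minimal sketch of the idiom, not a specific test from this range; the loop bodies and file names are illustrative:

    test_expect_success 'create some files' '
        for f in a b c
        do
            echo "$f" >"$f" || return 1
        done
    '

    # Inside a subshell, use exit instead of return:
    test_expect_success 'populate a subdirectory' '
        mkdir sub &&
        (
            cd sub &&
            for f in a b c
            do
                echo "$f" >"$f" || exit 1
            done
        )
    '
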
2021-10-16  fsck: verify commit graph when implicitly enabled  (Glen Choo)

Change fsck to check the "core_commit_graph" variable set in "repo-settings.c" instead of reading the "core.commitGraph" variable. This fixes a bug where we wouldn't verify the commit-graph if the config key was missing. This bug was introduced in 31b1de6a09 (commit-graph: turn on commit-graph by default, 2019-08-13), where core.commitGraph was turned on by default.

Add tests to "t5318-commit-graph.sh" to verify that fsck checks the commit-graph as expected for the 3 values of core.commitGraph. Also, disable GIT_TEST_COMMIT_GRAPH in t/t0410-partial-clone.sh because some test cases use fsck in ways that assume that commit-graph checking is disabled.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2021-09-01  t0410: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP  (Jeff King)

Generating a MIDX bitmap causes tests which repack in a partial clone to fail because they are missing objects. Missing objects are an expected part of the tests in t0410, so disable this knob altogether. Graceful degradation when writing a bitmap with missing objects is tested in t5326.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2021-08-24  pack-bitmap-write.c: gracefully fail to write non-closed bitmaps  (Taylor Blau)

The set of objects covered by a bitmap must be closed under reachability, since it must be the case that there is a valid bit position assigned for every possible reachable object (otherwise the bitmaps would be incomplete).

Pack bitmaps are never written from 'git repack' unless repacking all-into-one, and so we never write non-closed bitmaps (except in the case of partial clones, where we aren't guaranteed to have all objects). But multi-pack bitmaps change this, since it isn't known whether the set of objects in the MIDX is closed under reachability until walking them.

Plumb through a bit that is set when a reachable object isn't found. As soon as a reachable object isn't found in the set of objects to include in the bitmap, bitmap_writer_build() knows that the set is not closed, and so it now fails gracefully.

A test is added in t0410 to trigger a bitmap write without full reachability closure by removing local copies of some reachable objects from a promisor remote.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2021-06-28  promisor-remote: teach lazy-fetch in any repo  (Jonathan Tan)

This is one step towards supporting partial clone submodules.

Even after this patch, we will still lack partial clone submodules support, primarily because a lot of Git code that accesses submodule objects does so by adding their object stores as alternates, meaning that any lazy fetches that would occur in the submodule would be done based on the config of the superproject, not of the submodule. This also prevents testing of the functionality in this patch by user-facing commands. So for now, test this mechanism using a test helper.

Besides that, there is some code that uses the wrapper functions like has_promisor_remote(). Those will need to be checked to see if they could support the non-wrapper functions instead (and thus support any repository, not just the_repository).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2020-09-03  fetch: no FETCH_HEAD display if --no-write-fetch-head  (Jonathan Tan)

887952b8c6 ("fetch: optionally allow disabling FETCH_HEAD update", 2020-08-18) introduced the ability to disable writing to FETCH_HEAD during fetch, but did not suppress the "<source> -> FETCH_HEAD" message when this ability is used. This message is misleading in this case, because FETCH_HEAD is not written. Also, because "fetch" is used to lazy-fetch missing objects in a partial clone, this significantly clutters up the output in that case, since the objects to be fetched are potentially numerous.

Therefore, suppress this message when --no-write-fetch-head is passed (but not when --dry-run is set).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

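A hedged usage example of the option in question; the remote name is illustrative:

    # Fetch without updating FETCH_HEAD (and, after this change, without the
    # "<source> -> FETCH_HEAD" lines in the output).
    git fetch --no-write-fetch-head origin
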
2020-08-19  promisor-remote: lazy-fetch objects in subprocess  (Jonathan Tan)

Teach Git to lazy-fetch missing objects in a subprocess instead of doing it in-process. This allows any fatal errors that occur during the fetch to be isolated and converted into an error return value, instead of causing the current command being run to terminate.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2020-07-30  t0410: mark test with SHA1 prerequisite  (brian m. carlson)

These tests try to check that we behave properly if we encounter a repository with version 0 but an extension. This is a laudable goal, but the test cannot work with SHA-256, since SHA-256 repositories always have an existing extension and are never version 0. Add a SHA1 prerequisite to these tests.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2020-07-16  repository: allow repository format upgrade with extensions  (Jonathan Nieder)

Now that we officially permit repository extensions in repository format v0, permit upgrading a repository with extensions from v0 to v1 as well. For example, this means a repository where the user has set "extensions.preciousObjects" can use "git fetch --filter=blob:none origin" to upgrade the repository to use v1 and the partial clone extension.

To avoid mistakes, continue to forbid repository format upgrades in v0 repositories with an unrecognized extension. This way, a v0 user using a misspelled extension field gets a chance to correct the mistake before updating to the less forgiving v1 format.

While we're here, make the error message for failure to upgrade the repository format a bit shorter, and present it as an error, not a warning.

Reported-by: Huan Huan Chen <huanhuanchen@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2020-07-16  Revert "check_repository_format_gently(): refuse extensions for old repositories"  (Jonathan Nieder)

This reverts commit 14c7fa269e42df4133edd9ae7763b678ed6594cd.

The core.repositoryFormatVersion field was introduced in ab9cb76f661 (Repository format version check., 2005-11-25), providing a welcome bit of forward compatibility, thanks to some welcome analysis by Martin Atukunda. The semantics are simple: a repository with core.repositoryFormatVersion set to 0 should be comprehensible by all Git implementations in active use; and Git implementations should error out early instead of trying to act on Git repositories with higher core.repositoryFormatVersion values representing new formats that they do not understand.

A new repository format did not need to be defined until 00a09d57eb8 (introduce "extensions" form of core.repositoryformatversion, 2015-06-23). This provided a finer-grained extension mechanism for Git repositories. In a repository with core.repositoryFormatVersion set to 1, Git implementations can act on "extensions.*" settings that modify how a repository is interpreted. In repository format version 1, unrecognized extensions settings cause Git to error out.

What happens if a user sets an extension setting but forgets to increase the repository format version to 1? The extension settings were still recognized in that case; worse, unrecognized extensions settings do *not* cause Git to error out. So combining repository format version 0 with extensions settings produces in some sense the worst of both worlds.

To improve that situation, since 14c7fa269e4 (check_repository_format_gently(): refuse extensions for old repositories, 2020-06-05) Git instead ignores extensions in v0 mode. This way, v0 repositories get the historical (pre-2015) behavior and maintain compatibility with Git implementations that do not know about the v1 format. Unfortunately, users had been using this sort of configuration and this behavior change came to many as a surprise:

 - users of "git config --worktree" that had followed its advice to enable extensions.worktreeConfig (without also increasing the repository format version) would find their worktree configuration no longer taking effect

 - tools such as copybara[*] that had set extensions.partialClone in existing repositories (without also increasing the repository format version) would find that setting no longer taking effect

The behavior introduced in 14c7fa269e4 might be a good behavior if we were traveling back in time to 2015, but we're far too late. For some reason I thought that it was what had been originally implemented and that it had regressed. Apologies for not doing my research when 14c7fa269e4 was under development. Let's return to the behavior we've had since 2015: always act on extensions.* settings, regardless of repository format version.

While we're here, include some tests to describe the effect on the "upgrade repository version" code path.

[*] https://github.com/google/copybara/commit/ca76c0b1e13c4e36448d12c2aba4a5d9d98fb6e7

Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2020-06-05  check_repository_format_gently(): refuse extensions for old repositories  (Xin Li)

Previously, extensions were recognized regardless of repository format version. If the user sets an undefined "extensions" value on a repository of version 0 and that value is used by a future git version, they might get an undesired result. Because all extensions now also upgrade repository versions, tightening the check would help avoid this for future extensions.

Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2020-06-05  fetch: allow adding a filter after initial clone  (Xin Li)

Retroactively adding a filter can be useful for existing shallow clones, as it allows users to see earlier change histories without downloading all git objects in a regular --unshallow fetch.

Without this patch, users can make a clone partial by editing the repository configuration to convert the remote into a promisor, like:

  git config core.repositoryFormatVersion 1
  git config extensions.partialClone origin
  git fetch --unshallow --filter=blob:none origin

Since the hard part of making this work is already in place and such edits can be error-prone, teach Git to perform the required configuration change automatically instead.

Note that this change does not modify the existing git behavior which recognizes setting extensions.partialClone without changing repositoryFormatVersion.

Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

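With this change in place, the same conversion should be achievable in a single step, since the fetch now performs the configuration edits itself; a hedged example with an illustrative remote name:

    git fetch --unshallow --filter=blob:none origin
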
2019-10-07  Merge branch 'cc/multi-promisor'  (Junio C Hamano)

Cleanup.

* cc/multi-promisor:
  promisor-remote: skip move_to_tail when no-op
  promisor-remote.h: drop extern from function declaration

2019-10-07  Merge branch 'jt/cache-tree-avoid-lazy-fetch-during-merge'  (Junio C Hamano)

The cache-tree code has been taught to be less aggressive in attempting to see if a tree object it computed already exists in the repository.

* jt/cache-tree-avoid-lazy-fetch-during-merge:
  cache-tree: do not lazy-fetch tentative tree

2019-10-02  promisor-remote: skip move_to_tail when no-op  (Emily Shaffer)

Previously, when promisor_remote_move_to_tail() was called for a promisor_remote which was already the final element in promisors, a cycle was created in the promisors linked list. This cycle leads to a double free later on in promisor_remote_clear() when the final element of the promisors list is removed: promisors is set to promisors->next (a no-op, as promisors->next == promisors); the previous value of promisors is free()'d; then the new value of promisors (which is equal to the previous value of promisors) is also free()'d.

This double-free error was unrecoverable for the user without removing the filter or re-cloning the repo and hoping to miss this edge case.

Now, when promisor_remote_move_to_tail() would be a no-op, just do a no-op. In cases where r is not already at the tail of the list, it works as before.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Acked-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2019-09-18  Merge branch 'cc/multi-promisor'  (Junio C Hamano)

Teach the lazy clone machinery that there can be more than one promisor remote and consult them in order when downloading missing objects on demand.

* cc/multi-promisor:
  Move core_partial_clone_filter_default to promisor-remote.c
  Move repository_format_partial_clone to promisor-remote.c
  Remove fetch-object.{c,h} in favor of promisor-remote.{c,h}
  remote: add promisor and partial clone config to the doc
  partial-clone: add multiple remotes in the doc
  t0410: test fetching from many promisor remotes
  builtin/fetch: remove unique promisor remote limitation
  promisor-remote: parse remote.*.partialclonefilter
  Use promisor_remote_get_direct() and has_promisor_remote()
  promisor-remote: use repository_format_partial_clone
  promisor-remote: add promisor_remote_reinit()
  promisor-remote: implement promisor_remote_get_direct()
  Add initial support for many promisor remotes
  fetch-object: make functions return an error code
  t0410: remove pipes after git commands

2019-09-10  cache-tree: do not lazy-fetch tentative tree  (Jonathan Tan)

The cache-tree data structure is used to speed up the comparison between the HEAD and the index, and when the index is updated by a cherry-pick (for example), a tree object that would represent the paths in the index in a directory is constructed in-core, to see if such a tree object already exists in the object store.

When the lazy-fetch mechanism was introduced, we converted this "does the tree exist?" check into an "if it does not, and if we lazily cloned, see if the remote has it" call by mistake. Since the whole point of this check is to repair the cache-tree by recording an already existing tree object opportunistically, we shouldn't even try to fetch one from the remote.

Pass the OBJECT_INFO_SKIP_FETCH_OBJECT flag to make sure we only check for existence in the local object store without triggering the lazy fetch mechanism.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
[jc: rewritten the proposed log message]
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2019-09-09  Merge branch 'ds/feature-macros'  (Junio C Hamano)

A mechanism to affect the default setting for a (related) group of configuration variables is introduced.

* ds/feature-macros:
  repo-settings: create feature.experimental setting
  repo-settings: create feature.manyFiles setting
  repo-settings: parse core.untrackedCache
  commit-graph: turn on commit-graph by default
  t6501: use 'git gc' in quiet mode
  repo-settings: consolidate some config settings

2019-08-13  commit-graph: turn on commit-graph by default  (Derrick Stolee)

The commit-graph feature has seen a lot of activity in the past year or so since it was introduced. The feature is a critical performance enhancement for medium- to large-sized repos, and does not significantly hurt small repos.

Change the defaults for core.commitGraph and gc.writeCommitGraph to true so users benefit from this feature by default.

There are several places in the test suite where the environment variable GIT_TEST_COMMIT_GRAPH is disabled to avoid reading a commit-graph, if it exists. The config option overrides the environment, so swap these. Some GIT_TEST_COMMIT_GRAPH assignments remain, and those are to avoid writing a commit-graph when a new commit is created.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

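The defaults flipped here remain overridable per repository; a hedged illustration of the relevant knobs:

    git config core.commitGraph false      # do not read commit-graph files
    git config gc.writeCommitGraph false   # do not write one during 'git gc'
    git commit-graph write --reachable     # or write one explicitly on demand
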
2019-08-02  t: warn against adding non-httpd-specific tests after sourcing 'lib-httpd'  (SZEDER Gábor)

We have a couple of test scripts that are not completely httpd-specific, but do run a few httpd-specific tests at the end. These test scripts source 'lib-httpd.sh' somewhere mid-script, which then skips all the rest of the test script if the dependencies for running httpd tests are not fulfilled.

As the previous two patches in this series show, already on two occasions non-httpd-specific tests were appended at the end of such test scripts, and, consequently, they were skipped as well when httpd tests couldn't be run.

Add a comment at the end of these test scripts to warn against adding non-httpd-specific tests at the end, in the hope that they will help prevent similar issues in the future.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2019-06-26  t0410: test fetching from many promisor remotes  (Christian Couder)

This shows that it is now possible to fetch objects from more than one promisor remote, and that fetching from a new promisor remote can configure it as one.

Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

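A hedged sketch of configuring an additional promisor remote by hand; the remote name "server2", URL, and filter are illustrative:

    git remote add server2 https://example.com/repo.git
    git config remote.server2.promisor true
    git config remote.server2.partialCloneFilter blob:none
    git fetch --filter=blob:none server2
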
2019-06-26  promisor-remote: parse remote.*.partialclonefilter  (Christian Couder)

This makes it possible to specify a different partial clone filter for each promisor remote.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2019-06-26  t0410: remove pipes after git commands  (Christian Couder)

Let's not run a git command, especially one with "verify" in its name, upstream of a pipe, because the pipe will hide the git command's exit code. While at it, let's also avoid a useless `cat` command piping into `sed`.

Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

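A hedged illustration of the pattern change; the pack name and file names are placeholders:

    # Before: a failing exit code from git is swallowed by the pipe.
    #   git verify-pack -v pack-1234.idx | grep blob
    # After: write the output to a file so the git exit code stays visible.
    git verify-pack -v pack-1234.idx >verify_result &&
    grep blob verify_result
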
2019-03-14  tests: use 'test_atexit' to stop httpd  (SZEDER Gábor)

Use 'test_atexit' to run cleanup commands to stop httpd at the end of the test script or upon interrupt or failure, as it is shorter, simpler, and more robust than registering such cleanup commands in the trap on EXIT in the test scripts.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

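A hedged sketch, not the exact code from this change, of how such a cleanup registration could look in an httpd-based test script; start_httpd and stop_httpd come from lib-httpd.sh:

    . ./lib-httpd.sh
    start_httpd
    test_atexit 'stop_httpd'    # run on normal exit, interrupt, or failure
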
2019-01-19  Merge branch 'sg/stress-test'  (Junio C Hamano)

Flaky tests can now be repeatedly run under load with the "--stress" option.

* sg/stress-test:
  test-lib: add the '--stress' option to run a test repeatedly under load
  test-lib-functions: introduce the 'test_set_port' helper function
  test-lib: set $TRASH_DIRECTORY earlier
  test-lib: consolidate naming of test-results paths
  test-lib: parse command line options earlier
  test-lib: parse options in a for loop to keep $@ intact
  test-lib: extract Bash version check for '-x' tracing
  test-lib: translate SIGTERM and SIGHUP to an exit

2019-01-15  Merge branch 'md/list-lazy-objects-fix'  (Junio C Hamano)

"git rev-list --exclude-promisor-objects" had to take an object that does not exist locally (and is lazily available) from the command line without barfing, but the code dereferenced NULL.

* md/list-lazy-objects-fix:
  list-objects.c: don't segfault for missing cmdline objects

2019-01-07  test-lib-functions: introduce the 'test_set_port' helper function  (SZEDER Gábor)

Several test scripts run daemons like 'git-daemon' or Apache, and communicate with them through TCP sockets. To have unique ports where these daemons are accessible, the ports are usually the number of the corresponding test scripts, unless the user overrides them via environment variables, and thus all those tests and test libs contain more or less the same bit of one-liner boilerplate code to find out the port. The last patch in this series will make this a bit more complicated.

Factor out finding the port for a daemon into the common helper function 'test_set_port' to avoid repeating ourselves.

Take special care of test scripts with "low" numbers:

 - Test numbers below 1024 would result in a port that's only usable as root, so set their port to '10000 + test-nr' to make sure it doesn't interfere with other tests in the test suite. This makes the hardcoded port number in 't0410-partial-clone.sh' unnecessary, so remove it.

 - The shell's arithmetic evaluation interprets numbers with leading zeros as octal values, which means that test numbers below 1000 and containing the digits 8 or 9 will trigger an error. Remove all leading zeros from the test numbers to prevent this.

Note that the 'git p4' tests are unlike the other tests involving daemons in that:

 - 'lib-git-p4.sh' doesn't use the test's number for a unique port as is, but does a bit of additional arithmetic on top [1].

 - The port is not overridable via an environment variable.

With this patch even 'git p4' tests will use the test's number as the default port, and it will be overridable via the P4DPORT environment variable.

[1] Commit fc00233071 (git-p4 tests: refactor and cleanup, 2011-08-22) introduced that "unusual" unique port computation without explaining why it was necessary (as opposed to simply using the test number as is). It seems to be just unnecessary complication, and in any case that commit came way before the "test nr as unique port" got "standardized" for other daemons in commits c44132fcf3 (tests: auto-set git-daemon port, 2014-02-10), 3bb486e439 (tests: auto-set LIB_HTTPD_PORT from test name, 2014-02-10), and bf9d7df950 (t/lib-git-svn.sh: improve svnserve tests with parallel make test, 2017-12-01).

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

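A hedged sketch of how such a helper is used; whether a given test library calls it exactly like this may differ, and the variable name is only illustrative:

    # Derive a unique port from the test script's number unless the
    # environment already provides LIB_HTTPD_PORT.
    test_set_port LIB_HTTPD_PORT
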
2018-12-06  list-objects.c: don't segfault for missing cmdline objects  (Matthew DeVore)

When a command is invoked with --exclude-promisor-objects, --objects-edge-aggressive, and a missing object on the command line, the rev_info.cmdline array could get a NULL pointer for the value of an 'item' field. Prevent dereferencing of a NULL pointer in that situation.

Properly handle --ignore-missing. If it is not passed, die when an object is missing. Otherwise, just silently ignore it.

Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2018-10-30  Merge branch 'md/filter-trees'  (Junio C Hamano)

The "rev-list --filter" feature learned to exclude all trees via the "tree:0" filter.

* md/filter-trees:
  list-objects: support for skipping tree traversal
  filter-trees: code clean-up of tests
  list-objects-filter: implement filter tree:0
  list-objects-filter-options: do not over-strbuf_init
  list-objects-filter: use BUG rather than die
  revision: mark non-user-given objects instead
  rev-list: handle missing tree objects properly
  list-objects: always parse trees gently
  list-objects: refactor to process_tree_contents
  list-objects: store common func args in struct

2018-10-19  Merge branch 'jt/non-blob-lazy-fetch'  (Junio C Hamano)

A partial clone that is configured to lazily fetch missing objects will on demand issue a "git fetch" request to the originating repository to fill not-yet-obtained objects. The request has been optimized for requesting a tree object (and not the leaf blob objects contained in it) by telling the originating repository that no blobs are needed.

* jt/non-blob-lazy-fetch:
  fetch-pack: exclude blobs when lazy-fetching trees
  fetch-pack: avoid object flags if no_dependents

2018-10-07  rev-list: handle missing tree objects properly  (Matthew DeVore)

Previously, we assumed only blob objects could be missing. This patch makes rev-list handle missing trees like missing blobs. The --missing=* and --exclude-promisor-objects flags now work for trees as they already do for blobs. This is demonstrated in t6112.

Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

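Hedged usage examples of the flags mentioned above, run inside a partial clone; the revision argument is illustrative:

    git rev-list --objects --missing=print HEAD              # report missing objects instead of dying
    git rev-list --objects --exclude-promisor-objects HEAD   # skip objects the promisor remote promises
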
2018-10-04  fetch-pack: exclude blobs when lazy-fetching trees  (Jonathan Tan)

A partial clone with missing trees can be obtained using "git clone --filter=tree:none <repo>". In such a repository, when a tree needs to be lazily fetched, any tree or blob it directly or indirectly references is fetched as well, regardless of whether the original command required those objects, or whether the local repository already had some of them.

This is because the fetch protocol, which the lazy fetch uses, does not allow clients to request that only the wanted objects be sent, which would be the ideal solution. This patch implements a partial solution: specify the "blob:none" filter, somewhat reducing the fetch payload.

This change has no effect when lazily fetching blobs (due to how filters work). And if lazily fetching a commit (such repositories are difficult to construct and are not a use case we support very well, but it is possible), referenced commits and trees are still fetched; only the blobs are not.

The necessary code change is done in fetch_pack() instead of somewhere closer to where the "filter" instruction is written to the wire so that only one part of the code needs to be changed in order for users of all protocol versions to benefit from this optimization.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2018-09-24  Merge branch 'jt/lazy-object-fetch-fix'  (Junio C Hamano)

The code to backfill objects in lazily cloned repository did not work correctly, which has been corrected.

* jt/lazy-object-fetch-fix:
  fetch-object: set exact_oid when fetching
  fetch-object: unify fetch_object[s] functions

2018-09-13  fetch-object: set exact_oid when fetching  (Jonathan Tan)

fetch_objects() currently does not set exact_oid in struct ref when invoking transport_fetch_refs(). If the server supports ref-in-want, fetch_pack() uses this field to determine whether a wanted ref should be requested as a "want-ref" line or a "want" line; without the setting of exact_oid, the wrong line will be sent. Set exact_oid, so that the correct line is sent.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2018-08-29  commit-graph: define GIT_TEST_COMMIT_GRAPH  (Derrick Stolee)

The commit-graph feature is tested in isolation by t5318-commit-graph.sh and t6600-test-reach.sh, but there are many more interesting scenarios involving commit walks. Many of these scenarios are covered by the existing test suite, but we need to maintain coverage when the optional commit-graph structure is not present.

To allow running the full test suite with the commit-graph present, add a new test environment variable, GIT_TEST_COMMIT_GRAPH. Similar to GIT_TEST_SPLIT_INDEX, this variable makes every Git command try to load the commit-graph when parsing commits, and writes the commit-graph file after every 'git commit' command.

There are a few tests that rely on commits not existing in pack-files to trigger important events, so manually set GIT_TEST_COMMIT_GRAPH to false for the necessary commands. There is one test in t6024-recursive-merge.sh that relies on the merge-base algorithm picking one of two ambiguous merge-bases, and the commit-graph feature changes which merge-base is picked.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

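A hedged illustration of exercising the knob described above; the exact invocations are illustrative:

    GIT_TEST_COMMIT_GRAPH=1 make test            # run the suite with commit-graph use forced on
    GIT_TEST_COMMIT_GRAPH=0 git commit -m "msg"  # opt a single command out within a test
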
2018-08-09  repack: repack promisor objects if -a or -A is set  (Jonathan Tan)

Currently, repack does not touch promisor packfiles at all, potentially causing the performance of repositories that have many such packfiles to drop. Therefore, repack all promisor objects if invoked with -a or -A.

This is done by an additional invocation of pack-objects on all promisor objects individually given, which takes care of deduplication and allows the resulting packfiles to respect flags such as --max-pack-size.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2018-06-12  list-objects: check if filter is NULL before using  (Jonathan Tan)

In partial_clone_get_default_filter_spec(), the core_partial_clone_filter_default variable may be NULL; ensure that it is not NULL before using it.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2017-12-08  gc: do not repack promisor packfiles  (Jonathan Tan)

Teach gc to stop traversal at promisor objects, and to leave promisor packfiles alone. This has the effect of only repacking non-promisor packfiles, and preserves the distinction between promisor packfiles and non-promisor packfiles.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2017-12-08  rev-list: support termination at promisor objects  (Jonathan Tan)

Teach rev-list to support termination of an object traversal at any object from a promisor remote (whether one that the local repo also has, or one that the local repo knows about because it has another promisor object that references it). This will be used subsequently in gc and in the connectivity check used by fetch.

For efficiency, if an object is referenced by a promisor object, and is in the local repo only as a non-promisor object, object traversal will not stop there. This is to avoid building the list of promisor object references.

(In list-objects.c, the cases where obj is NULL in process_blob() and process_tree() do not need to be changed because those happen only when there is a conflict between the expected type and the existing object. If the object doesn't exist, an object will be synthesized, which is fine.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2017-12-08  sha1_file: support lazily fetching missing objects  (Jonathan Tan)

Teach sha1_file to fetch objects from the remote configured in extensions.partialclone whenever an object is requested but missing.

The fetching of objects can be suppressed through a global variable. This is used by fsck and index-pack. However, by default, such fetching is not suppressed. This is meant as a temporary measure to ensure that all Git commands work in such a situation. Future patches will update some commands to either tolerate missing objects (without fetching them) or be more efficient in fetching them.

In order to determine the code changes in sha1_file.c necessary, I investigated the following:
 (1) functions in sha1_file.c that take in a hash, without the user regarding how the object is stored (loose or packed)
 (2) functions in packfile.c (because I need to check callers that know about the loose/packed distinction and operate on both differently, and ensure that they can handle the concept of objects that are neither loose nor packed)

(1) is handled by the modification to sha1_object_info_extended().

For (2), I looked at for_each_packed_object and others. For for_each_packed_object, the callers either already work or are fixed in this patch:
 - reachable - only to find recent objects
 - builtin/fsck - already knows about missing objects
 - builtin/cat-file - warning message added in this commit

Callers of the other functions do not need to be changed:
 - parse_pack_index
   - http - indirectly from http_get_info_packs
 - find_pack_entry_one
   - this searches a single pack that is provided as an argument; the caller already knows (through other means) that the sought object is in a specific pack
 - find_sha1_pack
   - fast-import - appears to be an optimization to not store a file if it is already in a pack
   - http-walker - to search through a struct alt_base
   - http-push - to search through remote packs
 - has_sha1_pack
   - builtin/fsck - already knows about promisor objects
   - builtin/count-objects - informational purposes only (check if loose object is also packed)
   - builtin/prune-packed - check if object to be pruned is packed (if not, don't prune it)
   - revision - used to exclude packed objects if requested by user
   - diff - just for optimization

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2017-12-05  fsck: support promisor objects as CLI argument  (Jonathan Tan)

Teach fsck to not treat missing promisor objects provided on the CLI as an error when extensions.partialclone is set.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2017-12-05  fsck: support referenced promisor objects  (Jonathan Tan)

Teach fsck to not treat missing promisor objects indirectly pointed to by refs as an error when extensions.partialclone is set.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2017-12-05  fsck: support refs pointing to promisor objects  (Jonathan Tan)

Teach fsck to not treat refs referring to missing promisor objects as an error when extensions.partialclone is set. For the purposes of warning about no default refs, such refs are still treated as legitimate refs.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2017-12-05  fsck: introduce partialclone extension  (Jonathan Tan)

Currently, Git does not support repos with very large numbers of objects or repos that wish to minimize manipulation of certain blobs (for example, because they are very large) very well, even if the user operates mostly on part of the repo, because Git is designed on the assumption that every referenced object is available somewhere in the repo storage. In such an arrangement, the full set of objects is usually available in remote storage, ready to be lazily downloaded.

Teach fsck about the new state of affairs. In this commit, teach fsck that missing promisor objects referenced from the reflog are not an error case; in future commits, fsck will be taught about other cases.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>