From af6a51875a44c0bd16a21329456aadde615a2f46 Mon Sep 17 00:00:00 2001 From: Elijah Newren Date: Fri, 14 Jan 2022 15:59:41 +0000 Subject: repo_read_index: clear SKIP_WORKTREE bit from files present in worktree The fix is short (~30 lines), but the description is not. Sorry. There is a set of problems caused by files in what I'll refer to as the "present-despite-SKIP_WORKTREE" state. This commit aims to not just fix these problems, but remove the entire class as a possibility -- for those using sparse checkouts. But first, we need to understand the problems this class presents. A quick outline: * Problems * User facing issues * Problem space complexity * Maintenance and code correctness challenges * SKIP_WORKTREE expectations in Git * Suggested solution * Pros/Cons of suggested solution * Notes on testcase modifications === User facing issues === There are various ways for users to get files to be present in the working copy despite having the SKIP_WORKTREE bit set for that file in the index. This may come from: * various git commands not really supporting the SKIP_WORKTREE bit[1,2] * users grabbing files from elsewhere and writing them to the worktree (perhaps even cached in their editor) * users attempting to "abort" a sparse-checkout operation with a not-so-early Ctrl+C (updating $GIT_DIR/info/sparse-checkout and the working tree is not atomic)[3]. Once users have present-despite-SKIP_WORKTREE files, any modifications users make to these files will be ignored, possibly to users' confusion. Further: * these files will degrade performance for the sparse-index case due to requiring the index to be expanded (see commit 55dfcf9591 ("sparse-checkout: clear tracked sparse dirs", 2021-09-08) for why we try to delete entire directories outside the sparse cone). * these files will not be updated by by standard commands (switch/checkout/pull/merge/rebase will leave them alone unless conflicts happen -- and even then, the conflicted file may be written somewhere else to avoid overwriting the SKIP_WORKTREE file that is present and in the way) * there is nothing in Git that users can use to discover such files (status, diff, grep, etc. all ignore it) * there is no reasonable mechanism to "recover" from such a condition (neither `git sparse-checkout reapply` nor `git reset --hard` will correct it). So, not only are users modifications ignored, but the files get progressively more stale over time. At some point in the future, they may change their sparseness specification or disable sparse-checkouts. At that time, all present-despite-SKIP_WORKTREE files will show up as having lots of modifications because they represent a version from a different branch or commit. These might include user-made local changes from days before, but the only way to tell is to have users look through them all closely. If these users come to others for help, there will be no logs that explain the issue; it's just a mysterious list of changes. Users might adamantly claim (correctly, as it turns out) that they didn't modify these files, while others presume they did. [1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/ [2] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/ [3] https://lore.kernel.org/git/CABPp-BFnFpzwGC11TLoLs8YK5yiisA5D5-fFjXnJsbESVDwZsA@mail.gmail.com/ === Problem space complexity === SKIP_WORKTREE has been part of Git for over a decade. Duy did lots of work on it initially, and several others have since come along and put lots of work into it. Stolee spent most of 2021 on the sparse-index, with lots of bugfixes along the way including to non-sparse-index cases as we are still trying to get sparse checkouts to behave reasonably. Basically every codepath throughout the treat needs to be aware of an additional type of file: tracked-but-not-present. The extra type results in lots of extra testcases and lots of extra code everywhere. But, the sad thing is that we actually have more than one extra type. We have tracked, tracked-but-not-present (SKIP_WORKTREE), and tracked-but-promised-to-not-be-present-but-is-present-anyway (present-despite-SKIP_WORKTREE). Two types is a monumental amount of effort to support, and adding a third feels a bit like insanity[4]. [4] Some examples of which can be seen at https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/ === Maintenance and code correctness challenges === Matheus' patches to grep stalled for nearly a year, in part because of complications of how to handle sparse-checkouts appropriately in all cases[5][6] (with trying to sanely figure out how to sanely handle present-despite-SKIP_WORKTREE files being one of the complications). His rm/add follow-ups also took months because of those kinds of issues[7]. The corner cases with things like submodules and SKIP_WORKTREE with the addition of present-despite-SKIP_WORKTREE start becoming really complex[8]. We've had to add ugly logic to merge-ort to attempt to handle present-despite-SKIP_WORKTREE files[9], and basically just been forced to give up in merge-recursive knowing full well that we'll sometimes silently discard user modifications. Despite stash essentially being a merge, it needed extra code (beyond what was in merge-ort and merge-recursive) to manually tweak SKIP_WORKTREE bits in order to avoid a few different bugs that'd result in an early abort with a partial stash application[10]. [5] See https://lore.kernel.org/git/5f3f7ac77039d41d1692ceae4b0c5df3bb45b74a.1612901326.git.matheus.bernardino@usp.br/#t and the dates on the thread; also Matheus and I had several conversations off-list trying to resolve the issues over that time [6] ...it finally kind of got unstuck after https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/ [7] See for example https://lore.kernel.org/git/CABPp-BHwNoVnooqDFPAsZxBT9aR5Dwk5D9sDRCvYSb8akxAJgA@mail.gmail.com/#t and quotes like "The core functionality of sparse-checkout has always been only partially implemented", a statement I still believe is true today. [8] https://lore.kernel.org/git/pull.809.git.git.1592356884310.gitgitgadget@gmail.com/ [9] See commit 66b209b86a ("merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries", 2021-03-20) [10] See commit ba359fd507 ("stash: fix stash application in sparse-checkouts", 2020-12-01) === SKIP_WORKTREE expectations in Git === A couple quotes: * From [11] (before the "sparse-checkout" command existed): If it needs too many special cases, hacks, and conditionals, then it is not worth the complexity---if it is easier to write a correct code by allowing Git to populate working tree files, it is perfectly fine to do so. In a sense, the sparse checkout "feature" itself is a hack by itself, and that is why I think this part should be "best effort" as well. * From the git-sparse-checkout manual (still present today): THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR, AND THE BEHAVIOR OF OTHER COMMANDS IN THE PRESENCE OF SPARSE-CHECKOUTS, WILL LIKELY CHANGE IN THE FUTURE. [11] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/ === Suggested solution === SKIP_WORKTREE was written to allow sparse-checkouts, in particular, as the name of the option implies, to allow the file to NOT be in the worktree but consider it to be unchanged rather than deleted. The suggests a simple solution: present-despite-SKIP_WORKTREE files should not exist, for those using sparse-checkouts. Enforce this at index loading time by checking if core.sparseCheckout is true; if so, check files in the index with the SKIP_WORKTREE bit set to verify that they are absent from the working tree. If they are present, unset the bit (in memory, though any commands that write to the index will record the update). Users can, of course, can get the SKIP_WORKTREE bit back such as by running `git sparse-checkout reapply` (if they have ensured the file is unmodified and doesn't match the specified sparsity patterns). === Pros/Cons of suggested solution === Pros: * Solves the user visible problems reported above, which I've been complaining about for nearly a year but couldn't find a solution to. * Helps prevent slow performance degradation with a sparse-index. * Much easier behavior in sparse-checkouts for users to reason about * Very simple, ~30 lines of code. * Significantly simplifies some ugly testcases, and obviates the need to test an entire class of potential issues. * Reduces code complexity, reasoning, and maintenance. Avoids disagreements about weird corner cases[12]. * It has been reported that some users might be (ab)using SKIP_WORKTREE as a let-me-modify-but-keep-the-file-in-the-worktree mechanism[13, and a few other similar references]. These users know of multiple caveats and shortcomings in doing so; perhaps not surprising given the "SKIP_WORKTREE expecations" section above. However, these users use `git update-index --skip-worktree`, and not `git sparse-checkout` or core.sparseCheckout=true. As such, these users would be unaffected by this change and can continue abusing the system as before. [12] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/ [13] https://stackoverflow.com/questions/13630849/git-difference-between-assume-unchanged-and-skip-worktree Cons: * When core.sparseCheckout is enabled, this adds a performance cost to reading the index. I'll defer discussion of this cost to a subsequent patch, since I have some optimizations to add. === Notes on testcase modifications === The good: * t1011: Compare to two cases above it ('read-tree will not throw away dirty changes, non-sparse'); since the file is present, it should match the non-sparse case now * t1092: sparse-index & sparse-checkout now match full-worktree behavior in more cases! Yaay for consistency! * t6428, t7012: look at how much simpler the tests become! Merge and stash can just fail early telling the user there's a file in the way, instead of not noticing until it's about to write a file and then have to implement sudden crash avoidance. Hurray for sanity! * t7817: sparse behavior better matches full tree behavior. Hurray for sanity! The confusing: * t3705: These changes were ONLY needed on Windows, but they don't hurt other platforms. Let's discuss each individually: * core.sparseCheckout should be false by default. Nothing in this testcase toggles that until many, many tests later. However, early tests (#5 in particular) were testing `update-index --skip-worktree` behavior in a non-sparse-checkout, but the Windows tests in CI were behaving as if core.sparseCheckout=true had been specified somewhere. I do not have access to a Windows machine. But I just manually did what should have been a no-op and turned the config off. And it fixed the test. * I have no idea why the leftover .gitattributes file from this test was causing failures for test #18 on Windows, but only with these changes of mine. Test #18 was checking for empty stderr, and specifically wanted to know that some error completely unrelated to file endings did not appear. The leftover .gitattributes file thus caused some spurious stderr unrelated to the thing being checked. Since other tests did not intend to test normalization, just proactively remove the .gitattributes file. I'm certain this is cleaner and better, I'm just unsure why/how this didn't trigger problems before. Signed-off-by: Elijah Newren Signed-off-by: Junio C Hamano --- sparse-index.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) (limited to 'sparse-index.c') diff --git a/sparse-index.c b/sparse-index.c index a1d505d50e..b82648b10e 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -341,6 +341,27 @@ void ensure_correct_sparsity(struct index_state *istate) ensure_full_index(istate); } +void clear_skip_worktree_from_present_files(struct index_state *istate) +{ + int i; + if (!core_apply_sparse_checkout) + return; + +restart: + for (i = 0; i < istate->cache_nr; i++) { + struct cache_entry *ce = istate->cache[i]; + struct stat st; + + if (ce_skip_worktree(ce) && !lstat(ce->name, &st)) { + if (S_ISSPARSEDIR(ce->ce_mode)) { + ensure_full_index(istate); + goto restart; + } + ce->ce_flags &= ~CE_SKIP_WORKTREE; + } + } +} + /* * This static global helps avoid infinite recursion between * expand_to_path() and index_file_exists(). -- cgit v1.2.3 From d79d29935262f8422a6e620bd33cc2fce28e669d Mon Sep 17 00:00:00 2001 From: Elijah Newren Date: Fri, 14 Jan 2022 15:59:43 +0000 Subject: Accelerate clear_skip_worktree_from_present_files() by caching Trying to clear the skip-worktree bit from files that are present does present some computational overhead, for sparse-checkouts. (We do not do the bit clearing in non-sparse-checkouts.) Optimize it as follows: Rather than lstat()'ing every SKIP_WORKTREE path, take advantage of the fact that entire directories will often be missing, especially for cone mode and even more so ever since commit 55dfcf9591 ("sparse-checkout: clear tracked sparse dirs", 2021-09-08). If we have already determined that the parent directory of a file (or other previous ancestor) does not exist, then the file cannot exist either so we do not need to lstat() it separately. Timings for p2000 included below, reformatted to fit in normal commit message line lengths, which compare three things: * Timings before this series * Timings of the unoptimized version of clear_skip_worktree_from_present_files() from a few commits ago * Timings after the optimization in this commit (NOTE: t/perf/ appears to have timing resolution only down to 0.01 s, which presents significant measurement error when timings only differ by 0.01s. I don't trust any such timings below, and yet all the optimized results differ by at most 0.01s.) Test Before Series Unoptimized Optimized ----------------------------------------------------------------------------- *git status* full-v3 0.15(0.10+0.06) 0.32(0.16+0.17) +113.3% 0.16(0.10+0.07) +6.7% full-v4 0.15(0.11+0.05) 0.32(0.17+0.16) +113.3% 0.16(0.11+0.05) +6.7% sparse-v3 0.04(0.03+0.04) 0.04(0.02+0.05) +0.0% 0.04(0.02+0.05) +0.0% sparse-v4 0.04(0.03+0.04) 0.04(0.02+0.05) +0.0% 0.04(0.03+0.05) +0.0% *git add -A* full-v3 0.40(0.30+0.07) 0.56(0.36+0.17) +40.0% 0.39(0.30+0.07) -2.5% full-v4 0.37(0.28+0.07) 0.54(0.37+0.16) +45.9% 0.38(0.29+0.07) +2.7% sparse-v3 0.06(0.04+0.05) 0.08(0.05+0.05) +33.3% 0.06(0.05+0.04) +0.0% sparse-v4 0.05(0.03+0.05) 0.05(0.04+0.04) +0.0% 0.06(0.04+0.05) +20.0% *git add .* full-v3 0.40(0.31+0.07) 0.57(0.37+0.17) +42.5% 0.41(0.30+0.08) +2.5% full-v4 0.38(0.30+0.06) 0.55(0.37+0.16) +44.7% 0.38(0.30+0.06) +0.0% sparse-v3 0.06(0.04+0.05) 0.06(0.05+0.04) +0.0% 0.06(0.03+0.05) +0.0% sparse-v4 0.06(0.05+0.05) 0.06(0.04+0.05) +0.0% 0.06(0.04+0.06) +0.0% *git commit -a -m A* full-v3 0.41(0.32+0.06) 0.58(0.39+0.17) +41.5% 0.42(0.32+0.07) +2.4% full-v4 0.39(0.30+0.07) 0.56(0.38+0.17) +43.6% 0.40(0.31+0.07) +2.6% sparse-v3 0.04(0.03+0.04) 0.04(0.03+0.04) +0.0% 0.04(0.03+0.04) +0.0% sparse-v4 0.04(0.03+0.05) 0.04(0.03+0.05) +0.0% 0.04(0.03+0.04) +0.0% *git checkout -f -* full-v3 0.56(0.46+0.07) 0.73(0.55+0.16) +30.4% 0.57(0.47+0.08) +1.8% full-v4 0.54(0.45+0.07) 0.71(0.53+0.17) +31.5% 0.55(0.45+0.07) +1.9% sparse-v3 0.06(0.04+0.04) 0.06(0.04+0.05) +0.0% 0.06(0.04+0.05) +0.0% sparse-v4 0.05(0.05+0.04) 0.05(0.04+0.05) +0.0% 0.06(0.04+0.05) +20.0% *git reset* full-v3 0.34(0.26+0.05) 0.51(0.34+0.15) +50.0% 0.34(0.26+0.06) +0.0% full-v4 0.32(0.24+0.06) 0.49(0.32+0.15) +53.1% 0.33(0.25+0.06) +3.1% sparse-v3 0.04(0.03+0.04) 0.04(0.03+0.04) +0.0% 0.04(0.03+0.04) +0.0% sparse-v4 0.03(0.03+0.04) 0.03(0.02+0.04) +0.0% 0.03(0.03+0.04) +0.0% *git reset --hard* full-v3 0.57(0.46+0.07) 0.90(0.61+0.25) +57.9% 0.57(0.45+0.08) +0.0% full-v4 0.54(0.46+0.05) 0.88(0.59+0.26) +63.0% 0.55(0.45+0.07) +1.9% sparse-v3 0.07(0.03+0.03) 0.07(0.04+0.03) +0.0% 0.07(0.03+0.03) +0.0% sparse-v4 0.06(0.03+0.03) 0.06(0.04+0.02) +0.0% 0.06(0.03+0.03) +0.0% *git reset -- does-not-exist* full-v3 0.35(0.27+0.06) 0.52(0.32+0.17) +48.6% 0.35(0.27+0.06) +0.0% full-v4 0.33(0.26+0.05) 0.50(0.33+0.15) +51.5% 0.33(0.26+0.06) +0.0% sparse-v3 0.04(0.03+0.04) 0.04(0.03+0.04) +0.0% 0.04(0.03+0.04) +0.0% sparse-v4 0.04(0.02+0.04) 0.03(0.02+0.04) -25.0% 0.03(0.02+0.04) -25.0% *git diff* full-v3 0.07(0.04+0.04) 0.24(0.11+0.14) +242.9% 0.07(0.04+0.04) +0.0% full-v4 0.07(0.03+0.05) 0.24(0.13+0.12) +242.9% 0.08(0.04+0.05) +14.3% sparse-v3 0.02(0.01+0.04) 0.02(0.01+0.04) +0.0% 0.02(0.01+0.05) +0.0% sparse-v4 0.02(0.02+0.03) 0.02(0.01+0.04) +0.0% 0.02(0.01+0.04) +0.0% *git diff --cached* full-v3 0.05(0.03+0.02) 0.22(0.12+0.09) +340.0% 0.05(0.03+0.01) +0.0% full-v4 0.05(0.03+0.01) 0.23(0.12+0.11) +360.0% 0.05(0.03+0.02) +0.0% sparse-v3 0.01(0.00+0.00) 0.01(0.00+0.00) +0.0% 0.01(0.00+0.00) +0.0% sparse-v4 0.01(0.00+0.00) 0.01(0.00+0.00) +0.0% 0.01(0.00+0.00) +0.0% *git blame f2/f4/a* full-v3 0.18(0.13+0.05) 0.52(0.29+0.23) +188.9% 0.19(0.15+0.04) +5.6% full-v4 0.19(0.15+0.04) 0.52(0.28+0.23) +173.7% 0.19(0.14+0.04) +0.0% sparse-v3 0.10(0.08+0.02) 0.10(0.09+0.01) +0.0% 0.10(0.09+0.01) +0.0% sparse-v4 0.10(0.08+0.02) 0.10(0.08+0.02) +0.0% 0.10(0.08+0.02) +0.0% *git blame f2/f4/f3/a* full-v3 0.45(0.36+0.08) 0.78(0.51+0.27) +73.3% 0.45(0.37+0.08) +0.0% full-v4 0.45(0.37+0.08) 0.78(0.51+0.26) +73.3% 0.45(0.37+0.08) +0.0% sparse-v3 0.36(0.32+0.04) 0.36(0.31+0.05) +0.0% 0.36(0.31+0.04) +0.0% sparse-v4 0.36(0.31+0.05) 0.36(0.31+0.05) +0.0% 0.36(0.31+0.04) +0.0% *git checkout-index -f --all* full-v3 0.07(0.02+0.05) 0.24(0.12+0.12) +242.9% 0.08(0.04+0.04) +14.3% full-v4 0.07(0.03+0.04) 0.24(0.11+0.13) +242.9% 0.08(0.03+0.04) +14.3% sparse-v3 0.04(0.01+0.03) 0.04(0.00+0.03) +0.0% 0.04(0.01+0.03) +0.0% sparse-v4 0.04(0.01+0.02) 0.04(0.01+0.03) +0.0% 0.04(0.01+0.02) +0.0% *git update-index --add --remove f2/f4/a* full-v3 0.29(0.23+0.02) 0.46(0.30+0.12) +58.6% 0.30(0.24+0.02) +3.4% full-v4 0.27(0.22+0.02) 0.45(0.29+0.12) +66.7% 0.28(0.22+0.03) +3.7% sparse-v3 0.02(0.02+0.00) 0.02(0.01+0.00) +0.0% 0.02(0.01+0.00) +0.0% sparse-v4 0.02(0.02+0.00) 0.02(0.02+0.00) +0.0% 0.02(0.02+0.00) +0.0% So, with the optimization, the extra work appears to be essentially 0 for sparse-checkouts that are also using sparse-indexes (even before my optimization), and the extra work appears to be just marginally more than 0 for sparse-checkouts that are using full indexes. Signed-off-by: Elijah Newren Signed-off-by: Junio C Hamano --- sparse-index.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 54 insertions(+), 2 deletions(-) (limited to 'sparse-index.c') diff --git a/sparse-index.c b/sparse-index.c index b82648b10e..eed170cd8f 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -341,18 +341,70 @@ void ensure_correct_sparsity(struct index_state *istate) ensure_full_index(istate); } +static int path_found(const char *path, const char **dirname, size_t *dir_len, + int *dir_found) +{ + struct stat st; + char *newdir; + char *tmp; + + /* + * If dirname corresponds to a directory that doesn't exist, and this + * path starts with dirname, then path can't exist. + */ + if (!*dir_found && !memcmp(path, *dirname, *dir_len)) + return 0; + + /* + * If path itself exists, return 1. + */ + if (!lstat(path, &st)) + return 1; + + /* + * Otherwise, path does not exist so we'll return 0...but we'll first + * determine some info about its parent directory so we can avoid + * lstat calls for future cache entries. + */ + newdir = strrchr(path, '/'); + if (!newdir) + return 0; /* Didn't find a parent dir; just return 0 now. */ + + /* + * If path starts with directory (which we already lstat'ed and found), + * then no need to lstat parent directory again. + */ + if (*dir_found && *dirname && memcmp(path, *dirname, *dir_len)) + return 0; + + /* Free previous dirname, and cache path's dirname */ + *dirname = path; + *dir_len = newdir - path + 1; + + tmp = xstrndup(path, *dir_len); + *dir_found = !lstat(tmp, &st); + free(tmp); + + return 0; +} + void clear_skip_worktree_from_present_files(struct index_state *istate) { + const char *last_dirname = NULL; + size_t dir_len = 0; + int dir_found = 1; + int i; + if (!core_apply_sparse_checkout) return; restart: for (i = 0; i < istate->cache_nr; i++) { struct cache_entry *ce = istate->cache[i]; - struct stat st; - if (ce_skip_worktree(ce) && !lstat(ce->name, &st)) { + if (ce_skip_worktree(ce) && + path_found(ce->name, &last_dirname, &dir_len, &dir_found)) { if (S_ISSPARSEDIR(ce->ce_mode)) { ensure_full_index(istate); goto restart; -- cgit v1.2.3 From ecc7c8841de40a38637680509f7e1e6ce25d89ca Mon Sep 17 00:00:00 2001 From: Elijah Newren Date: Fri, 25 Feb 2022 22:12:22 -0800 Subject: repo_read_index: add config to expect files outside sparse patterns Typically with sparse checkouts, we expect files outside the sparsity patterns to be marked as SKIP_WORKTREE and be missing from the working tree. Sometimes this expectation would be violated however; including in cases such as: * users grabbing files from elsewhere and writing them to the worktree (perhaps by editing a cached copy in an editor, copying/renaming, or even untarring) * various git commands having incomplete or no support for the SKIP_WORKTREE bit[1,2] * users attempting to "abort" a sparse-checkout operation with a not-so-early Ctrl+C (updating $GIT_DIR/info/sparse-checkout and the working tree is not atomic)[3]. When the SKIP_WORKTREE bit in the index did not reflect the presence of the file in the working tree, it traditionally caused confusion and was difficult to detect and recover from. So, in a sparse checkout, since af6a51875a (repo_read_index: clear SKIP_WORKTREE bit from files present in worktree, 2022-01-14), Git automatically clears the SKIP_WORKTREE bit at index read time for entries corresponding to files that are present in the working tree. There is another workflow, however, where it is expected that paths outside the sparsity patterns appear to exist in the working tree and that they do not lose the SKIP_WORKTREE bit, at least until they get modified. A Git-aware virtual file system[4] takes advantage of its position as a file system driver to expose all files in the working tree, fetch them on demand using partial clone on access, and tell Git to pay attention to them on demand by updating the sparse checkout pattern on writes. This means that commands like "git status" only have to examine files that have potentially been modified, whereas commands like "ls" are able to show the entire codebase without requiring manual updates to the sparse checkout pattern. Thus since af6a51875a, Git with such Git-aware virtual file systems unsets the SKIP_WORKTREE bit for all files and commands like "git status" have to fetch and examine them all. Introduce a configuration setting sparse.expectFilesOutsideOfPatterns to allow limiting the tracked set of files to a small set once again. A Git-aware virtual file system or other application that wants to maintain files outside of the sparse checkout can set this in a repository to instruct Git not to check for the presence of SKIP_WORKTREE files. The setting defaults to false, so most users of sparse checkout will still get the benefit of an automatically updating index to recover from the variety of difficult issues detailed in af6a51875a for paths with SKIP_WORKTREE set despite the path being present. [1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/ [2] The three long paragraphs in the middle of https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/ [3] https://lore.kernel.org/git/CABPp-BFnFpzwGC11TLoLs8YK5yiisA5D5-fFjXnJsbESVDwZsA@mail.gmail.com/ [4] such as the vfsd described in https://lore.kernel.org/git/20220207190320.2960362-1-jonathantanmy@google.com/ Signed-off-by: Jonathan Nieder Signed-off-by: Elijah Newren Reviewed-by: Jonathan Nieder Signed-off-by: Junio C Hamano --- sparse-index.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'sparse-index.c') diff --git a/sparse-index.c b/sparse-index.c index eed170cd8f..daeb5112a1 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -396,7 +396,8 @@ void clear_skip_worktree_from_present_files(struct index_state *istate) int i; - if (!core_apply_sparse_checkout) + if (!core_apply_sparse_checkout || + sparse_expect_files_outside_of_patterns) return; restart: -- cgit v1.2.3