Welcome to mirror list, hosted at ThFree Co, Russian Federation.

git.kernel.org/pub/scm/git/git.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-06-25t7701: make annotated tag unreachableTaylor Blau
In 4dc16e2cb0 (gc: introduce `gc.recentObjectsHook`, 2023-06-07), we added tests to ensure that prune-able (i.e. unreachable and with mtime older than the cutoff) objects which are marked as recent via the new `gc.recentObjectsHook` configuration are unpacked as loose with `--unpack-unreachable`. In that test, we also ensure that objects which are reachable from other unreachable objects which were *not* pruned are kept as well, regardless of their mtimes. For this, we use an annotated tag pointing at a blob ($obj2) which would otherwise be pruned. But after pruning, that object is kept around for two reasons. One, the tag object's mtime wasn't adjusted to be beyond the 1-hour cutoff, so it would be kept as due to its recency regardless. The other reason is because the tag itself is reachable. Use mktag to write the tag object directly without pointing a reference at it, and adjust the mtime of the tag object to be older than the cutoff to ensure that our `gc.recentObjectsHook` configuration is working as intended. Noticed-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-13gc: introduce `gc.recentObjectsHook`Taylor Blau
This patch introduces a new multi-valued configuration option, `gc.recentObjectsHook` as a means to mark certain objects as recent (and thus exempt from garbage collection), regardless of their age. When performing a garbage collection operation on a repository with unreachable objects, Git makes its decision on what to do with those object(s) based on how recent the objects are or not. Generally speaking, unreachable-but-recent objects stay in the repository, and older objects are discarded. However, we have no convenient way to keep certain precious, unreachable objects around in the repository, even if they have aged out and would be pruned. Our options today consist of: - Point references at the reachability tips of any objects you consider precious, which may be undesirable or infeasible if there are many such objects. - Track them via the reflog, which may be undesirable since the reflog's lifetime is limited to that of the reference it's tracking (and callers may want to keep those unreachable objects around for longer). - Extend the grace period, which may keep around other objects that the caller *does* want to discard. - Manually modify the mtimes of objects you want to keep. If those objects are already loose, this is easy enough to do (you can just enumerate and `touch -m` each one). But if they are packed, you will either end up modifying the mtimes of *all* objects in that pack, or be forced to write out a loose copy of that object, both of which may be undesirable. Even worse, if they are in a cruft pack, that requires modifying its `*.mtimes` file by hand, since there is no exposed plumbing for this. - Force the caller to construct the pack of objects they want to keep themselves, and then mark the pack as kept by adding a ".keep" file. This works, but is burdensome for the caller, and having extra packs is awkward as you roll forward your cruft pack. This patch introduces a new option to the above list via the `gc.recentObjectsHook` configuration, which allows the caller to specify a program (or set of programs) whose output is treated as a set of objects to treat as recent, regardless of their true age. The implementation is straightforward. Git enumerates recent objects via `add_unseen_recent_objects_to_traversal()`, which enumerates loose and packed objects, and eventually calls add_recent_object() on any objects for which `want_recent_object()`'s conditions are met. This patch modifies the recency condition from simply "is the mtime of this object more recent than the cutoff?" to "[...] or, is this object mentioned by at least one `gc.recentObjectsHook`?". Depending on whether or not we are generating a cruft pack, this allows the caller to do one of two things: - If generating a cruft pack, the caller is able to retain additional objects via the cruft pack, even if they would have otherwise been pruned due to their age. - If not generating a cruft pack, the caller is likewise able to retain additional objects as loose. A potential alternative here is to introduce a new mode to alter the contents of the reachable pack instead of the cruft one. One could imagine a new option to `pack-objects`, say `--extra-reachable-tips` that does the same thing as above, adding the visited set of objects along the traversal to the pack. But this has the unfortunate side-effect of altering the reachability closure of that pack. If parts of the unreachable object graph mentioned by one or more of the "extra reachable tips" programs is not closed, then the resulting pack won't be either. This makes it impossible in the general case to write out reachability bitmaps for that pack, since closure is a requirement there. Instead, keep these unreachable objects in the cruft pack (or set of unreachable, loose objects) instead, to ensure that we can continue to have a pack containing just reachable objects, which is always safe to write a bitmap over. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-07tests: mark tests as passing with SANITIZE=leakÆvar Arnfjörð Bjarmason
When the "ab/various-leak-fixes" topic was merged in [1] only t6021 would fail if the tests were run in the "GIT_TEST_PASSING_SANITIZE_LEAK=check" mode, i.e. to check whether we marked all leak-free tests with "TEST_PASSES_SANITIZE_LEAK=true". Since then we've had various tests starting to pass under SANITIZE=leak. Let's mark those as passing, this is when they started to pass, narrowed down with "git bisect": - t5317-pack-objects-filter-objects.sh: In faebba436e6 (list-objects-filter: plug pattern_list leak, 2022-12-01). - t3210-pack-refs.sh, t5613-info-alternate.sh, t7403-submodule-sync.sh: In 189e97bc4ba (diff: remove parseopts member from struct diff_options, 2022-12-01). - t1408-packed-refs.sh: In ab91f6b7c42 (Merge branch 'rs/diff-parseopts', 2022-12-19). - t0023-crlf-am.sh, t4152-am-subjects.sh, t4254-am-corrupt.sh, t4256-am-format-flowed.sh, t4257-am-interactive.sh, t5403-post-checkout-hook.sh: In a658e881c13 (am: don't pass strvec to apply_parse_options(), 2022-12-13) - t1301-shared-repo.sh, t1302-repo-version.sh: In b07a819c05f (reflog: clear leftovers in reflog_expiry_cleanup(), 2022-12-13). - t1304-default-acl.sh, t1410-reflog.sh, t5330-no-lazy-fetch-with-commit-graph.sh, t5502-quickfetch.sh, t5604-clone-reference.sh, t6014-rev-list-all.sh, t7701-repack-unpack-unreachable.sh: In b0c61be3209 (Merge branch 'rs/reflog-expiry-cleanup', 2022-12-26) - t3800-mktag.sh, t5302-pack-index.sh, t5306-pack-nobase.sh, t5573-pull-verify-signatures.sh, t7612-merge-verify-signatures.sh: In 69bbbe484ba (hash-object: use fsck for object checks, 2023-01-18). - t1451-fsck-buffer.sh: In 8e4309038f0 (fsck: do not assume NUL-termination of buffers, 2023-01-19). - t6501-freshen-objects.sh: In abf2bb895b4 (Merge branch 'jk/hash-object-fsck', 2023-01-30) 1. 9ea1378d046 (Merge branch 'ab/various-leak-fixes', 2022-12-14) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21t: convert egrep usage to "grep -E"Đoàn Trần Công Danh
Despite POSIX states that: > The old egrep and fgrep commands are likely to be supported for many > years to come as implementation extensions, allowing historical > applications to operate unmodified. GNU grep 3.8 started to warn[1]: > The egrep and fgrep commands, which have been deprecated since > release 2.5.3 (2007), now warn that they are obsolescent and should > be replaced by grep -E and grep -F. Prepare for their removal in the future. [1]: https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-20t7[5-9]*: adjust the references to the default branch name "main"Johannes Schindelin
Excluding t7817, which is added in an unrelated patch series at the time of writing, this adjusts t7[5-9]*. This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t7[5-9]*.sh) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-20tests: mark tests relying on the current default for `init.defaultBranch`Johannes Schindelin
In addition to the manual adjustment to let the `linux-gcc` CI job run the test suite with `master` and then with `main`, this patch makes sure that GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME is set in all test scripts that currently rely on the initial branch name being `master by default. To determine which test scripts to mark up, the first step was to force-set the default branch name to `master` in - all test scripts that contain the keyword `master`, - t4211, which expects `t/t4211/history.export` with a hard-coded ref to initialize the default branch, - t5560 because it sources `t/t556x_common` which uses `master`, - t8002 and t8012 because both source `t/annotate-tests.sh` which also uses `master`) This trick was performed by this command: $ sed -i '/^ *\. \.\/\(test-lib\|lib-\(bash\|cvs\|git-svn\)\|gitweb-lib\)\.sh$/i\ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\ ' $(git grep -l master t/t[0-9]*.sh) \ t/t4211*.sh t/t5560*.sh t/t8002*.sh t/t8012*.sh After that, careful, manual inspection revealed that some of the test scripts containing the needle `master` do not actually rely on a specific default branch name: either they mention `master` only in a comment, or they initialize that branch specificially, or they do not actually refer to the current default branch. Therefore, the aforementioned modification was undone in those test scripts thusly: $ git checkout HEAD -- \ t/t0027-auto-crlf.sh t/t0060-path-utils.sh \ t/t1011-read-tree-sparse-checkout.sh \ t/t1305-config-include.sh t/t1309-early-config.sh \ t/t1402-check-ref-format.sh t/t1450-fsck.sh \ t/t2024-checkout-dwim.sh \ t/t2106-update-index-assume-unchanged.sh \ t/t3040-subprojects-basic.sh t/t3301-notes.sh \ t/t3308-notes-merge.sh t/t3423-rebase-reword.sh \ t/t3436-rebase-more-options.sh \ t/t4015-diff-whitespace.sh t/t4257-am-interactive.sh \ t/t5323-pack-redundant.sh t/t5401-update-hooks.sh \ t/t5511-refspec.sh t/t5526-fetch-submodules.sh \ t/t5529-push-errors.sh t/t5530-upload-pack-error.sh \ t/t5548-push-porcelain.sh \ t/t5552-skipping-fetch-negotiator.sh \ t/t5572-pull-submodule.sh t/t5608-clone-2gb.sh \ t/t5614-clone-submodules-shallow.sh \ t/t7508-status.sh t/t7606-merge-custom.sh \ t/t9302-fast-import-unpack-limit.sh We excluded one set of test scripts in these commands, though: the range of `git p4` tests. The reason? `git p4` stores the (foreign) remote branch in the branch called `p4/master`, which is obviously not the default branch. Manual analysis revealed that only five of these tests actually require a specific default branch name to pass; They were modified thusly: $ sed -i '/^ *\. \.\/lib-git-p4\.sh$/i\ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\ ' t/t980[0167]*.sh t/t9811*.sh Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-25Merge branch 'ps/test-chmtime-get'Junio C Hamano
Test cleanup. * ps/test-chmtime-get: t/helper: 'test-chmtime (--get|-g)' to print only the mtime
2018-04-09t/helper: 'test-chmtime (--get|-g)' to print only the mtimePaul-Sebastian Ungureanu
Compared to 'test-chmtime -v +0 file' which prints the mtime and and the file name, 'test-chmtime --get file' displays only the mtime. If it is used in combination with (+|=|=+|=-|-)seconds, it changes and prints the new value. test-chmtime -v +0 file | sed 's/[^0-9].*$//' is now equivalent to: test-chmtime --get file Signed-off-by: Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27t/helper: merge test-chmtime into test-toolNguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-14repack: extend --keep-unreachable to loose objectsJeff King
If you use "repack -adk" currently, we will pack all objects that are already packed into the new pack, and then drop the old packs. However, loose unreachable objects will be left as-is. In theory these are meant to expire eventually with "git prune". But if you are using "repack -k", you probably want to keep things forever and therefore do not run "git prune" at all. Meaning those loose objects may build up over time and end up fooling any object-count heuristics (such as the one done by "gc --auto", though since git-gc does not support "repack -k", this really applies to whatever custom scripts people might have driving "repack -k"). With this patch, we instead stuff any loose unreachable objects into the pack along with the already-packed unreachable objects. This may seem wasteful, but it is really no more so than using "repack -k" in the first place. We are at a slight disadvantage, in that we have no useful ordering for the result, or names to hand to the delta code. However, this is again no worse than what "repack -k" is already doing for the packed objects. The packing of these objects doesn't matter much because they should not be accessed frequently (unless they actually _do_ become referenced, but then they would get moved to a different part of the packfile during the next repack). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-14repack: add --keep-unreachable optionJeff King
The usual way to do a full repack (and what is done by git-gc) is to run "repack -Ad --unpack-unreachable=<when>", which will loosen any unreachable objects newer than "<when>", and drop any older ones. This is a safer alternative to "repack -ad", because "<when>" becomes a grace period during which we will not drop any new objects that are about to be referenced. However, it isn't perfectly safe. It's always possible that a process is about to reference an old object. Even if that process were to take care to update the timestamp on the object, there is no atomicity with a simultaneously running "repack" process. So while unlikely, there is a small race wherein we may drop an object that is in the process of being referenced. If you do automated repacking on a large number of active repositories, you may hit it eventually, and the result is a corrupted repository. It would be nice to fix that race in the long run, but it's complicated. In the meantime, there is a much simpler strategy for automated repository maintenance: do not drop objects at all. We already have a "--keep-unreachable" option in pack-objects; we just need to plumb it through from git-repack. Note that this _isn't_ plumbed through from git-gc, so at this point it's strictly a tool for people doing their own advanced repository maintenance strategy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-25t7701: fix ignored exit code inside loopJeff King
When checking a list of file mtimes, we use a loop and break out early from the loop if any entry does not match. However, the exit code of a loop exited via break is always 0, meaning that the test will fail to notice we had a mismatch. Since the loop is inside a function, we can fix this by doing an early "return 1". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-20repack: pack objects mentioned by the indexJeff King
When we pack all objects, we use only the objects reachable from references and reflogs. This misses any objects which are reachable from the index, but not yet referenced. By itself this isn't a big deal; the objects can remain loose until they are actually used in a commit. However, it does create a problem when we drop packed but unreachable objects. We try to optimize out the writing of objects that we will immediately prune, which means we must follow the same rules as prune in determining what is reachable. And prune uses the index for this purpose. This is rather uncommon in practice, as objects in the index would not usually have been packed in the first place. But it could happen in a sequence like: 1. You make a commit on a branch that references blob X. 2. You repack, moving X into the pack. 3. You delete the branch (and its reflog), so that X is unreferenced. 4. You "git add" blob X so that it is now referenced only by the index. 5. You repack again with git-gc. The pack-objects we invoke will see that X is neither referenced nor recent and not bother loosening it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-11gc: do not explode objects which will be immediately prunedJeff King
When we pack everything into one big pack with "git repack -Ad", any unreferenced objects in to-be-deleted packs are exploded into loose objects, with the intent that they will be examined and possibly cleaned up by the next run of "git prune". Since the exploded objects will receive the mtime of the pack from which they come, if the source pack is old, those loose objects will end up pruned immediately. In that case, it is much more efficient to skip the exploding step entirely for these objects. This patch teaches pack-objects to receive the expiration information and avoid writing these objects out. It also teaches "git gc" to pass the value of gc.pruneexpire to repack (which in turn learns to pass it along to pack-objects) so that this optimization happens automatically during "git gc" and "git gc --auto". Signed-off-by: Jeff King <peff@peff.net> Acked-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-18more war on "sleep" in testsJunio C Hamano
Two more tests that sleep only to waste tick can be converted to use test_tick and take expiry parameters relative to $test_tick. The basic idea is to replace "sleep 1" with "test_tick" to cause the "time" to pass. These tests are interested in expiring things with "now" as the timestamp, soo use a timestamp relative to $test_tick to give them more stability and reproducibility. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-28Windows: Fix intermittent failures of t7701Johannes Sixt
The last test case checks whether unpacked objects receive the time stamp of the pack file. Due to different implementations of stat(2) by MSYS and our version in compat/mingw.c, the test fails in about half of the test runs. Note the following facts: - The test uses perl's -M operator to compare the time stamps. Since we depend on MSYS perl, the result of this operator is based on MSYS's implementation of the stat(2) call. - NTFS on Windows records fractional seconds. - The MSYS implementation of stat(2) *rounds* fractional seconds to full seconds instead of truncating them. This becomes obvious by comparing the modification times reported by 'ls --full-time $f' and 'stat $f' for various files $f. - Our implementation of stat(2) in compat/mingw.c *truncates* to full seconds. The consequence of this is that - add_packed_git() picks up a truncated whole second modification time from the pack file time stamp, which is then used for the loose objects, while the pack file retains its time stamp in fractional seconds; - but the test case compared the pack file's rounded modification times to the loose objects' truncated modification times. And half of the time the rounded modification time is not the same as its truncated modification time. The fix is that we replace perl by 'test-chmtime -v +0', which prints the truncated whole-second mtime without modifying it. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-16Merge branch 'bc/maint-keep-pack'Junio C Hamano
* bc/maint-keep-pack: repack: only unpack-unreachable if we are deleting redundant packs
2008-11-15repack: only unpack-unreachable if we are deleting redundant packsBrandon Casey
The -A option calls pack-objects with the --unpack-unreachable option so that the unreachable objects in local packs are left in the local object store loose. But if the -d option to repack was _not_ used, then these unpacked loose objects are redundant and unnecessary. Update tests in t7701. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-30tests: grep portability fixesJeff King
We try to avoid using the "-q" or "-e" options, as they are largely useless, as explained in aadbe44f. There is one exception for "-e" here, which is in t7701 used to produce an "or" of patterns. This can be rewritten as an egrep pattern. This patch also removes use of "grep -F" in favor of the more widely available "fgrep". [sp: Tested on AIX 5.3 by Mike Ralphson, Tested on MinGW by Johannes Sixt] Signed-off-by: Jeff King <peff@peff.net> Tested-by: Mike Ralphson <mike@abacus.co.uk> Tested-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-09-04tests: use "git xyzzy" form (t7200 - t9001)Nanako Shiraishi
Converts tests between t7201-t9001. Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-09t7701-repack-unpack-unreachable.sh: check timestamp of unpacked objectsBrandon Casey
Unpacked objects should receive the timestamp of the pack they were unpacked from. Check. Signed-off-by: Brandon Casey <drafnel@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-11repack: modify behavior of -A option to leave unreferenced objects unpackedBrandon Casey
The previous behavior of the -A option was to retain any previously packed objects which had become unreferenced, and place them into the newly created pack file. Since git-gc, when run automatically with the --auto option, calls repack with the -A option, this had the effect of retaining unreferenced packed objects indefinitely. To avoid this scenario, the user was required to run git-gc with the little known --prune option or to manually run repack with the -a option. This patch changes the behavior of the -A option so that unreferenced objects that exist in any pack file being replaced, will be unpacked into the repository. The unreferenced loose objects can then be garbage collected by git-gc (i.e. git-prune) based on the gc.pruneExpire setting. Also add new tests for checking whether unreferenced objects which were previously packed are properly left in the repository unpacked after repacking. Signed-off-by: Brandon Casey <drafnel@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>