Welcome to mirror list, hosted at ThFree Co, Russian Federation.

git.kernel.org/pub/scm/git/git.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-11-29Merge branch 'gc/resolve-alternate-symlinks'Junio C Hamano
Resolve symbolic links when processing the locations of alternate object stores, since failing to do so can lead to confusing and buggy behavior. * gc/resolve-alternate-symlinks: object-file: use real paths when adding alternates
2022-11-25object-file: use real paths when adding alternatesGlen Choo
When adding an alternate ODB, we check if the alternate has the same path as the object dir, and if so, we do nothing. However, that comparison does not resolve symlinks. This makes it possible to add the object dir as an alternate, which may result in bad behavior. For example, it can trick "git repack -a -l -d" (possibly run by "git gc") into thinking that all packs come from an alternate and delete all objects. rm -rf test && git clone https://github.com/git/git test && ( cd test && ln -s objects .git/alt-objects && # -c repack.updateserverinfo=false silences a warning about not # being able to update "info/refs", it isn't needed to show the # bad behavior GIT_ALTERNATE_OBJECT_DIRECTORIES=".git/alt-objects" git \ -c repack.updateserverinfo=false repack -a -l -d && # It's broken! git status # Because there are no more objects! ls .git/objects/pack ) Fix this by resolving symlinks and relative paths before comparing the alternate and object dir. This lets us clean up a number of issues noted in 37a95862c6 (alternates: re-allow relative paths from environment, 2016-11-07): - Now that we compare the real paths, duplicate detection is no longer foiled by relative paths. - Using strbuf_realpath() allows us to "normalize" paths that strbuf_normalize_path() can't, so we can stop silently ignoring errors when "normalizing" paths from the environment. - We now store an absolute path based on getcwd() (the "future direction" named in 37a95862c6), so chdir()-ing in the process no longer changes the directory pointed to by the alternate. This is a change in behavior, but a desirable one. Signed-off-by: Glen Choo <chooglen@google.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-19Merge branch 'tb/repack-expire-to'Taylor Blau
"git repack" learns to send cruft objects out of the way into packfiles outside the repository. * tb/repack-expire-to: builtin/repack.c: implement `--expire-to` for storing pruned objects builtin/repack.c: write cruft packs to arbitrary locations builtin/repack.c: pass "cruft_expiration" to `write_cruft_pack` builtin/repack.c: pass "out" to `prepare_pack_objects`
2022-10-31Merge branch 'jk/repack-tempfile-cleanup'Taylor Blau
The way "git repack" creared temporary files when it received a signal was prone to deadlocking, which has been corrected. * jk/repack-tempfile-cleanup: t7700: annotate cruft-pack failure with ok=sigpipe repack: drop remove_temporary_files() repack: use tempfiles for signal cleanup repack: expand error message for missing pack files repack: populate extension bits incrementally repack: convert "names" util bitfield to array
2022-10-28Merge branch 'tb/remove-unused-pack-bitmap'Junio C Hamano
When creating a multi-pack bitmap, remove per-pack bitmap files unconditionally as they will never be consulted. * tb/remove-unused-pack-bitmap: builtin/repack.c: remove redundant pack-based bitmaps
2022-10-24builtin/repack.c: implement `--expire-to` for storing pruned objectsTaylor Blau
When pruning objects with `--cruft`, `git repack` offers some flexibility when selecting the set of which objects are pruned via the `--cruft-expiration` option. This is useful for expiring objects which are older than the grace period, making races where to-be-pruned objects become reachable and then ancestors of freshly pushed objects, leaving the repository in a corrupt state after pruning substantially less likely [1]. But in practice, such races are impossible to avoid entirely, no matter how long the grace period is. To prevent this race, it is often advisable to temporarily put a repository into a read-only state. But in practice, this is not always practical, and so some middle ground would be nice. This patch introduces a new option, `--expire-to`, which teaches `git repack` to write an additional cruft pack containing just the objects which were pruned from the repository. The caller can specify a directory outside of the current repository as the destination for this second cruft pack. This makes it possible to prune objects from a repository, while still holding onto a supplemental copy of them outside of the original repository. Having this copy on-disk makes it substantially easier to recover objects when the aforementioned race is encountered. `--expire-to` is implemented in a somewhat convoluted manner, which is to take advantage of the fact that the first time `write_cruft_pack()` is called, it adds the name of the cruft pack to the `names` string list. That means the second time we call `write_cruft_pack()`, objects in the previously-written cruft pack will be excluded. As long as the caller ensures that no objects are expired during the second pass, this is sufficient to generate a cruft pack containing all objects which don't appear in any of the new packs written by `git repack`, including the cruft pack. In other words, all of the objects which are about to be pruned from the repository. It is important to note that the destination in `--expire-to` does not necessarily need to be a Git repository (though it can be) Notably, the expired packs do not contain all ancestors of expired objects. So if the source repository contains something like: <unreachable> / C1 --- C2 \ refs/heads/master where C2 is unreachable, but has a parent (C1) which is reachable, and C2 would be pruned, then the expiry pack will contain only C2, not C1. [1]: https://lore.kernel.org/git/20190319001829.GL29661@sigill.intra.peff.net/ Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-23t7700: annotate cruft-pack failure with ok=sigpipeJeff King
One of our tests intentionally causes the cruft-pack generation phase of repack to fail, in order to stimulate an exit from repack at the desired moment. It does so by feeding a bogus option argument to pack-objects. This is a simple and reliable way to get pack-objects to fail, but it has one downside: pack-objects will die before reading its stdin, which means the caller repack may racily get SIGPIPE writing to it. For the purposes of this test, that's OK. We are checking whether repack cleans up already-created .tmp files, and it will do so whether it exits or dies by signal (because the tempfile API hooks both). But we have to tell test_must_fail that either outcome is OK, or it complains about the signal. Arguably this is a workaround (compared to fixing repack), as repack dying to SIGPIPE means that it loses the opportunity to give a more detailed message. But we don't actually write such a message anyway; we rely on pack-objects to have written something useful to stderr, and it does. In either case (signal or exit), that is the main thing the user will see. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-22repack: drop remove_temporary_files()Jeff King
After we've successfully finished the repack, we call remove_temporary_files(), which looks for and removes any files matching ".tmp-$$-pack-*", where $$ is the pid of the current process. But this is pointless. If we make it this far in the process, we've already renamed these tempfiles into place, and there is nothing left to delete. Nor is there a point in trying to call it to clean up when we _aren't_ successful. It's not safe for using in a signal handler, and the previous commit already handed that job over to the tempfile API. It might seem like it would be useful to clean up stray .tmp files left by other invocations of git-repack. But it won't clean those files; it only matches ones with its pid, and leaves the rest. Fortunately, those are cleaned up naturally by successive calls to git-repack; we'll consider .tmp-*.pack the same as normal packfiles, so "repack -ad", etc, will roll up their contents and eventually delete them. The one case that could matter is if pack-objects generates an extension we don't know about, like ".tmp-pack-$$-$hash.some-new-ext". The current code will quietly delete such a file, while after this patch we'd leave it in place. In practice this doesn't happen, and would be indicative of a bug. Leaving the file as cruft is arguably a better behavior, as it means somebody is more likely to eventually notice and fix the bug. If we really wanted to be paranoid, we could scan for and warn about such files, but that seems like overkill. There's nothing to test with regard to the removal of this function. It was doing nothing, so the behavior should be the same. However, we can verify (and protect) our assumption that "repack -ad" will eventually remove stray files by adding a test for that. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-22repack: use tempfiles for signal cleanupJeff King
When git-repack exits due to a signal, it tries to clean up by calling its remove_temporary_files() function, which walks through the packs dir looking for ".tmp-$$-pack-*" files to delete (where "$$" is the pid of the current process). The biggest problem here is that remove_temporary_files() is not safe to call in a signal handler. It uses opendir(), which isn't on the POSIX async-signal-safe list. The details will be platform-specific, but a likely issue is that it needs to allocate memory; if we receive a signal while inside malloc(), etc, we'll conflict on the allocator lock and deadlock with ourselves. We can fix this by just cleaning up the files directly, without walking the directory. We already know the complete list of .tmp-* files that were generated, because we recorded them via populate_pack_exts(). When we find files there, we can use register_tempfile() to record the filenames. If we receive a signal, then the tempfile API will clean them up for us, and it's async-safe and pretty battle-tested. Note that this is slightly racier than the existing scheme. We don't record the filenames until pack-objects tells us the hash over stdout. So during the period between it generating the file and reporting the hash, we'd fail to clean up. However, that period is very small. During most of the pack generation process pack-objects is using its own internal tempfiles. It's only at the very end that it moves them into the names git-repack expects, and then it immediately reports the name to us. Given that cleanup like this is best effort (after all, we may get SIGKILL), this level of race is acceptable. When we register the tempfiles, we'll record them locally and use the result to call rename_tempfile(), rather than renaming by hand. This isn't strictly necessary, as once we've renamed the files they're gone, and the tempfile API's cleanup unlink() would simply become a pointless noop. But managing the lifetimes of the tempfile objects is the cleanest thing to do, and the tempfile pointers naturally fill the same role as the old booleans. This patch also fixes another small problem. We only hook signals, and don't set up an atexit handler. So if we see an error that causes us to die(), we'll leave the .tmp-* files in place. But since the tempfile API handles this for us, this is now fixed for free. The new test covers this by stimulating a failure of pack-objects when generating a cruft pack. Before this patch, the .tmp-* file for the main pack would have been left, but now we correctly clean it up. Two small subtleties on the implementation: - in the renaming loop, we can stop re-constructing fname_old; we only use it when we have a tempfile to rename, so we can just ask the tempfile for its path (which, barring bugs, should be identical) - when renaming fails, our error message mentions fname_old. But since a failed rename_tempfile() invalidates the tempfile struct, we'll lose access to that string. Instead, let's mention the destination filename, which is what most other callers do. Reported-by: Jan Pokorný <poki@fnusa.cz> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-18repack: don't remove .keep packs with `--pack-kept-objects`Taylor Blau
`git repack` supports a `--pack-kept-objects` flag which more or less translates to whether or not we pass `--honor-pack-keep` down to `git pack-objects` when assembling a new pack. This behavior has existed since ee34a2bead (repack: add `repack.packKeptObjects` config var, 2014-03-03). In that commit, the documentation was extended to say: [...] Note that we still do not delete `.keep` packs after `pack-objects` finishes. Unfortunately, this is not the case when `--pack-kept-objects` is combined with a `--geometric` repack. When doing a geometric repack, we include `.keep` packs when enumerating available packs only when `pack_kept_objects` is set. So this all works fine when `--no-pack-kept-objects` (or similar) is given. Kept packs are excluded from the geometric roll-up, so when we go to delete redundant packs (with `-d`), no `.keep` packs appear "below the split" in our geometric progression. But when `--pack-kept-objects` is given, things can go awry. Namely, when a kept pack is included in the list of packs tracked by the `pack_geometry` struct *and* part of the pack roll-up, we will delete the `.keep` pack when we shouldn't. Note that this *doesn't* result in object corruption, since the `.keep` pack's objects are still present in the new pack. But the `.keep` pack itself is removed, which violates our promise from back in ee34a2bead. But there's more. Because `repack` computes the geometric roll-up independently from selecting which packs belong in a MIDX (with `--write-midx`), this can lead to odd behavior. Consider when a `.keep` pack appears below the geometric split (ie., its objects will be part of the new pack we generate). We'll write a MIDX containing the new pack along with the existing `.keep` pack. But because the `.keep` pack appears below the geometric split line, we'll (incorrectly) try to remove it. While this doesn't corrupt the repository, it does cause us to remove the MIDX we just wrote, since removing that pack would invalidate the new MIDX. Funny enough, this behavior became far less noticeable after e4d0c11c04 (repack: respect kept objects with '--write-midx -b', 2021-12-20), which made `pack_kept_objects` be enabled by default only when we were writing a non-MIDX bitmap. But e4d0c11c04 didn't resolve this bug, it just made it harder to notice unless callers explicitly passed `--pack-kept-objects`. The solution is to avoid trying to remove `.keep` packs during `--geometric` repacks, even when they appear below the geometric split line, which is the approach this patch implements. Co-authored-by: Victoria Dye <vdye@github.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-18builtin/repack.c: remove redundant pack-based bitmapsTaylor Blau
When we write a MIDX bitmap after repacking, it is possible that the repository would be left in a state with both pack- and multi-pack reachability bitmaps. This can occur, for instance, if a pack that was kept (either by having a .keep file, or during a geometric repack in which it is not rolled up) has a bitmap file, and the repack wrote a multi-pack index and bitmap. When loading a reachability bitmap for the repository, the multi-pack one is always preferred, so the pack-based one is redundant. Let's remove it unconditionally, even if '-d' isn't passed, since there is no practical reason to keep both around. The patch below does just that. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-25t7700: check post-condition in kept-pack testDerrick Stolee
The '--write-midx -b packs non-kept objects' test in t7700-repack.sh uses test_subcommand_inexact to check that 'git repack' properly adds the '--honor-pack-keep' flag to the 'git pack-objects' subcommand. However, the test_subcommand_inexact helper is more flexible than initially designed, and this instance is the only one that makes use of it: there are additional arguments between 'git pack-objects' and the '--honor-pack-keep' flag. In order to make test_subcommand_inexact more strict, we need to fix this instance. This test checks that 'git repack --write-midx -a -b -d' will create a new pack-file that does not contain the objects within the kept pack. This behavior is possible because of the multi-pack-index bitmap that will bitmap objects against multiple packs. Without --write-midx, the objects in the kept pack would be duplicated so the resulting pack is closed under reachability and bitmaps can be created against it. This is discussed in more detail in e4d0c11c0 (repack: respect kept objects with '--write-midx -b', 2021-12-20) which also introduced this instance of test_subcommand_inexact. To better verify the intended post-conditions while also removing this instance of test_subcommand_inexact, rewrite the test to check the list of packed objects in the kept pack and the list of the objects in the newly-repacked pack-file _other_ than the kept pack. These lists should be disjoint. Be sure to include a non-kept pack-file and loose objects to be extra careful that this is properly behaving with kept packs and not just avoiding repacking all pack-files. Co-authored-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-24Merge branch 'ps/repack-with-server-info'Junio C Hamano
"git repack" learned a new configuration to disable triggering of age-old "update-server-info" command, which is rarely useful these days. * ps/repack-with-server-info: repack: add config to skip updating server info repack: refactor to avoid double-negation of update-server-info
2022-03-15repack: add config to skip updating server infoPatrick Steinhardt
By default, git-repack(1) will update server info that is required by the dumb HTTP transport. This can be skipped by passing the `-n` flag, but what we're noticably missing is a config option to permanently disable updating this information. Add a new option "repack.updateServerInfo" which can be used to disable the logic. Most hosting providers have turned off the dumb HTTP protocol anyway, and on the client-side it woudln't typically be useful either. Giving a persistent way to disable this feature thus makes quite some sense to avoid wasting compute cycles and storage. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-15repack: refactor to avoid double-negation of update-server-infoPatrick Steinhardt
By default, git-repack(1) runs `update_server_info()` to generate info required for the dumb HTTP protocol. This can be disabled via the `-n` flag, which then sets the `no_update_server_info` flag. Further down the code this leads to some double-negation logic, which is about to become more confusing as we're about to add a new config which allows the user to permanently disable generation of the info. Refactor the code to avoid the double-negation and add some tests which verify that the flag continues to work as expected. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-17Merge branch 'tb/midx-bitmap-corruption-fix'Junio C Hamano
A bug that made multi-pack bitmap and the object order out-of-sync, making the .midx data corrupt, has been fixed. * tb/midx-bitmap-corruption-fix: pack-bitmap.c: gracefully fallback after opening pack/MIDX midx: read `RIDX` chunk when present t/lib-bitmap.sh: parameterize tests over reverse index source t5326: move tests to t/lib-bitmap.sh t5326: extract `test_rev_exists` t5326: drop unnecessary setup pack-revindex.c: instrument loading on-disk reverse index midx.c: make changing the preferred pack safe t5326: demonstrate bitmap corruption after permutation
2022-01-27midx: read `RIDX` chunk when presentTaylor Blau
When a MIDX contains the new `RIDX` chunk, ensure that the reverse index is read from it instead of the on-disk .rev file. Since we need to encode the object order in the MIDX itself for correctness reasons, there is no point in storing the same data again outside of the MIDX. So, this patch stops writing separate .rev files, and reads it out of the MIDX itself. This is possible to do with relatively little new code, since the format of the RIDX chunk is identical to the data in the .rev file. In other words, we can implement this by pointing the `revindex_data` field at the reverse index chunk of the MIDX instead of the .rev file without any other changes. Note that we have two knobs that are adjusted for the new tests: GIT_TEST_MIDX_WRITE_REV and GIT_TEST_MIDX_READ_RIDX. The former controls whether the MIDX .rev is written at all, and the latter controls whether we read the MIDX's RIDX chunk. Both are necessary to ensure that the test added at the beginning of this series continues to work. This is because we always need to write the RIDX chunk in the MIDX in order to change its checksum, but we want to make sure reading the existing .rev file still works (since the RIDX chunk takes precedence by default). Arguably this isn't a very interesting mode to test, because the precedence rules mean that we'll always read the RIDX chunk over the .rev file. But it makes it impossible for a user to induce corruption in their repository by adjusting the test knobs (since if we had an either/or knob they could stop writing the RIDX chunk, allowing them to tweak the MIDX's object order without changing its checksum). Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-06Merge branch 'ds/repack-fixlets'Junio C Hamano
Two fixes around "git repack". * ds/repack-fixlets: repack: make '--quiet' disable progress repack: respect kept objects with '--write-midx -b'
2021-12-20repack: make '--quiet' disable progressDerrick Stolee
While testing some ideas in 'git repack', I ran it with '--quiet' and discovered that some progress output was still shown. Specifically, the output for writing the multi-pack-index showed the progress. The 'show_progress' variable in cmd_repack() is initialized with isatty(2) and is not modified at all by the '--quiet' flag. The '--quiet' flag modifies the po_args.quiet option which is translated into a '--quiet' flag for the 'git pack-objects' child process. However, 'show_progress' is used to directly send progress information to the multi-pack-index writing logic which does not use a child process. The fix here is to modify 'show_progress' to be false if po_opts.quiet is true, and isatty(2) otherwise. This new expectation simplifies a later condition that checks both. Update the documentation to make it clear that '-q' will disable all progress in addition to ensuring the 'git pack-objects' child process will receive the flag. Use 'test_terminal' to check that this works to get around the isatty(2) check. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-20repack: respect kept objects with '--write-midx -b'Derrick Stolee
Historically, we needed a single packfile in order to have reachability bitmaps. This introduced logic that when 'git repack' had a '-b' option that we should stop sending the '--honor-pack-keep' option to the 'git pack-objects' child process, ensuring that we create a packfile containing all reachable objects. In the world of multi-pack-index bitmaps, we no longer need to repack all objects into a single pack to have valid bitmaps. Thus, we should continue sending the '--honor-pack-keep' flag to 'git pack-objects'. The fix is very simple: only disable the flag when writing bitmaps but also _not_ writing the multi-pack-index. This opens the door to new repacking strategies that might want to keep some historical set of objects in a stable pack-file while only repacking more recent objects. To test, create a new 'test_subcommand_inexact' helper that is more flexible than 'test_subcommand'. This allows us to look for the --honor-pack-keep flag without over-indexing on the exact set of arguments. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13t6000-t9999: detect and signal failure within loopEric Sunshine
Failures within `for` and `while` loops can go unnoticed if not detected and signaled manually since the loop itself does not abort when a contained command fails, nor will a failure necessarily be detected when the loop finishes since the loop returns the exit code of the last command it ran on the final iteration, which may not be the command which failed. Therefore, detect and signal failures manually within loops using the idiom `|| return 1` (or `|| exit 1` within subshells). Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-02builtin/repack.c: pass `--refs-snapshot` when writing bitmapsTaylor Blau
To prevent the race described in an earlier patch, generate and pass a reference snapshot to the multi-pack bitmap code, if we are writing one from `git repack`. This patch is mostly limited to creating a temporary file, and then calling for_each_ref(). Except we try to minimize duplicates, since doing so can drastically reduce the size in network-of-forks style repositories. In the kernel's fork network (the repository containing all objects from the kernel and all its forks), deduplicating the references drops the snapshot size from 934 MB to just 12 MB. But since we're handling duplicates in this way, we have to make sure that we preferred references (those listed in pack.preferBitmapTips) before non-preferred ones (to avoid recording an object which is pointed at by a preferred tip as non-preferred). We accomplish this by doing separate passes over the references: first visiting each prefix in pack.preferBitmapTips, and then over the rest of the references. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-29builtin/repack.c: support writing a MIDX while repackingTaylor Blau
Teach `git repack` a new `--write-midx` option for callers that wish to persist a multi-pack index in their repository while repacking. There are two existing alternatives to this new flag, but they don't cover our particular use-case. These alternatives are: - Call 'git multi-pack-index write' after running 'git repack', or - Set 'GIT_TEST_MULTI_PACK_INDEX=1' in your environment when running 'git repack'. The former works, but introduces a gap in bitmap coverage between repacking and writing a new MIDX (since the repack may have deleted a pack included in the existing MIDX, invalidating it altogether). Setting the 'GIT_TEST_' environment variable is obviously unsupported. In fact, even if it were supported officially, it still wouldn't work, because it generates the MIDX *after* redundant packs have been dropped, leading to the same issue as above. Introduce a new option which eliminates this race by teaching `git repack` to generate the MIDX at the critical point: after the new packs have been written and moved into place, but before the redundant packs have been removed. This option is compatible with `git repack`'s '--bitmap' option (it changes the interpretation to be: "write a bitmap corresponding to the MIDX after one has been generated"). There is a little bit of additional noise in the patch below to avoid repeating ourselves when selecting which packs to delete. Instead of a single loop as before (where we iterate over 'existing_packs', decide if a pack is worth deleting, and if so, delete it), we have two loops (the first where we decide which ones are worth deleting, and the second where we actually do the deleting). This makes it so we have a single check we can make consistently when (1) telling the MIDX which packs we want to exclude, and (2) actually unlinking the redundant packs. There is also a tiny change to short-circuit the body of write_midx_included_packs() when no packs remain in the case of an empty repository. The MIDX code does not handle this, so avoid trying to generate a MIDX covering zero packs in the first place. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01t7700: update to work with MIDX bitmap test knobTaylor Blau
A number of these tests are focused only on pack-based bitmaps and need to be updated to disable 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' where necessary. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-05t7700: stop losing return codes of git commandsDenton Liu
In a pipe, only the return code of the last command is used. Thus, all other commands will have their return codes masked. Rewrite pipes so that there are no git commands upstream so that we will know if a command fails. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-05t7700: make references to SHA-1 genericDenton Liu
Make the test more hash-agnostic by renaming variables from "sha1" to some variation of "oid" or "packid". Also, replace the regex, `[0-9a-f]\{40\}` with `$OID_REGEX`. A better name for "incrpackid" (incremental pack-id) might have been just "packid". However, later in the test suite, we have other uses of "packid". Although the scopes of these variables don't conflict, a future developer may think that commit_and_pack() and test_has_duplicate_object() are semantically related somehow since they share the same variable name. Give them distinct names so that it's clear these uses are unrelated. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-05t7700: replace egrep with grepDenton Liu
The egrep expressions in this test suite were of the form `^$variable`. Although egrep works just fine, it's overkill since we're not using any extended regex. Replace egrep invocations with grep so that we aren't swatting flies with a sledgehammer. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-05t7700: consolidate code into test_has_duplicate_object()Denton Liu
The code to test that objects were not duplicated from the packfile was duplicated many times. Extract the duplicated code into test_has_duplicate_object() and use that instead. Refactor the resulting extraction so that if the git command fails, the return code is not silently lost. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-05t7700: consolidate code into test_no_missing_in_packs()Denton Liu
The code to test that objects were not missing from the packfile was duplicated many times. Extract the duplicated code into test_no_missing_in_packs() and use that instead. Refactor the resulting extraction so that if any git commands fail, their return codes are not silently lost. Instead of verifying each file of `alt_objects/pack/*.idx` individually in a for-loop, batch them together into one verification step. The original testing construct was O(n^2): it used a grep in a loop to test whether any objects were missing in the packfile. Rewrite this to extract the hash using sed or cut, sort the files, then use `comm -23` so that finding missing lines from the original file is done more efficiently. While we're at it, add a space to `commit_and_pack ()` for style. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-30t7700: s/test -f/test_path_is_file/Denton Liu
Since we have debugging-friendly alternatives to `test -f`, replace instances of `test -f` with `test_path_is_file` so that if a command ever fails, we get better debugging information. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-30t7700: move keywords onto their own lineDenton Liu
The code style for tests is to have statements on their own line if possible. Move keywords onto their own line so that they conform with the test style. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-30t7700: remove spaces after redirect operatorsDenton Liu
For shell scripts, the usual convention is for there to be no space after redirection operators, (e.g. `>file`, not `> file`). Remove these spaces wherever they appear. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-30t7700: drop redirections to /dev/nullDenton Liu
Since output is silenced when running without `-v` and debugging output is useful with `-v`, remove redirections to /dev/null as it is not useful. In one case where the output of stdout is consumed, redirect the output of test_commit to stderr. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-31repack: simplify handling of auto-bitmaps and .keep filesJeff King
Commit 7328482253 (repack: disable bitmaps-by-default if .keep files exist, 2019-06-29) taught repack to prefer disabling bitmaps to duplicating objects (unless bitmaps were asked for explicitly). But there's an easier way to do this: if we keep passing the --honor-pack-keep flag to pack-objects when auto-enabling bitmaps, then pack-objects already makes the same decision (it will disable bitmaps rather than duplicate). Better still, pack-objects can actually decide to do so based not just on the presence of a .keep file, but on whether that .keep file actually impacts the new pack we're making (so if we're racing with a push or fetch, for example, their temporary .keep file will not block us from generating bitmaps if they haven't yet updated their refs). And because repack uses the --write-bitmap-index-quiet flag, we don't have to worry about pack-objects generating confusing warnings when it does see a .keep file. We can confirm this by tweaking the .keep test to check repack's stderr. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-31repack: silence warnings when auto-enabled bitmaps cannot be builtJeff King
Depending on various config options, a full repack may not be able to build a reachability bitmap index (e.g., if pack.packSizeLimit forces us to write multiple packs). In these cases pack-objects may write a warning to stderr. Since 36eba0323d (repack: enable bitmaps by default on bare repos, 2019-03-14), we may generate these warnings even when the user did not explicitly ask for bitmaps. This has two downsides: - it can be confusing, if they don't know what bitmaps are - a daemonized auto-gc will write this to its log file, and the presence of the warning may suppress further auto-gc (until gc.logExpiry has elapsed) Let's have repack communicate to pack-objects that the choice to turn on bitmaps was not made explicitly by the user, which in turn allows pack-objects to suppress these warnings. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-31t7700: clean up .keep file in bitmap-writing testJeff King
After our test snippet finishes, the .keep file is left in place, making it hard to do further tests of the auto-bitmap-writing code (since it suppresses the feature completely). Let's clean it up. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-01repack: disable bitmaps-by-default if .keep files existEric Wong
Bitmaps aren't useful with multiple packs, and users with .keep files ended up with redundant packs when bitmaps got enabled by default in bare repos. So detect when .keep files exist and stop enabling bitmaps by default in that case. Wasteful (but otherwise harmless) race conditions with .keep files documented by Jeff King still apply and there's a chance we'd still end up with redundant data on the FS: https://public-inbox.org/git/20190623224244.GB1100@sigill.intra.peff.net/ v2: avoid subshell in test case, be multi-index aware Fixes: 36eba0323d3288a8 ("repack: enable bitmaps by default on bare repos") Signed-off-by: Eric Wong <e@80x24.org> Helped-by: Jeff King <peff@peff.net> Reported-by: Janos Farkas <chexum@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-18repack: enable bitmaps by default on bare reposEric Wong
A typical use case for bare repos is for serving clones and fetches to clients. Enable bitmaps by default on bare repos to make it easier for admins to host git repos in a performant way. Signed-off-by: Eric Wong <e@80x24.org> Helped-by: Jeff King <peff@peff.net> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-16repack: add --keep-pack optionNguyễn Thái Ngọc Duy
We allow to keep existing packs by having companion .keep files. This is helpful when a pack is permanently kept. In the next patch, git-gc just wants to keep a pack temporarily, for one pack-objects run. git-gc can use --keep-pack for this use case. A note about why the pack_keep field cannot be reused and pack_keep_in_core has to be added. This is about the case when --keep-pack is specified together with either --keep-unreachable or --unpack-unreachable, but --honor-pack-keep is NOT specified. In this case, we want to exclude objects from the packs specified on command line, not from ones with .keep files. If only one bit flag is used, we have to clear pack_keep on pack files with the .keep file. But we can't make any assumption about unreachable objects in .keep packs. If "pack_keep" field is false for .keep packs, we could potentially pull lots of unreachable objects into the new pack, or unpack them loose. The safer approach is ignore all packs with either .keep file or --keep-pack. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-16t7700: have closing quote of a test at the beginning of lineNguyễn Thái Ngọc Duy
The closing quote of a test body by convention is always at the start of line. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-08t/t7700-repack.sh: use the $( ... ) construct for command substitutionElia Pinto
The Git CodingGuidelines prefer the $(...) construct for command substitution instead of using the backquotes `...`. The backquoted form is the traditional method for command substitution, and is supported by POSIX. However, all but the simplest uses become complicated quickly. In particular, embedded command substitutions and/or the use of double quotes require careful escaping with the backslash character. The patch was generated by: for _f in $(find . -name "*.sh") do perl -i -pe 'BEGIN{undef $/;} s/`(.+?)`/\$(\1)/smg' "${_f}" done and then carefully proof-read. Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-25Merge branch 'jk/repack-pack-writebitmaps-config'Junio C Hamano
* jk/repack-pack-writebitmaps-config: t7700: drop explicit --no-pack-kept-objects from .keep test repack: introduce repack.writeBitmaps config option repack: simplify handling of --write-bitmap-index pack-objects: stop respecting pack.writebitmaps
2014-06-25Merge branch 'jk/repack-pack-keep-objects'Junio C Hamano
Recent updates to "git repack" started to duplicate objects that are in packfiles marked with .keep flag into the new packfile by mistake. * jk/repack-pack-keep-objects: repack: s/write_bitmap/&s/ in code repack: respect pack.writebitmaps repack: do not accidentally pack kept objects by default
2014-06-13t7700: drop explicit --no-pack-kept-objects from .keep testJeff King
We want to make sure that the default behavior of git-repack, without any options, continues to treat .keep files as it always has. Adding an explicit --no-pack-kept-objects, as ee34a2b did, is a much less interesting test, and prevented us from noticing the bug fixed by 64d3dc9 (repack: do not accidentally pack kept objects by default, 2014-06-10). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-11repack: introduce repack.writeBitmaps config optionJeff King
We currently have pack.writeBitmaps, which originally operated at the pack-objects level. This should really have been a repack.* option from day one. Let's give it the more sensible name, but keep the old version as a deprecated synonym. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-11repack: respect pack.writebitmapsJeff King
The config option to turn on bitmaps is read all the way down in the plumbing of pack-objects. This makes it hard for other options in the porcelain of repack to make decisions based on the bitmap setting. For example, repack.packKeptObjects tries to kick in by default only when bitmaps are turned on. But it can't do so reliably because it doesn't yet know whether we are using bitmaps. This patch teaches repack to respect pack.writebitmaps. It means we pass a redundant command-line flag to pack-objects, but that's OK; it shouldn't affect the outcome. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-11repack: do not accidentally pack kept objects by defaultJeff King
Commit ee34a2b (repack: add `repack.packKeptObjects` config var, 2014-03-03) added a flag which could duplicate kept objects, but did not mean to turn it on by default. Instead, the option is tied by default to the decision to write bitmaps, like: if (pack_kept_objects < 0) pack_kept_objects = write_bitmap; after which we expect pack_kept_objects to be a boolean 0 or 1. However, that assignment neglects that write_bitmap is _also_ a tri-state with "-1" as the default, and with neither option given, we accidentally turn the option on. This patch is the minimal fix to restore the desired behavior for the default state. Further patches will fix the more complicated cases. Note the update to t7700. It failed to turn on bitmaps, meaning we were actually confirming the wrong behavior! Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-19Merge branch 'jk/repack-pack-keep-objects'Junio C Hamano
* jk/repack-pack-keep-objects: repack: add `repack.packKeptObjects` config var
2014-03-04repack: add `repack.packKeptObjects` config varJeff King
The git-repack command always passes `--honor-pack-keep` to pack-objects. This has traditionally been a good thing, as we do not want to duplicate those objects in a new pack, and we are not going to delete the old pack. However, when bitmaps are in use, it is important for a full repack to include all reachable objects, even if they may be duplicated in a .keep pack. Otherwise, we cannot generate the bitmaps, as the on-disk format requires the set of objects in the pack to be fully closed. Even if the repository does not generally have .keep files, a simultaneous push could cause a race condition in which a .keep file exists at the moment of a repack. The repack may try to include those objects in one of two situations: 1. The pushed .keep pack contains objects that were already in the repository (e.g., blobs due to a revert of an old commit). 2. Receive-pack updates the refs, making the objects reachable, but before it removes the .keep file, the repack runs. In either case, we may prefer to duplicate some objects in the new, full pack, and let the next repack (after the .keep file is cleaned up) take care of removing them. This patch introduces both a command-line and config option to disable the `--honor-pack-keep` option. By default, it is triggered when pack.writeBitmaps (or `--write-bitmap-index` is turned on), but specifying it explicitly can override the behavior (e.g., in cases where you prefer .keep files to bitmaps, but only when they are present). Note that this option just disables the pack-objects behavior. We still leave packs with a .keep in place, as we do not necessarily know that we have duplicated all of their objects. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-24t7700: do not use "touch" unnecessarilyJeff King
Some versions of touch (such as /usr/ucb/touch on Solaris) do not know about the "-r" option. This would make sense as a feature of test-chmtime, but fortunately this fix is even easier. The test does not care about the timestamp of the .keep file it creates at all, only that it exists. For such a use case, with or without portability issues around "-r", "touch" should not be used in the first place. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>