git.kernel.org/pub/scm/git/git.git
path: root/t
Age | Commit message | Author
2020-07-30 | Merge branch 'jk/reject-newer-extensions-in-v0' into master | Junio C Hamano
With the base fix to the 2.27 regression, any new extensions in a v0 repository would still be silently honored, which is not quite right. Instead, complain and die loudly. * jk/reject-newer-extensions-in-v0: verify_repository_format(): complain about new extensions in v0 repo
2020-07-30 | Merge branch 'hn/reftable' into master | Junio C Hamano
Preliminary clean-up of the refs API in preparation for adding a new refs backend "reftable". * hn/reftable: reflog: cleanse messages in the refs.c layer bisect: treat BISECT_HEAD as a pseudo ref t3432: use git-reflog to inspect the reflog for HEAD lib-t6000.sh: write tag using git-update-ref
2020-07-30 | Merge branch 'bw/fail-cloning-into-non-empty' into master | Junio C Hamano
"git clone --separate-git-dir=$elsewhere" used to stomp on the contents of the existing directory $elsewhere, which has been taught to fail when $elsewhere is not an empty directory. * bw/fail-cloning-into-non-empty: git clone: don't clone into non-empty directory
2020-07-30 | Merge branch 'jk/tests-timestamp-fix' into master | Junio C Hamano
The test framework has been updated so that most tests will run with predictable (artificial) timestamps. * jk/tests-timestamp-fix: t9100: stop depending on commit timestamps test-lib: set deterministic default author/committer date t9100: explicitly unset GIT_COMMITTER_DATE t5539: make timestamp requirements more explicit t9700: loosen ident timezone regex t6000: use test_tick consistently
2020-07-30 | Merge branch 'ds/commit-graph-bloom-updates' into master | Junio C Hamano
Updates to the changed-paths bloom filter. * ds/commit-graph-bloom-updates: commit-graph: check all leading directories in changed path Bloom filters revision: empty pathspecs should not use Bloom filters revision.c: fix whitespace commit-graph: check chunk sizes after writing commit-graph: simplify chunk writes into loop commit-graph: unify the signatures of all write_graph_chunk_*() functions commit-graph: persist existence of changed-paths bloom: fix logic in get_bloom_filter() commit-graph: change test to die on parse, not load commit-graph: place bloom_settings in context
2020-07-30 | Merge branch 'sg/commit-graph-cleanups' into master | Junio C Hamano
The changed-path Bloom filter is improved using ideas from an independent implementation. * sg/commit-graph-cleanups: commit-graph: simplify write_commit_graph_file() #2 commit-graph: simplify write_commit_graph_file() #1 commit-graph: simplify parse_commit_graph() #2 commit-graph: simplify parse_commit_graph() #1 commit-graph: clean up #includes diff.h: drop diff_tree_oid() & friends' return value commit-slab: add a function to deep free entries on the slab commit-graph-format.txt: all multi-byte numbers are in network byte order commit-graph: fix parsing the Chunk Lookup table tree-walk.c: don't match submodule entries for 'submod/anything'
2020-07-19 | Merge branch 'dl/branch-cleanup' into master | Junio C Hamano
Last minute fix-up to tests for portability. * dl/branch-cleanup: t3200: don't grep for `strerror()` string
2020-07-18 | t3200: don't grep for `strerror()` string | Martin Ågren
In 6b7093064a ("t3200: test for specific errors", 2020-06-15), we learned to grep stderr to ensure that the failing `git branch` invocations fail for the right reason. In two of these tests, we grep for "File exists", expecting the string to show up there since config.c calls `error_errno()`, which ends up including `strerror(errno)` in the error message. But as we saw in 4605a73073 ("t1091: don't grep for `strerror()` string", 2020-03-08), there exists at least one implementation where `strerror()` yields a slightly different string than the one we're grepping for. In particular, these tests fail on the NonStop platform. Similar to 4605a73073, grep for the beginning of the string instead to avoid relying on `strerror()` behavior. Reported-by: Randall S. Becker <rsbecker@nexbridge.com> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
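The fragility described above can be sketched in a few lines of shell. The two error lines are hypothetical stand-ins (the second imitates a platform whose `strerror()` wording differs); only the idea of matching the beginning of the message comes from the commit.

```shell
#!/bin/sh
# Hypothetical error lines: the text after the last colon comes from
# strerror(errno) and may be worded differently on another platform.
msg_common='error: could not lock config file .git/config: File exists'
msg_other='error: could not lock config file .git/config: File already exists'

# Fragile: depends on the platform-specific strerror() text.
matches_full () {
	printf '%s\n' "$1" | grep -q ': File exists$'
}

# Portable: matches only the part of the message that Git itself writes.
matches_prefix () {
	printf '%s\n' "$1" | grep -q '^error: could not lock config file'
}

for msg in "$msg_common" "$msg_other"
do
	if matches_full "$msg"; then full=yes; else full=no; fi
	if matches_prefix "$msg"; then prefix=yes; else prefix=no; fi
	echo "full=$full prefix=$prefix"
done
```

The full match succeeds only for the first line, while the prefix match succeeds for both.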
2020-07-17 | Merge branch 'jn/v0-with-extensions-fix' into master | Junio C Hamano
In 2.28-rc0, we corrected a bug that some repository extensions are honored by mistake even in version 0 repositories (these configuration variables in the extensions.* namespace were supposed to have special meaning in repositories whose version numbers are 1 or higher), but this was a bit too big a change. * jn/v0-with-extensions-fix: repository: allow repository format upgrade with extensions Revert "check_repository_format_gently(): refuse extensions for old repositories"
2020-07-16 | verify_repository_format(): complain about new extensions in v0 repo | Jeff King
We made the mistake in the past of respecting extensions.* even when the repository format version was set to 0. This is bad because forgetting to bump the repository version means that older versions of Git (which do not know about our extensions) won't complain. I.e., it's not a problem in itself, but it means your repository is in a state which does not give you the protection you think you're getting from older versions. For compatibility reasons, we are stuck with that decision for existing extensions. However, we'd prefer not to extend the damage further. We can do that by catching any newly-added extensions and complaining about the repository format. Note that this is a pretty heavy hammer: we'll refuse to work with the repository at all. A lesser option would be to ignore (possibly with a warning) any new extensions. But because of the way the extensions are handled, that puts the burden on each new extension that is added to remember to "undo" itself (because they are handled before we know for sure whether we are in a v1 repo or not, since we don't insist on a particular ordering of config entries). So one option would be to rewrite that handling to record any new extensions (and their values) during the config parse, and then only after proceed to handle new ones only if we're in a v1 repository. But I'm not sure if it's worth the trouble: - ignoring extensions is likely to end up with broken results anyway (e.g., ignoring a proposed objectformat extension means parsing any object data is likely to encounter errors) - this is a sign that whatever tool wrote the extension field is broken. We may be better off notifying immediately and forcefully so that such tools don't even appear to work accidentally. The only downside is that fixing the situation is a little tricky, because programs like "git config" won't want to work with the repository. But: git config --file=.git/config core.repositoryformatversion 1 should still suffice. 
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
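The policy described in this message can be sketched as a small shell function. This is an illustration only: Git implements the check in C, and the extension lists below are assumptions for the sketch (the three "historical" names appear elsewhere in this log, and `objectformat` is the proposed extension mentioned above). The function name `check_extension` is hypothetical.

```shell
#!/bin/sh
# check_extension VERSION NAME
# Prints "ok" if the extension is acceptable for that repository format
# version, otherwise "die". Names are assumed to be lowercased already.
check_extension () {
	version=$1
	name=$2
	case "$name" in
	preciousobjects|partialclone|worktreeconfig)
		# Historical extensions: still honored in v0 for compatibility.
		echo ok
		;;
	objectformat)
		# Stand-in for a newly added extension: valid only in v1.
		if test "$version" -ge 1
		then
			echo ok
		else
			echo die
		fi
		;;
	*)
		# Unrecognized extension: v1 refuses it, v0 does not error out.
		if test "$version" -ge 1
		then
			echo die
		else
			echo ok
		fi
		;;
	esac
}

echo "v0 partialclone: $(check_extension 0 partialclone)"
echo "v0 objectformat: $(check_extension 0 objectformat)"
echo "v1 objectformat: $(check_extension 1 objectformat)"
```

The middle branch is the new behavior: a recognized but newer extension in a v0 repository is treated as a hard error rather than being silently honored.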
2020-07-16 | repository: allow repository format upgrade with extensions | Jonathan Nieder
Now that we officially permit repository extensions in repository format v0, permit upgrading a repository with extensions from v0 to v1 as well. For example, this means a repository where the user has set "extensions.preciousObjects" can use "git fetch --filter=blob:none origin" to upgrade the repository to use v1 and the partial clone extension. To avoid mistakes, continue to forbid repository format upgrades in v0 repositories with an unrecognized extension. This way, a v0 user using a misspelled extension field gets a chance to correct the mistake before updating to the less forgiving v1 format. While we're here, make the error message for failure to upgrade the repository format a bit shorter, and present it as an error, not a warning. Reported-by: Huan Huan Chen <huanhuanchen@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-16 | Revert "check_repository_format_gently(): refuse extensions for old repositories" | Jonathan Nieder
This reverts commit 14c7fa269e42df4133edd9ae7763b678ed6594cd. The core.repositoryFormatVersion field was introduced in ab9cb76f661 (Repository format version check., 2005-11-25), providing a welcome bit of forward compatibility, thanks to some welcome analysis by Martin Atukunda. The semantics are simple: a repository with core.repositoryFormatVersion set to 0 should be comprehensible by all Git implementations in active use; and Git implementations should error out early instead of trying to act on Git repositories with higher core.repositoryFormatVersion values representing new formats that they do not understand. A new repository format did not need to be defined until 00a09d57eb8 (introduce "extensions" form of core.repositoryformatversion, 2015-06-23). This provided a finer-grained extension mechanism for Git repositories. In a repository with core.repositoryFormatVersion set to 1, Git implementations can act on "extensions.*" settings that modify how a repository is interpreted. In repository format version 1, unrecognized extensions settings cause Git to error out. What happens if a user sets an extension setting but forgets to increase the repository format version to 1? The extension settings were still recognized in that case; worse, unrecognized extensions settings do *not* cause Git to error out. So combining repository format version 0 with extensions settings produces in some sense the worst of both worlds. To improve that situation, since 14c7fa269e4 (check_repository_format_gently(): refuse extensions for old repositories, 2020-06-05) Git instead ignores extensions in v0 mode. This way, v0 repositories get the historical (pre-2015) behavior and maintain compatibility with Git implementations that do not know about the v1 format. 
Unfortunately, users had been using this sort of configuration and this behavior change came to many as a surprise: - users of "git config --worktree" that had followed its advice to enable extensions.worktreeConfig (without also increasing the repository format version) would find their worktree configuration no longer taking effect - tools such as copybara[*] that had set extensions.partialClone in existing repositories (without also increasing the repository format version) would find that setting no longer taking effect The behavior introduced in 14c7fa269e4 might be a good behavior if we were traveling back in time to 2015, but we're far too late. For some reason I thought that it was what had been originally implemented and that it had regressed. Apologies for not doing my research when 14c7fa269e4 was under development. Let's return to the behavior we've had since 2015: always act on extensions.* settings, regardless of repository format version. While we're here, include some tests to describe the effect on the "upgrade repository version" code path. [*] https://github.com/google/copybara/commit/ca76c0b1e13c4e36448d12c2aba4a5d9d98fb6e7 Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-15 | t9100: stop depending on commit timestamps | Jeff King
An earlier "fix" to this script gave up updating it not to rely on the current time because we cannot control what timestamp subversion gives its commits. We however could solve the issue in a different way and still use deterministic timestamps on Git commits. One fix would be to sort the list of trees before removing duplicates, but that loses information: - we do care that the fetched history is in the same order - there's a tree which appears twice in the history, and we'd want to make sure that it's there both times So instead, let's de-duplicate using a hash (preserving the order), and drop only lines with identical trees and subjects (preserving the tree which appears twice, since it has different subjects each time). Signed-off-by: Jeff King <peff@peff.net> Acked-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
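The order-preserving de-duplication described above can be sketched with awk (swapped in here for the perl hash idiom; the approach is the same). The "tree subject" lines are hypothetical: tree `t1` legitimately appears under two different subjects, and each commit is listed twice.

```shell
#!/bin/sh
# Hypothetical "tree subject" lines in history order.
history () {
	printf '%s\n' \
		't1 change mode back' \
		't2 change mode' \
		't1 change mode back' \
		't2 change mode' \
		't1 add file' \
		't1 add file'
}

# Drop a line only if the same tree AND subject were seen before,
# then print just the tree. Order of first appearance is preserved.
dedupe_tree_and_subject () {
	awk '!seen[$0]++ { print $1 }'
}

# For contrast: keying on the tree alone loses the second t1.
dedupe_tree_only () {
	awk '!seen[$1]++ { print $1 }'
}

echo "tree+subject: $(history | dedupe_tree_and_subject | tr '\n' ' ')"
echo "tree only:    $(history | dedupe_tree_only | tr '\n' ' ')"
```

Keying on the whole line yields `t1 t2 t1`, which keeps the tree that appears twice in the history; keying on the tree alone yields only `t1 t2`.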
2020-07-15 | test-lib: set deterministic default author/committer date | Jeff King
We always set the name and email for committer and author idents to make the test suite more deterministic, but not timestamps. Many scripts use test_tick to get consistent and sensibly incrementing timestamps as they create commits. But other scripts don't particularly care about the timestamp, and are happy to use whatever the current system time is. This non-determinism can be annoying: - when debugging a test, comparing results between two runs can be difficult, because the commit ids change - this can sometimes cause tests to be racy. E.g., traversal order depends on timestamp order. Even in a well-ordered set of commands, because our timestamp granularity is one second, two commits might sometimes have the same timestamp and sometimes differ. Let's set a default timestamp for all scripts to use. Any that use test_tick already will be unaffected (because their first test_tick call will overwrite our default), but it will make things a bit more deterministic for those that don't. We should be able to choose any time we want here. I picked this one because: - it differs from the initial test_tick default, which may make it easier to distinguish when debugging tests. I picked "April 1st 13:14:15" in the hope that it might stand out. - it's slightly before the test_tick default. Some tests create some commits before the first call to test_tick, so using older timestamps for those makes sense chronologically. Note that this isn't how things currently work (where system times are usually more recent than test_tick), but that also allows us to flush out a few hidden timestamp dependencies (like the one recently fixed in t5539). - we could likewise pick any timezone we want. Choosing +0000 would have required fixing up fewer tests, but we're more likely to turn up interesting cases by not matching $TZ exactly. And since test_tick already checks "-0700", let's try something in the "+" zone range for variety. 
It's possible that the non-deterministic times could help flush out bugs (e.g., if something broke when the clock flipped over to 2021, our test suite would let us know). But historically that hasn't been the case; all time-dependent outcomes we've seen turned out to be accidentally flaky tests (which we fixed by using test_tick). If we do want to cover handling the current time, we should dedicate one script to doing so, and have it unset GIT_COMMITTER_DATE explicitly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
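How a fixed default interacts with test_tick can be sketched as follows. This is a simplified model, not a copy of the test library: the default epoch value is an assumption chosen to correspond to "April 1st 13:14:15" in a +0200 zone, and the starting value and 60-second step of test_tick are taken from the timestamps quoted elsewhere in this log.

```shell
#!/bin/sh
# Default for scripts that never call test_tick.
# Assumption: 1112354055 is 2005-04-01 13:14:15 in a +0200 zone.
GIT_AUTHOR_DATE='1112354055 +0200'
GIT_COMMITTER_DATE='1112354055 +0200'
export GIT_AUTHOR_DATE GIT_COMMITTER_DATE

# Simplified test_tick: the first call sets a fixed starting time and
# every later call advances it by 60 seconds.
test_tick () {
	if test -z "${test_tick+set}"
	then
		test_tick=1112911993
	else
		test_tick=$((test_tick + 60))
	fi
	GIT_COMMITTER_DATE="$test_tick -0700"
	GIT_AUTHOR_DATE="$test_tick -0700"
	export GIT_COMMITTER_DATE GIT_AUTHOR_DATE
}

echo "default: $GIT_COMMITTER_DATE"
test_tick
echo "tick 1:  $GIT_COMMITTER_DATE"
test_tick
echo "tick 2:  $GIT_COMMITTER_DATE"
```

The first test_tick call overwrites the default, so scripts that already use test_tick see no change, while scripts that never call it get the same (earlier) timestamp on every run.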
2020-07-15 | t9100: explicitly unset GIT_COMMITTER_DATE | Jeff King
The early part of t9100 creates an unusual "doubled" history in the "git-svn" ref. When we get to t9100.17, it looks like this: $ git log --oneline --graph git-svn [...] * efd0303 detect node change from file to directory #2 |\ * | 3e727c0 detect node change from file to directory #2 |/ * 3b00468 try a deep --rmdir with a commit |\ * | b4832d8 try a deep --rmdir with a commit |/ * f0d7bd5 import for git svn Each commit we make with "git commit" is paired with one from "git svn set-tree", with the latter as a merge of the first and its grandparent. Later, t9100.17 wants to check that "git svn fetch" gets the same trees. And it does, but just one copy of each. So it uses rev-list to get the tree of each commit and pipes it to "uniq" to drop the duplicates. Our input isn't sorted, but it will find adjacent duplicates. This works reliably because the order of commits from rev-list always shows the duplicates next to each other. For any one of those merges, we could choose to show its duplicate or the grandparent first. But barring clocks running backwards, the duplicate will always have a time equal to or greater than the grandparent. Even if equal, we break ties by showing the first-parent first, so the duplicates remain adjacent. But this would break if the timestamps stopped moving in chronological order. Normally we would rely on test_tick for this, but we have _two_ sources of time here: - "git commit" creates one commit based on GIT_COMMITTER_DATE (which respects test_tick) - the "svn set-tree" one is based on subversion, which does not have an easy way to specify a timestamp So using test_tick actually breaks the test, because now the duplicates are far in the past, and we'll show the grandparent before the duplicate. And likewise, a proposed change to set GIT_COMMITTER_DATE in all scripts will break it. 
We _could_ fix this by sorting before removing duplicates, but presumably it's a useful part of the test to make sure the trees appear in the same order in both spots. Likewise, we could use something like: perl -ne 'print unless $seen{$_}++' to remove duplicates without impacting the order. But that doesn't work either, because there are actually multiple (non-duplicate) commits with the same trees (we change a file mode and then change it back). So we'd actually have to de-duplicate the combination of subject and tree. Which then further throws off t9100.18, which compares the tree hashes exactly; we'd have to strip the result back down. Since this test _isn't_ buggy, the simplest thing is to just work around the proposed change by documenting our expectation that git-created commits are correctly interleaved using the current time. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
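The reliance on adjacency can be seen with plain `uniq`, which only collapses neighbouring duplicate lines. The tree ids below are hypothetical placeholders for rev-list output.

```shell
#!/bin/sh
# Hypothetical tree ids, one per commit, in rev-list order.
# "adjacent": each duplicate is shown right next to its twin.
adjacent () { printf '%s\n' t3 t3 t2 t2 t1; }
# "interleaved": what happens if timestamps reorder the commits.
interleaved () { printf '%s\n' t3 t2 t3 t2 t1; }

echo "adjacent:    $(adjacent | uniq | tr '\n' ' ')"
echo "interleaved: $(interleaved | uniq | tr '\n' ' ')"
```

With adjacent duplicates the result is `t3 t2 t1`; once the order is interleaved, `uniq` removes nothing, which is the failure mode the message describes.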
2020-07-10 | t3432: use git-reflog to inspect the reflog for HEAD | Han-Wen Nienhuys
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-10 | t5539: make timestamp requirements more explicit | Jeff King
The test for "no shallow lines after receiving ACK ready" is very sensitive to the timestamps of the commits we create. It's looking for the fetch negotiation to send a "ready", which in turn depends on the order in which we traverse commits during the negotiation. It works reliably now because the base commit "7" is created without test_commit, and thus gets a commit time matching the current system clock. Whereas the new commits created in this test do use test_commit, and get the usual test_tick time from 2005. So the fetch into the "clone" repository results in a commit graph like this (I omitted some of the "unrelated" commits for clarity; they're all just a sequence of test_ticks): $ git log --graph --format='%ct %s %d' * 1112912953 new (origin/master, origin/HEAD) * 1594322236 7 (grafted, master) * 1112912893 unrelated15 (origin/unrelated15, unrelated15) [...] * 1112912053 unrelated1 (origin/unrelated1, unrelated1) * 1112911993 new-too (HEAD -> newnew, tag: new-too) The important things to see are: - "7" is way in the future compared to the other commits - "new-too" in the fetching repo is older than "new" (and its "unrelated" ancestors) in the shallow repo If we change our "setup shallow clone" step to use test_tick, too (and get rid of the dependency on the system clock), then the test will fail. The resulting graph looks like this: $ git log --graph --format='%ct %s %d' * 1112913373 new (origin/master, origin/HEAD) * 1112912353 7 (grafted, master) * 1112913313 unrelated15 (origin/unrelated15, unrelated15) [...] * 1112912473 unrelated1 (origin/unrelated1, unrelated1) * 1112912413 new-too (HEAD -> newnew, tag: new-too) Our "new-too" is still older than "new" and "unrelated", but now "7" is older than all of them (because it advanced test_tick, which the other tests built on top of). In the original, we advertised "7" as the first "have" before anything else, but now "new-too" is more recent. 
You'd see the same thing in the unlikely event that the system clock was set before our test_tick default in 2005. Let's make the timing requirements more explicit. The important thing is that the client advertise all of its shared commits first, before presenting its unique "new-too" commit. We can do that and get rid of the system clock dependency at the same time by creating all of the shared commits around time X (using test_tick), and then creating "new-too" with some time long before X. The resulting graph looks like this: $ git log --graph --format='%ct %s %d' * 1500001380 new (origin/master, origin/HEAD) * 1500000420 7 (grafted, master) * 1500001320 unrelated15 (origin/unrelated15, unrelated15) [...] * 1500000480 unrelated1 (origin/unrelated1, unrelated1) * 1400000060 new-too (HEAD -> newnew, tag: new-too) That also lets us get rid of the hacky test_tick added by f0e802ca20 (t5539: update a flaky test, 2014-07-14). That was clearly dancing around the same problem, but only addressed the relationship between commits created in the two subshells (which did use test_tick, but overlapped because increments of test_tick in subshells are lost). Now that we're using consistent and well-placed times for both lines of history, we don't have to care about a one-tick difference between the two sides. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
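The remark that increments of test_tick in subshells are lost follows from ordinary shell semantics, sketched here with a hypothetical counter `tick` and helper `advance` standing in for the real state.

```shell
#!/bin/sh
# Hypothetical counter standing in for the test_tick state.
tick=1500000000

advance () {
	tick=$((tick + 60))
}

(
	# Runs in a subshell: the changes vanish when the subshell exits.
	advance
	advance
	echo "inside first subshell:  $tick"
)
(
	# Starts again from the parent's value, so times overlap.
	advance
	echo "inside second subshell: $tick"
)
echo "in the parent shell:    $tick"
```

Both subshells start from the same parent value, so commits created in them can end up with overlapping timestamps unless each side is given an explicit, well-separated base time.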
2020-07-10 | t9700: loosen ident timezone regex | Jeff King
A few of the perl tests in t9700 ask for the author and committer ident, and then make sure we get something sensible. For the timestamp portion, we just match [0-9]+, because the actual value will depend on when the test is run. However, we do require that the timezone be "+0000". This works reliably because we set $TZ in test-lib.sh. But in preparation for changing the default timezone, let's be a bit more flexible. We don't actually care about the exact value here, just that we were able to get a sensible output from the perl module's access methods. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
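The difference between the strict and the loosened match can be sketched with `grep -E`. The ident strings and both patterns are illustrative assumptions, not the exact expressions used in the perl test.

```shell
#!/bin/sh
# Hypothetical ident strings; only the trailing "<timestamp> <zone>" matters.
ident_utc='A U Thor <author@example.com> 1112354055 +0000'
ident_east='A U Thor <author@example.com> 1112354055 +0200'

# Strict: requires the zone to be exactly +0000.
strict () {
	printf '%s\n' "$1" | grep -Eq ' [0-9]+ \+0000$'
}

# Loose: accepts any signed four-digit offset.
loose () {
	printf '%s\n' "$1" | grep -Eq ' [0-9]+ [+-][0-9]{4}$'
}

for ident in "$ident_utc" "$ident_east"
do
	if strict "$ident"; then s=match; else s=no-match; fi
	if loose "$ident"; then l=match; else l=no-match; fi
	echo "strict=$s loose=$l"
done
```

The loose pattern still checks that the output has a sensible shape while no longer pinning the test to one particular default timezone.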
2020-07-10 | git clone: don't clone into non-empty directory | Ben Wijen
When using git clone with --separate-git-dir realgitdir and realgitdir already exists, its content is destroyed. So, make sure we don't clone into an existing non-empty directory. When d45420c1 (clone: do not clean up directories we didn't create, 2018-01-02) tightened the clean-up procedure after a failed cloning into an empty directory, it assumed that the existing directory given is an empty one so it is OK to keep that directory, while running the clean-up procedure that is designed to remove everything in it (since there won't be any, anyway). Check and make sure that the $GIT_DIR is empty even when cloning into an existing repository. Signed-off-by: Ben Wijen <ben@wijen.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
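The kind of check described here can be sketched in shell. Git performs the real check in C; the function name `dir_is_usable` and the scratch directory layout are hypothetical.

```shell
#!/bin/sh
# dir_is_usable PATH
# Succeeds if PATH does not exist yet or is an empty directory.
dir_is_usable () {
	if test ! -e "$1"
	then
		return 0
	fi
	test -d "$1" && test -z "$(ls -A "$1")"
}

scratch=$(mktemp -d)
mkdir "$scratch/empty" "$scratch/used"
: >"$scratch/used/file"

for d in "$scratch/missing" "$scratch/empty" "$scratch/used"
do
	if dir_is_usable "$d"
	then
		echo "${d##*/}: ok to clone into"
	else
		echo "${d##*/}: refuse"
	fi
done
rm -rf "$scratch"
```

Refusing up front means the clean-up path after a failed clone never runs against a directory that already held someone's data.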
2020-07-10 | Merge branch 'tb/fix-persistent-shallow' into master | Junio C Hamano
When "fetch.writeCommitGraph" configuration is set in a shallow repository and a fetch moves the shallow boundary, we wrote out broken commit-graph files that do not match the reality, which has been corrected. * tb/fix-persistent-shallow: commit.c: don't persist substituted parents when unshallowing
2020-07-10 | Merge branch 'rs/line-log-until' into master | Junio C Hamano
"git log -Lx,y:path --before=date" lost track of where the range should be because it didn't take the changes made by the youngest commits that are omitted from the output into account. * rs/line-log-until: revision: disable min_age optimization with line-log
2020-07-10 | Merge branch 'ra/send-email-in-reply-to-from-command-line-wins' into master | Junio C Hamano
"git send-email --in-reply-to=<msg>" did not use the In-Reply-To: header with the value given from the command line, and let it be overridden by the value of the In-Reply-To: header in the messages being sent out (if one exists). * ra/send-email-in-reply-to-from-command-line-wins: send-email: restore --in-reply-to superseding behavior
2020-07-09 | commit.c: don't persist substituted parents when unshallowing | Taylor Blau
Since 37b9dcabfc (shallow.c: use '{commit,rollback}_shallow_file', 2020-04-22), Git knows how to reset stat-validity checks for the $GIT_DIR/shallow file, allowing it to change between a shallow and non-shallow state in the same process (e.g., in the case of 'git fetch --unshallow'). However, when $GIT_DIR/shallow changes, Git does not alter or remove any grafts (nor substituted parents) in memory. This comes up in a "git fetch --unshallow" with fetch.writeCommitGraph set to true. Ordinarily in a shallow repository (and before 37b9dcabfc, even in this case), commit_graph_compatible() would return false, indicating that the repository should not be used to write commit-graphs (since commit-graph files cannot represent a shallow history). But since 37b9dcabfc, in an --unshallow operation that check succeeds. Thus even though the repository isn't shallow any longer (that is, we have all of the objects), the in-core representation of those objects still has munged parents at the shallow boundaries. When the commit-graph write proceeds, we use the incorrect parentage, producing wrong results. There are two ways for a user to work around this: either (1) set 'fetch.writeCommitGraph' to 'false', or (2) drop the commit-graph after unshallowing. One way to fix this would be to reset the parsed object pool entirely (flushing the cache and thus preventing subsequent reads from modifying their parents) after unshallowing. That would produce a problem when callers have a now-stale reference to the old pool, and so this patch implements a different approach. Instead, attach a new bit to the pool, 'substituted_parent', which indicates if the repository *ever* stored a commit which had its parents modified (i.e., the shallow boundary prior to unshallowing). This bit needs to be sticky because all reads subsequent to modifying a commit's parents are unreliable when unshallowing. 
Modify the check in 'commit_graph_compatible' to take this bit into account, and correctly avoid generating commit-graphs in this case, thus solving the bug. Helped-by: Derrick Stolee <dstolee@microsoft.com> Helped-by: Jonathan Nieder <jrnieder@gmail.com> Reported-by: Jay Conrod <jayconrod@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-08 | t6000: use test_tick consistently | Jeff King
The first two commits created in t6000 are done without test_tick, meaning they use the current system clock. After that, we create one with test_tick, which means it uses a deterministic time in the past. The result of the "symleft flag bit is propagated down from tag" test relies on the output order of commits from git-log, which in turn depends on these timestamps. So this test is technically dependent on the system clock time, though in practice it would only matter if your system clock was set before test_tick's default time (which is in 2005). However, let's use test_tick consistently for those early commits (and update the expected output to match). This makes the test deterministic, which is in turn easier to reason about and debug. Note that there's also a fourth commit here, and it does not use test_tick. It does have a deterministic timestamp because of the prior use of test_tick in the script, but it will always be the same time as the third commit. Let's use test_tick here, too, for consistency. The matching timestamps between the third and fourth commit are not an important part of the test. We could also use test_commit in all of these cases, as it runs test_tick under the hood. But it would be awkward to do so, as these tests diverge from the usual test_commit patterns (e.g., by creating multiple files in a single commit). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-07 | Merge branch 'dl/test-must-fail-fixes-5' | Junio C Hamano
The effort to avoid using test_must_fail on non-git commands continues. * dl/test-must-fail-fixes-5: lib-submodule-update: pass 'test_must_fail' as an argument lib-submodule-update: prepend "git" to $command lib-submodule-update: consolidate --recurse-submodules lib-submodule-update: add space after function name
2020-07-07 | Merge branch 'jk/fast-export-anonym-alt' | Junio C Hamano
"git fast-export --anonymize" learned to take a customized mapping so that its users can make its output more usable for debugging. * jk/fast-export-anonym-alt: fast-export: use local array to store anonymized oid fast-export: anonymize "master" refname fast-export: allow seeding the anonymized mapping fast-export: add a "data" callback parameter to anonymize_str() fast-export: move global "idents" anonymize hashmap into function fast-export: use a flex array to store anonymized entries fast-export: stop storing lengths in anonymized hashmaps fast-export: tighten anonymize_mem() interface to handle only strings fast-export: store anonymized oids as hex strings fast-export: use xmemdupz() for anonymizing oids t9351: derive anonymized tree checks from original repo
2020-07-07 | Merge branch 'js/diff-files-i-t-a-fix-for-difftool' | Junio C Hamano
"git difftool" has trouble dealing with paths added to the index with the intent-to-add bit. * js/diff-files-i-t-a-fix-for-difftool: difftool -d: ensure that intent-to-add files are handled correctly diff-files --raw: show correct post-image of intent-to-add files
2020-07-07 | Merge branch 'js/default-branch-name' | Junio C Hamano
The name of the primary branch in existing repositories, and the default name used for the first branch in newly created repositories, is made configurable, so that we can eventually wean ourselves off of the hardcoded 'master'. * js/default-branch-name: contrib: subtree: adjust test to change in fmt-merge-msg testsvn: respect `init.defaultBranch` remote: use the configured default branch name when appropriate clone: use configured default branch name when appropriate init: allow setting the default for the initial branch name via the config init: allow specifying the initial branch name for the new repository docs: add missing diamond brackets submodule: fall back to remote's HEAD for missing remote.<name>.branch send-pack/transport-helper: avoid mentioning a particular branch fmt-merge-msg: stop treating `master` specially
2020-07-07 | Merge branch 'bc/http-push-flagsfix' | Junio C Hamano
The code to push changes over "dumb" HTTP had a bad interaction with the commit reachability code due to incorrect allocation of object flag bits, which has been corrected. * bc/http-push-flagsfix: http-push: ensure unforced pushes fail when data would be lost
2020-07-07 | Merge branch 'js/pu-to-seen' | Junio C Hamano
The documentation and some tests have been adjusted for the recent renaming of "pu" branch to "seen". * js/pu-to-seen: tests: reference `seen` wherever `pu` was referenced docs: adjust the technical overview for the rename `pu` -> `seen` docs: adjust for the recent rename of `pu` to `seen`
2020-07-07 | Merge branch 'cb/is-descendant-of' | Junio C Hamano
Code clean-up. * cb/is-descendant-of: commit-reach: avoid is_descendant_of() shim
2020-07-07 | Merge branch 'es/get-worktrees-unsort' | Junio C Hamano
API cleanup for get_worktrees() * es/get-worktrees-unsort: worktree: drop get_worktrees() unused 'flags' argument worktree: drop get_worktrees() special-purpose sorting option
2020-07-07 | Merge branch 'bc/sha-256-cvs-svn-updates' | Junio C Hamano
The CVS/SVN interfaces have been prepared for the SHA-256 transition. * bc/sha-256-cvs-svn-updates: git-cvsexportcommit: port to SHA-256 git-cvsimport: port to SHA-256 git-cvsserver: port to SHA-256 git-svn: set the OID length based on hash algorithm perl: make SVN code hash independent perl: make Git::IndexInfo work with SHA-256 perl: create and switch variables for hash constants t/lib-git-svn: make hash size independent t9101: make hash independent t9104: make hash size independent t9100: make test work with SHA-256 t9108: make test hash independent t9168: make test hash independent t9109: make test hash independent
2020-07-07 | Merge branch 'ak/commit-graph-to-slab' | Junio C Hamano
A few fields in "struct commit" that do not have to always be present have been moved to commit slabs. * ak/commit-graph-to-slab: commit-graph: minimize commit_graph_data_slab access commit: move members graph_pos, generation to a slab commit-graph: introduce commit_graph_data_slab object: drop parsed_object_pool->commit_count
2020-07-07Merge branch 'ps/ref-transaction-hook'Junio C Hamano
A new hook. * ps/ref-transaction-hook: refs: implement reference transaction hook
2020-07-07Merge branch 'bc/sha-256-part-2'Junio C Hamano
SHA-256 migration work continues. * bc/sha-256-part-2: (44 commits) remote-testgit: adapt for object-format bundle: detect hash algorithm when reading refs t5300: pass --object-format to git index-pack t5704: send object-format capability with SHA-256 t5703: use object-format serve option t5702: offer an object-format capability in the test t/helper: initialize the repository for test-sha1-array remote-curl: avoid truncating refs with ls-remote t1050: pass algorithm to index-pack when outside repo builtin/index-pack: add option to specify hash algorithm remote-curl: detect algorithm for dumb HTTP by size builtin/ls-remote: initialize repository based on fetch t5500: make hash independent serve: advertise object-format capability for protocol v2 connect: parse v2 refs with correct hash algorithm connect: pass full packet reader when parsing v2 refs Documentation/technical: document object-format for protocol v2 t1302: expect repo format version 1 for SHA-256 builtin/show-index: provide options to determine hash algo t5302: modernize test formatting ...
2020-07-07lib-t6000.sh: write tag using git-update-refHan-Wen Nienhuys
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-07revision: disable min_age optimization with line-logRené Scharfe
If one of the options --before, --min-age or --until is given, limit_list() filters out younger commits early on. Line-log needs all those commits to trace the movement of line ranges, though. Skip this optimization if both are used together. Reported-by: Мария Долгополова <dolgopolovamariia@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-02difftool -d: ensure that intent-to-add files are handled correctlyJohannes Schindelin
In https://github.com/git-for-windows/git/issues/2677, a `git difftool -d` problem was reported. The underlying cause was a bug in `git diff-files --raw` that we just fixed: it reported intent-to-add files with the empty _tree_ as the post-image OID, when we need to show an all-zero (or, "null") OID instead, to indicate to the caller that they have to look at the worktree file.

The symptom of that problem shown by `git difftool` was this:

    error: unable to read sha1 file of <path> (<empty-tree-OID>)
    error: could not write '<filename>'

Make sure that the reported `difftool` problem stays fixed.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-02diff-files --raw: show correct post-image of intent-to-add filesJohannes Schindelin
The documented behavior of `git diff-files --raw` is to display

    [...] 0{40} if creation, unmerged or "look at work tree".

on the right hand (i.e. postimage) side. This happens for files that have unstaged modifications, and for files that are unmodified but stat-dirty.

For intent-to-add files, we used to show the empty blob's hash instead. In c26022ea8f5 (diff: convert diff_addremove to struct object_id, 2017-05-30), we made that worse by inadvertently changing that to the hash of the empty tree.

Let's make the behavior consistent with files that have unstaged modifications (which applies to intent-to-add files, too) by showing all-zero values also for intent-to-add files.

Accordingly, this patch adjusts the expectations set by the regression test introduced in feea6946a5b (diff-files: treat "i-t-a" files as "not-in-index", 2020-06-20).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
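The three values discussed in this message are easy to confuse, so here is a minimal Python sketch that computes them for a SHA-1 repository. The helper name `git_object_id` is made up for illustration; it only reproduces the usual "type, length, NUL, payload" hashing scheme and is not part of Git itself.

```python
import hashlib


def git_object_id(obj_type, payload=b""):
    """Hash an object the way a SHA-1 repository does: '<type> <size>\\0<payload>'."""
    header = f"{obj_type} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


# What intent-to-add files originally showed on the post-image side.
EMPTY_BLOB = git_object_id("blob")

# What they showed after c26022ea8f5 (the regression described above).
EMPTY_TREE = git_object_id("tree")

# What the documentation promises: 0{40}, meaning "look at the work tree".
NULL_OID = "0" * 40

if __name__ == "__main__":
    for name, oid in [("empty blob", EMPTY_BLOB),
                      ("empty tree", EMPTY_TREE),
                      ("null OID", NULL_OID)]:
        print(f"{name:10} {oid}")
```

Only the last value is not the name of any real object, which is why a caller such as `git difftool` can treat it as a signal to read the file from the worktree instead of the object database.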
2020-07-02send-email: restore --in-reply-to superseding behaviorRafael Aquini
git send-email --in-reply-to= fails to override In-Reply-To email headers, if they're present in the output of format-patch, even when explicitly told to do so by the option --no-thread, which breaks the contract of the command line switch option, per its man page.

"
    --in-reply-to=<identifier>
        Make the first mail (or all the mails with --no-thread) appear as a reply to the given Message-Id, which avoids breaking threads to provide a new patch series.
"

This patch fixes the aforementioned issue, by bringing --in-reply-to's old overriding behavior back.

The test was donated by Carlo Marcelo Arenas Belón.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-02commit-graph: check all leading directories in changed path Bloom filtersSZEDER Gábor
The file 'dir/subdir/file' can only be modified if its leading directories 'dir' and 'dir/subdir' are modified as well.

So when checking modified path Bloom filters looking for commits modifying a path with multiple path components, check not only the full path in the Bloom filters, but all its leading directories as well. Take care to check these paths in "deepest first" order, because it's the full path that is least likely to be modified, and the Bloom filter queries can short circuit sooner.

This can significantly reduce the average false positive rate, by about an order of magnitude or three(!), and can further speed up pathspec-limited revision walks. The table below compares the average false positive rate and runtime of

    git rev-list HEAD -- "$path"

before and after this change for 5000+ randomly* selected paths from each repository:

                    Average false           Average        Average
                    positive rate           runtime        runtime
                  before     after     before     after   difference
  ------------------------------------------------------------------
  git             3.220%   0.7853%    0.0558s   0.0387s   -30.6%
  linux           2.453%   0.0296%    0.1046s   0.0766s   -26.8%
  tensorflow      2.536%   0.6977%    0.0594s   0.0420s   -29.2%

*Path selection was done with the following pipeline:

    git ls-tree -r --name-only HEAD | sort -R | head -n 5000

The improvements in runtime are much smaller than the improvements in average false positive rate, as we are clearly reaching diminishing returns here. However, all these timings assume that accessing tree objects is reasonably fast (warm caches). If we had a partial clone and the tree objects had to be fetched from a promisor remote, e.g.:

    $ git clone --filter=tree:0 --bare file://.../webkit.git webkit.notrees.git
    $ git -C webkit.git -c core.modifiedPathBloomFilters=1 \
          commit-graph write --reachable
    $ cp webkit.git/objects/info/commit-graph webkit.notrees.git/objects/info/
    $ git -C webkit.notrees.git -c core.modifiedPathBloomFilters=1 \
          rev-list HEAD -- "$path"

then checking all leading path components can reduce the runtime from over an hour to a few seconds (and this is with the clone and the promisor on the same machine).

This adjusts the tracing values in t4216-log-bloom.sh, which provides a concrete way to notice the improvement.

Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
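The "deepest first" check described in this message can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not Git's implementation: `ToyBloomFilter`, its SHA-256 based hashing, the 64-bit size, and all function names are invented for the example, and the sketch assumes that each commit's filter records every changed file together with all of its leading directories.

```python
import hashlib


class ToyBloomFilter:
    """A tiny Bloom filter; hashing and sizing here are illustrative only."""

    def __init__(self, num_bits=64, num_hashes=7):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def paths_deepest_first(path):
    """'dir/subdir/file' -> ['dir/subdir/file', 'dir/subdir', 'dir']."""
    parts = path.split("/")
    return ["/".join(parts[:i]) for i in range(len(parts), 0, -1)]


def filter_for_commit(changed_files):
    """Assumption: a filter holds each changed file plus its leading directories."""
    bloom = ToyBloomFilter()
    for changed in changed_files:
        for key in paths_deepest_first(changed):
            bloom.add(key)
    return bloom


def might_have_changed(bloom, path):
    """False means 'definitely unchanged'; True means 'maybe, go diff the trees'.

    all() stops at the first miss, and the full path is queried first
    because it is the key least likely to be present.
    """
    return all(bloom.might_contain(key) for key in paths_deepest_first(path))
```

A false positive on the full path now has to be repeated for every leading directory before a commit is reported as "maybe", which is the effect the table above measures.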
2020-07-02revision: empty pathspecs should not use Bloom filtersTaylor Blau
The prepare_to_use_bloom_filter() method was not intended to be called on an empty pathspec. However, 'git log -- .' and 'git log' are subtly different: the latter reports all commits while the former will simplify commits that do not change the root tree. This means that the path used to construct the bloom_key might be empty, and that value is not added to the Bloom filter during construction. That means that the results are likely incorrect! To resolve the issue, be careful about the length of the path and stop filling Bloom filters. To be completely sure we do not use them, drop the pointer to the bloom_filter_settings from the commit-graph. That allows our test to look at the trace2 logs to verify no Bloom filter statistics are reported. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-02commit-graph: persist existence of changed-pathsDerrick Stolee
The changed-path Bloom filters were released in v2.27.0, but have a significant drawback. A user can opt-in to writing the changed-path filters using the "--changed-paths" option to "git commit-graph write" but the next write will drop the filters unless that option is specified. This becomes even more important when considering the interaction with gc.writeCommitGraph (on by default) or fetch.writeCommitGraph (part of features.experimental). These config options trigger commit-graph writes that the user did not signal, and hence there is no --changed-paths option available. Allow a user that opts-in to the changed-path filters to persist the property of "my commit-graph has changed-path filters" automatically. A user can drop filters using the --no-changed-paths option. In the process, we need to be extremely careful to match the Bloom filter settings as specified by the commit-graph. This will allow future versions of Git to customize these settings, and the version with this change will persist those settings as commit-graphs are rewritten on top. Use the trace2 API to signal the settings used during the write, and check that output in a test after manually adjusting the correct bytes in the commit-graph file. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
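The persistence rule described here boils down to a three-way decision. The following Python sketch spells it out; the function name and the tri-state `option` argument are hypothetical and only model the behavior stated in the message, not the actual C code in commit-graph.c.

```python
def should_write_changed_paths(option, existing_graph_has_filters):
    """Decide whether a commit-graph write should include changed-path filters.

    option is True for --changed-paths, False for --no-changed-paths, and
    None when neither was given (for example a write triggered by
    gc.writeCommitGraph or fetch.writeCommitGraph).
    """
    if option is not None:
        # An explicit command-line choice always wins.
        return option
    # Otherwise persist whatever the existing commit-graph already has.
    return existing_graph_has_filters


if __name__ == "__main__":
    # A background write after the user opted in once keeps the filters.
    print(should_write_changed_paths(None, True))
    # The user can still drop them explicitly.
    print(should_write_changed_paths(False, True))
```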
2020-07-02bloom: fix logic in get_bloom_filter()Derrick Stolee
The get_bloom_filter() method is a bit complicated in some parts where it does not need to be. In particular, it needs to return a NULL filter only when compute_if_not_present is zero AND the filter data cannot be loaded from a commit-graph file. This currently happens by accident because the commit-graph does not load changed-path Bloom filters from an existing commit-graph when writing a new one. This will change in a later patch. Also clean up some style issues while we are here. One side-effect of returning a NULL filter is that the filters that are reported as "too large" will now be reported as NULL instead of length zero. This case was not properly covered before, so add a test. Further, remove the counting of the zero-length filters from revision.c and the trace2 logs. Helped-by: René Scharfe <l.s.r@web.de> Helped-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-30Merge branch 'sk/diff-files-show-i-t-a-as-new'Junio C Hamano
"git diff-files" has been taught to say paths that are marked as intent-to-add are new files, not modified from an empty blob. * sk/diff-files-show-i-t-a-as-new: diff-files: treat "i-t-a" files as "not-in-index"
2020-06-30Merge branch 'xl/upgrade-repo-format'Junio C Hamano
Allow runtime upgrade of the repository format version, which needs to be done carefully. There is a rather unpleasant backward compatibility worry with the last step of this series, but it is the right thing to do in the longer term. * xl/upgrade-repo-format: check_repository_format_gently(): refuse extensions for old repositories sparse-checkout: upgrade repository to version 1 when enabling extension fetch: allow adding a filter after initial clone repository: add a helper function to perform repository format upgrade
2020-06-26fast-export: anonymize "master" refnameJeff King
Running "fast-export --anonymize" will leave "refs/heads/master" untouched in the output, for two reasons:

  - it helped to have some known reference point between the original and anonymized repository

  - since it's historically the default branch name, it doesn't leak any information

Now that we can ask fast-export to retain particular tokens, we have a much better tool for the first one (because it works for any ref, not just master).

For the second, the notion of "default branch name" is likely to become configurable soon, at which point the name _does_ leak information. Let's drop this special case in preparation.

Note that we have to adjust the test a bit, since it relied on using the name "master" in the anonymized repos. We could just use --anonymize-map=master to keep the same output, but then we wouldn't know if it works because of our hard-coded master or because of the explicit map. So let's flip the test a bit, and confirm that we anonymize "master", but keep "other" in the output.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-26fast-export: allow seeding the anonymized mappingJeff King
After you anonymize a repository, it can be hard to find which commits correspond between the original and the result, and thus hard to reproduce commands that triggered bugs in the original.

Let's make it possible to seed the anonymization map. This lets users either:

  - mark names to be retained as-is, if they don't consider them secret (in which case their original commands would just work)

  - map names to new values, which lets them adapt the reproduction recipe to the new names without revealing the originals

The implementation is fairly straight-forward. We already store each anonymized token in a hashmap (so that the same token appearing twice is converted to the same result). We can just introduce a new "seed" hashmap which is consulted first.

This does make a few more promises to the user about how we'll anonymize things (e.g., token-splitting pathnames). But it's unlikely that we'd want to change those rules, even if the actual anonymization of a single token changes. And it makes things much easier for the user, who can unblind only a directory name without having to specify each path within it.

One alternative to this approach would be to anonymize as we see fit, and then dump the whole refname and pathname mappings to a file. This does work, but it's a bit awkward to use (you have to manually dig the items you care about out of the mapping).

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
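The "seed hashmap consulted first" idea, combined with token-splitting of pathnames, can be sketched in a few lines of Python. This is an illustration only: the `Anonymizer` class and the `anonN` naming scheme are assumptions made for the example and do not reflect the names that fast-export actually generates.

```python
class Anonymizer:
    """Toy model of a seeded anonymization map (not fast-export's real code)."""

    def __init__(self, seed=None):
        # User-supplied mappings; mapping a token to itself retains it as-is.
        self.seed = dict(seed or {})
        # Tokens anonymized so far, so repeats convert to the same result.
        self.generated = {}

    def token(self, original):
        # The seed map is consulted first.
        if original in self.seed:
            return self.seed[original]
        if original not in self.generated:
            # Hypothetical naming scheme, chosen only for readability.
            self.generated[original] = f"anon{len(self.generated)}"
        return self.generated[original]

    def path(self, pathname):
        # Pathnames are split into tokens, so seeding one directory name
        # unblinds it in every path that contains it.
        return "/".join(self.token(part) for part in pathname.split("/"))


if __name__ == "__main__":
    anonymizer = Anonymizer({"src": "src", "secret": "public"})
    print(anonymizer.path("src/foo.c"))
    print(anonymizer.path("secret/foo.c"))
```

Because the lookup is per token, seeding only `src` is enough to keep that directory readable in every path, which matches the "unblind only a directory name" point above.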
2020-06-25Merge branch 'pb/t4014-unslave'Junio C Hamano
A branch name used in a test has been clarified to match what is going on. * pb/t4014-unslave: t4014: do not use "slave branch" nomenclature