Age | Commit message (Collapse) | Author |
|
Among four examples, only this one used "double quoted" sample
patterns, but all others marked up the patterns in `monospace`.
Signed-off-by: Johan Ruokangas <johan@latehours.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We return the oid that matched, but the sole caller only cares whether
we matched anything at all. This is mostly academic, since there's only
one caller, but the lifetime of the returned pointer is not immediately
clear. Sometimes it points to an oid in a tag struct, which should live
forever. And sometimes to the oid passed in, which only lives as long as
the each_ref_fn callback we're called from.
Simplify this to a boolean return which is more direct and obvious. As a
bonus, this lets us avoid the weird pattern of overwriting our "oid"
parameter in the loop (since we now only refer to the tagged oid one
time, and can just inline the call to get it).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When handling --points-at, we have to try to peel each ref to see if
it's a tag that points at a requested oid. We start this process by
calling parse_object() on the oid pointed to by each ref.
The cost of parsing each object adds up, especially in an output that
doesn't otherwise need to open the objects at all. Ideally we'd use
peel_iterated_oid() here, which uses the cached information in the
packed-refs file. But we can't, because our --points-at must match not
only the fully peeled value, but any interim values (so if tag A points
to tag B which points to commit C, we should match --points-at=B, but
peel_iterated_oid() will only tell us about C).
So the best we can do (absent changes to the packed-refs peel traits) is
to avoid parsing non-tags. The obvious way to do that is to call
oid_object_info() to check the type before parsing. But there are a few
gotchas there, like checking if the object has already been parsed.
Instead we can just tell parse_object() that we are OK skipping the hash
check, which lets it turn on several optimizations. Commits can be
loaded via the commit graph (so it's both fast and we have the benefit
of the parsed data if we need it later at the output stage). Blobs are
not loaded at all. Trees are still loaded, but it's rather rare to have
a ref point directly to a tree (and since this is just an optimization,
kicking in 99% of the time is OK).
Even though we're paying for an extra lookup, the cost to avoid parsing
the non-tags is a net benefit. In my git.git repository with 941 tags
and 1440 other refs pointing to commits, this significantly cuts the
runtime:
Benchmark 1: ./git.old for-each-ref --points-at=HEAD
Time (mean ± σ): 26.8 ms ± 0.5 ms [User: 24.5 ms, System: 2.2 ms]
Range (min … max): 25.9 ms … 29.2 ms 107 runs
Benchmark 2: ./git.new for-each-ref --points-at=HEAD
Time (mean ± σ): 9.1 ms ± 0.3 ms [User: 6.8 ms, System: 2.2 ms]
Range (min … max): 8.6 ms … 10.2 ms 308 runs
Summary
'./git.new for-each-ref --points-at=HEAD' ran
2.96 ± 0.10 times faster than './git.old for-each-ref --points-at=HEAD'
In a repository that is mostly annotated tags, we'd expect less
improvement (we might still skip a few object loads, but that's balanced
by the extra lookups). In my clone of linux.git, which has 782 tags and
3 branches, the run-time is about the same (it's actually ~1% faster on
average after this patch, but that's within the run-to-run noise).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we peel tags to check if they match a --points-at oid, we
recursively parse the tagged object to see if it is also a tag. But
since the tag itself tells us the type of the object it points to (and
even gives us the appropriate object struct via its "tagged" member), we
can use that directly.
We do still have to make sure to call parse_tag() before looking at each
tag. This is redundant for the outermost tag (since we did call
parse_object() to find its type), but that's OK; parse_tag() is smart
enough to make this a noop when the tag has already been parsed.
In my clone of linux.git, with 782 tags (and only 3 non-tags), this
yields a significant speedup (bringing us back where we were before the
commit before this one started recursively dereferencing tags):
Benchmark 1: ./git.old for-each-ref --points-at=HEAD --format="%(refname)"
Time (mean ± σ): 20.3 ms ± 0.5 ms [User: 11.1 ms, System: 9.1 ms]
Range (min … max): 19.6 ms … 21.5 ms 141 runs
Benchmark 2: ./git.new for-each-ref --points-at=HEAD --format="%(refname)"
Time (mean ± σ): 11.4 ms ± 0.2 ms [User: 6.3 ms, System: 5.0 ms]
Range (min … max): 11.0 ms … 12.2 ms 250 runs
Summary
'./git.new for-each-ref --points-at=HEAD --format="%(refname)"' ran
1.79 ± 0.05 times faster than './git.old for-each-ref --points-at=HEAD --format="%(refname)"'
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Tags are dereferenced until reaching a different object type to handle
nested tags, e.g. on checkout. In contrast, "git tag --points-at=..."
fails to list such nested tags because only one level of indirection is
obtained in filter_refs(). Implement the recursive dereferencing for the
"--points-at" option when filtering refs to unify the behaviour.
Signed-off-by: Jan Klötzke <jan@kloetzke.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git ls-files '(attr:X)D/'" that triggers the common prefix
optimization codepath failed to read from "D/.gitattributes",
which has been corrected.
* jc/pathspec-match-with-common-prefix:
dir: match "attr" pathspec magic with correct paths
t6135: attr magic with path pattern
|
|
Further shuffling of declarations across header files to streamline
file dependencies.
* cw/compat-util-header-cleanup:
git-compat-util: move alloc macros to git-compat-util.h
treewide: remove unnecessary includes for wrapper.h
kwset: move translation table from ctype
sane-ctype.h: create header for sane-ctype macros
git-compat-util: move wrapper.c funcs to its header
git-compat-util: move strbuf.c funcs to its header
|
|
Code snippets in a tutorial document no longer compiled after
recent header shuffling, which have been corrected.
* vd/adjust-mfow-doc-to-updated-headers:
docs: add necessary headers to Documentation/MFOW.txt
|
|
Code simplification.
* rs/ls-tree-prefix-simplify:
ls-tree: simplify prefix handling
|
|
Code simplification.
* rs/userformat-find-requirements-simplify:
pretty: use strchr(3) in userformat_find_requirements()
|
|
Code clarification.
* rs/pretty-format-double-negation-fix:
pretty: avoid double negative in format_commit_item()
|
|
Code simplification.
* rs/packet-length-simplify:
pkt-line: add size parameter to packet_length()
|
|
"git diff --no-index" learned to read from named pipes as if they
were regular files, to allow "git diff <(process) <(substitution)"
some shells support.
* pw/diff-no-index-from-named-pipes:
diff --no-index: support reading from named pipes
t4054: test diff --no-index with stdin
diff --no-index: die on error reading stdin
diff --no-index: refuse to compare stdin to a directory
|
|
Use the now common skip_prefix() cascade instead of a case statement to
parse the strftime(3) format in strbuf_addftime(). skip_prefix() parses
the "fmt" pointer and advances it appropriately, making additional
pointer arithmetic unnecessary. The resulting code is more compact and
consistent with most other strbuf_expand_step() loops.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a test introduced by 26c9c03f0a (ref-filter: add new "signature"
atom, 2023-06-04) the file named "file" is added by a setup step that
requires GPG and modified by a second setup step that requires GPGSSH.
Systems lacking the first prerequisite skip the initial setup step and
then "git commit -a" in the second one doesn't find the modified file.
Add it explicitly.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"imap-send" codepaths got cleaned up to get rid of unused
parameters.
* jk/imap-send-unused-variable-cleanup:
imap-send: drop unused fields from imap_cmd_cb
imap-send: drop unused parameter from imap_cmd_cb callback
imap-send: use server conf argument in setup_curl()
|
|
"git bugreport" tests did not test what it wanted to test, which
has been corrected.
* ma/t0091-fixup:
t0091-bugreport.sh: actually verify some content of report
|
|
The "git for-each-ref" family of commands learned placeholders
related to GPG signature verification.
* ks/ref-filter-signature:
ref-filter: add new "signature" atom
t/lib-gpg: introduce new prereq GPG2
|
|
I noticed this test was producing output like
```
t4002-diff-basic.sh: test_expect_successdiff can read from stdin: not found
```
which is rather odd. Investigation shows an error of shell syntax:
foo'abc' is the same as fooabc to the shell. Perhaps obviously, this is
not a valid command for the test.
I am surprised this doesn't count as an error in the test, but that
accounts for it going unnoticed.
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as previous commits, ensure that we don't overflow
when trying to read an OID out of an existing commit-graph during
verification.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as previous commits, ensure that we don't overflow
when trying to read an existing OID while writing a new commit-graph.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When merging two commit graphs, ensure that we don't attempt to merge
two graphs which, when combined, have more total commits than the 32-bit
unsigned maximum.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as previous commits, ensure that we don't overflow
when choosing how to split and merge different layers of the
commit-graph.
In particular, avoid a potential overflow between `size_mult` and
`num_commits`, as well as a potential overflow between the number of
commits currently in the merged graph, and the number of commits in the
graph about to be merged.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as previous commits, ensure that we don't overflow
when computing an offset into the commit_data chunk when the (relative)
graph position exceeds 2^32-1/GRAPH_DATA_WIDTH.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as previous commits, ensure that we don't overflow
when the lex_index of the commit we are trying to fill out exceeds
2^32-1/(g->hash_len+16).
The other hunk touched in this patch is not susceptible to overflow,
since an explicit cast is made to a 64-bit unsigned value. For clarity
and consistency with the rest of the commits in this series, avoid a
tricky to reason about cast, and use `st_mult()` directly.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as previous commits, ensure that we don't overflow
in a few spots within `fill_commit_graph_info()`:
- First, when computing an offset into the commit data chunk, which
can occur when the `lex_index` of the item we're looking up exceeds
2^32-1/GRAPH_DATA_WIDTH.
- A similar issue when computing the generation date offset for
commits with `lex_index` greater than 2^32-1/4. Note that in
practice this will never overflow, since the left-hand operand is
from calling `sizeof(...)` and is thus already a `size_t`. But wrap
that in an `st_mult()` to make it clear that we intend to perform
this computation using 64-bit operands.
- Finally, a nearly identical issue as above when computing an offset
into the `generation_data_overflow` chunk.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as previous commits, ensure that we don't overflow
when trying to compute an offset into the `chunk_oid_lookup` table when
the `lex_index` of the item we're trying to look up exceeds
`2^32-1/g->hash_len`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The commit-graph uses a fanout table with 4-byte entries to store the
number of commits at each shard of the commit-graph. So it is OK to have
a commit graph with as many as 2^32-1 stored commits. But we risk
overflowing any computation which may exceed the 32-bit (unsigned)
maximum when those computations are (incorrectly) performed using 32-bit
operands.
There are a couple of spots in `add_graph_to_chain()` where we could
potentially overflow the result:
- First, when comparing the list of existing entries in the
commit-graph chain. It is unlikely that this should ever overflow,
since it would require having roughly 2^32-1/g->hash_len
commit-graphs in the chain. But let's guard that computation with a
`st_mult()` just to be safe.
- Second, when computing the number of commits in the graph added to
the front of the chain. This value is also a 32-bit unsigned, but we
should make sure that it does not grow beyond the maximum value.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When writing a commit-graph, we use the chunk-format API to write out
each individual chunk of the commit-graph. Each chunk of the
commit-graph is tracked via a call to `add_chunk()`, along with the
expected size of that chunk.
Similar to an earlier commit which handled the identical issue in the
MIDX machinery, guard against overflow when dealing with a commit-graph
with a large number of entries to avoid corrupting the contents of the
commit-graph itself.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When a bitmap is used to answer some reachability query, it creates a
pseudo-bitmap called the "extended index" on top of any existing bitmaps
to store objects that are relevant to the query, but not mentioned in
the bitmap.
When looking up the ith object in the extended index in a bitmap, it is
common to write something like:
bitmap_get(result, i + bitmap_num_objects(bitmap_git))
, indicating that we want the ith object following all other objects
mentioned in the bitmap_git.
Since the type of `i` and the return type of `bitmap_num_objects()` are
both `uint32_t`s, But if there are either a large number of objects in
the bitmap, or a large number of objects in the extended index (or
both), this addition can overflow when the sum is greater than 2^32-1.
Having that large of a bitmap position is entirely acceptable, but we
need to ensure that the computed bitmap position for that object is
performed using 64-bits and doesn't overflow.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as in previous commits, avoid an integer overflow
when computing the expected size of a MIDX.
(Note that this is also OK as-is, since `p->pack_size` is an `off_t`, so
this computation should already be done as 64-bit integers. But again,
let's use `st_mult()` to make this fact clear).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When writing a MIDX, we use the chunk-format API to write out each
individual chunk of the MIDX. Each chunk of the MIDX is tracked via a
call to `add_chunk()`, along with the expected size of that chunk.
Guard against overflow when dealing with a MIDX with a large number of
entries (and consequently, large chunks within the MIDX file itself) to
avoid corrupting the contents of the MIDX itself.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the `write_midx_context` structure, we use two `uint32_t`'s to track
the length and allocated size of the packs, and one `uint32_t` to track
the number of objects in the MIDX.
In practice, having these be 32-bit unsigned values shouldn't cause any
problems since we are unlikely to have that many objects or packs in any
real-world repository. But these values should be `size_t`'s, so change
their type to reflect that.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as previous patches, avoid an overflow when looking
up object offsets in the MIDX's large offset table by guarding the
computation via `st_mult()`.
This instance is also OK as-is, since the left operand is the result of
`sizeof(...)`, which is already a `size_t`. But use `st_mult()` instead
here to make it explicit that this computation is to be performed using
64-bit unsigned integers.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as previous commits, avoid overflow when looking up
an object's OID in a MIDX when its position is greater than
`2^32-1/m->hash_len`.
As usual, it is perfectly OK for a MIDX to have as many as 2^32-1
objects (since we use 32-bit fields to count the number of objects at
each fanout layer). But if we have more than `2^32-1/m->hash_len` number
of objects, we will incorrectly perform the computation using 32-bit
integers, overflowing the result.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `midx_fanout` struct is used to keep track of a set of OIDs
corresponding to each layer of the MIDX's fanout table. It stores an
array of entries, along with the number of entries in the table, and the
allocated size of the array.
Both `nr` and `alloc` are stored as 32-bit unsigned integers. In
practice, this should never cause any problems, since most packs have
far fewer than 2^32-1 objects.
But storing these as `size_t`'s is more appropriate, and prevents us
from accidentally overflowing some result when multiplying or adding to
either of these values. Update these struct members to be `size_t`'s as
appropriate.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as the previous commits, ensure that we use
`st_add()` or `st_mult()` when computing values that may overflow the
32-bit unsigned limit.
Note that in each of these instances, we prevent 32-bit overflow
already since we have explicit casts to `size_t`.
So this code is OK as-is, but let's clarify it by using the `st_xyz()`
helpers to make it obvious that we are performing the relevant
computations using 64 bits.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Prevent an overflow when locating a pack's CRC offset when the number
of packed items is greater than 2^32-1/hashsz by guarding the
computation with an `st_mult()`.
Note that to avoid truncating the result, the `crc_offset` member must
itself become a `size_t`. The only usage of this variable (besides the
assignment in `load_idx()`) is in `read_v2_anomalous_offsets()` in the
index-pack code. There we use the `crc_offset` as a pointer offset, so
we are already equipped to handle the type change.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Many callback interfaces have an extra void data parameter, but we don't
always need it (especially for dumping functions like the ones in test
helpers). Mark them as unused to avoid -Wunused-parameter warnings.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We iterate over the set of input tag names using callbacks. But not all
operations need the same inputs, so some parameters go unused (but of
course not the same ones for each operation). Mark the unused ones to
avoid -Wunused-parameter warnings.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We don't need to use the "data" parameter in this instance. Let's mark
it to avoid -Wunused-parameter warnings.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We don't look at the "commit" parameter to our callback, as our
"mergetag_data" pointer contains the original name "ref", which we use
instead. But we can't get rid of it, since other for_each_mergetag
callbacks do use it. Let's mark the parameter to avoid
-Wunused-parameter warnings.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We don't look at the "flags" parameter, which is natural for something
that is just printing the contents of the replace refs. But let's mark
it to appease -Wunused-parameter.
This probably should have been part of 63e14ee2d6 (refs: mark unused
each_ref_fn parameters, 2022-08-19), but I missed it as this one is a
repo_each_ref_fn, which takes an extra repository argument.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Our threeway_callback() does not bother to look at its "n" parameter. It
is static in this file and used only by trivial_merge_trees(), which
always passes 3 trees (hence the name "threeway"). It also does not look
at "dirmask". This is OK, as it handles directories specifically by
looking at the mode bits.
Other traverse_info callbacks need these, so we can't get drop them from
the interface. But let's annotate these ones to avoid complaints from
-Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are a few callback functions which are used with the fsck code,
but it's natural that not all callbacks need all parameters. For
reporting, even something as obvious as "the oid of the object which had
a problem" is not always used, as some callers are only checking a
single object in the first place. And for both reporting and walking,
things like void data pointers and the fsck_options aren't always
necessary.
But since each such parameter is used by _some_ callback, we have to
keep them in the interface. Mark the unused ones in specific callbacks
to avoid triggering -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The setup_revision_opt struct has a "tweak" function pointer, which can
be used to adjust parameters after setup_revisions() parses arguments,
but before it finalizes setup. In addition to the rev_info struct, the
callback receives a pointer to the setup_revision_opt, as well.
But none of the existing callbacks looks at the extra "opt" parameter,
leading to -Wunused-parameter warnings.
We could mark it as UNUSED, but instead let's remove it entirely. It's
conceivable that it could be useful for a callback to have access to the
"opt" struct. But in the 13 years that this mechanism has existed,
nobody has used it. So let's just drop it in the name of simplifying.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Callbacks to for_each_altodb() get a void data pointer, but we don't
need it here. Mark it as unused to silence -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When parsing the input, we have a "keep_cr" parameter to tell us how to
handle line endings. But this doesn't apply to stgit or hg patches
(which are not mailbox formats where we have to worry about that), so we
ignore the parameter entirely in those functions.
Let's mark these as unused so that -Wunused-parameter does not complain
about them.
Note that we could just drop these parameters entirely. They are
necessary to conform to the mail_conv_fn interface used by
split_mail_conv(), but these two callbacks are the only ones used with
that function. The other formats (which _do_ care about keep_cr) use
split_mail_mbox(). But it's conceivable that we'd eventually add another
format that does care about this option, so let's leave it as part of
the generic interface.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|