Welcome to mirror list, hosted at ThFree Co, Russian Federation.

git.kernel.org/pub/scm/git/git.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-10-03pack-objects: allow `--filter` without `--stdout`Christian Couder
9535ce7337 (pack-objects: add list-objects filtering, 2017-11-21) taught `git pack-objects` to use `--filter`, but required the use of `--stdout` since a partial clone mechanism was not yet in place to handle missing objects. Since then, changes like 9e27beaa23 (promisor-remote: implement promisor_remote_get_direct(), 2019-06-25) and others added support to dynamically fetch objects that were missing. Even without a promisor remote, filtering out objects can also be useful if we can put the filtered out objects in a separate pack, and in this case it also makes sense for pack-objects to write the packfile directly to an actual file rather than on stdout. Remove the `--stdout` requirement when using `--filter`, so that in a follow-up commit, repack can pass `--filter` to pack-objects to omit certain objects from the resulting packfile. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13Merge branch 'jk/unused-post-2.42-part2'Junio C Hamano
Unused parameters to functions are marked as such, and/or removed, in order to bring us closer to -Wunused-parameter clean. * jk/unused-post-2.42-part2: parse-options: mark unused parameters in noop callback interpret-trailers: mark unused "unset" parameters in option callbacks parse-options: add more BUG_ON() annotations merge: do not pass unused opt->value parameter parse-options: mark unused "opt" parameter in callbacks parse-options: prefer opt->value to globals in callbacks checkout-index: delay automatic setting of to_tempfile format-patch: use OPT_STRING_LIST for to/cc options merge: simplify parsing of "-n" option merge: make xopts a strvec
2023-09-06parse-options: mark unused "opt" parameter in callbacksJeff King
The previous commit argued that parse-options callbacks should try to use opt->value rather than touching globals directly. In some cases, however, that's awkward to do. Some callbacks touch multiple variables, or may even just call into an abstracted function that does so. In some of these cases we _could_ convert them by stuffing the multiple variables into a single struct and passing the struct pointer through opt->value. But that may make other parts of the code less readable, as the struct relationship has to be mentioned everywhere. Let's just accept that these cases are special and leave them as-is. But we do need to mark their "opt" parameters to satisfy -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-06parse-options: prefer opt->value to globals in callbacksJeff King
We have several parse-options callbacks that ignore their "opt" parameters entirely. This is a little unusual, as we'd normally put the result of the parsing into opt->value. In the case of these callbacks, though, they directly manipulate global variables instead (and in most cases the caller sets opt->value to NULL in the OPT_CALLBACK declaration). The immediate symptom we'd like to deal with is that the unused "opt" variables trigger -Wunused-parameter. But how to fix that is debatable. One option is to annotate them with UNUSED. But another is to have the caller pass in the appropriate variable via opt->value, and use it. That has the benefit of making the callbacks reusable (in theory at least), and makes it clear from the OPT_CALLBACK declaration which variables will be affected (doubly so for the cases in builtin/fast-export.c, where we do set opt->value, but it is completely ignored!). The slight downside is that we lose type safety, since they're now passing through void pointers. I went with the "just use them" approach here. The loss of type safety is unfortunate, but that is already an issue with most of the other callbacks. If we want to try to address that, we should do so more consistently (and this patch would prepare these callbacks for whatever we choose to do there). Note that in the cases in builtin/fast-export.c, we are passing anonymous enums. We'll have to give them names so that we can declare the appropriate pointer type within the callbacks. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29builtin/pack-objects.c: support `--max-pack-size` with `--cruft`Taylor Blau
When pack-objects learned the `--cruft` option back in b757353676 (builtin/pack-objects.c: --cruft without expiration, 2022-05-20), we explicitly forbade `--cruft` with `--max-pack-size`. At the time, there was no specific rationale given in the patch for not supporting the `--max-pack-size` option with `--cruft`. (As best I can remember, it's because we were trying to push users towards only ever having a single cruft pack, but I cannot be sure). However, `--max-pack-size` is flexible enough that it already works with `--cruft` and can shard unreachable objects across multiple cruft packs, creating separate ".mtimes" files as appropriate. In fact, the `--max-pack-size` option worked with `--cruft` as far back as b757353676! This is because we overwrite the `written_list`, and pass down the appropriate length, i.e. the number of objects written in each pack shard. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29builtin/pack-objects.c: remove unnecessary strbuf_reset()Taylor Blau
When reading input with the `--cruft` option, `git pack-objects` reads each line into a strbuf, and then moves it to either the list of discarded or fresh packs, depending on whether or not the input line starts with a '-' character. At the beginning of each loop iteration, the next line of input is read with `strbuf_getline()`, which calls `strbuf_reset()` (as a part of `strbuf_getwholeline()`) before reading the next line of input. Thus, the call to `strbuf_reset()` (added back in b757353676 (builtin/pack-objects.c: --cruft without expiration, 2022-05-20)) at the end of the loop is unnecessary, so let's remove it accordingly. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-28Merge branch 'rs/pack-objects-parseopt-fix'Junio C Hamano
Command line parser fix. * rs/pack-objects-parseopt-fix: pack-objects: fix --no-quiet pack-objects: fix --no-keep-true-parents
2023-07-21pack-objects: fix --no-quietRené Scharfe
Since 99fb6e04cb (pack-objects: convert to use parse_options(), 2012-02-01) git pack-objects has accepted the option --no-quiet, but it does the same as --quiet. That's because it's defined using OPT_SET_INT with a value of 0, which sets 0 when negated, too. Make --no-quiet equivalent to --progress and ignore it if --all-progress was given. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-21pack-objects: fix --no-keep-true-parentsRené Scharfe
Since 99fb6e04cb (pack-objects: convert to use parse_options(), 2012-02-01) git pack-objects has accepted --no-keep-true-parents, but this option does the same as --keep-true-parents. That's because it's defined using OPT_SET_INT with a value of 0, which sets 0 when negated as well. Turn --no-keep-true-parents into the opposite of --keep-true-parents by using OPT_BOOL and storing the option's status directly in a variable named "grafts_keep_true_parents" instead of in negative form in "grafts_replace_parents". Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-17Merge branch 'cw/compat-util-header-cleanup'Junio C Hamano
Further shuffling of declarations across header files to streamline file dependencies. * cw/compat-util-header-cleanup: git-compat-util: move alloc macros to git-compat-util.h treewide: remove unnecessary includes for wrapper.h kwset: move translation table from ctype sane-ctype.h: create header for sane-ctype macros git-compat-util: move wrapper.c funcs to its header git-compat-util: move strbuf.c funcs to its header
2023-07-06Merge branch 'gc/config-context'Junio C Hamano
Reduce reliance on a global state in the config reading API. * gc/config-context: config: pass source to config_parser_event_fn_t config: add kvi.path, use it to evaluate includes config.c: remove config_reader from configsets config: pass kvi to die_bad_number() trace2: plumb config kvi config.c: pass ctx with CLI config config: pass ctx with config files config.c: pass ctx in configsets config: add ctx arg to config_fn_t urlmatch.h: use config_fn_t type config: inline git_color_default_config
2023-07-05git-compat-util: move alloc macros to git-compat-util.hCalvin Wan
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for dynamic array allocation. Moving these macros to git-compat-util.h with the other alloc macros focuses alloc.[ch] to allocation for Git objects and additionally allows us to remove inclusions to alloc.h from files that solely used the above macros. Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05treewide: remove unnecessary includes for wrapper.hCalvin Wan
Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-30Merge branch 'en/header-split-cache-h-part-3'Junio C Hamano
Header files cleanup. * en/header-split-cache-h-part-3: (28 commits) fsmonitor-ll.h: split this header out of fsmonitor.h hash-ll, hashmap: move oidhash() to hash-ll object-store-ll.h: split this header out of object-store.h khash: name the structs that khash declares merge-ll: rename from ll-merge git-compat-util.h: remove unneccessary include of wildmatch.h builtin.h: remove unneccessary includes list-objects-filter-options.h: remove unneccessary include diff.h: remove unnecessary include of oidset.h repository: remove unnecessary include of path.h log-tree: replace include of revision.h with simple forward declaration cache.h: remove this no-longer-used header read-cache*.h: move declarations for read-cache.c functions from cache.h repository.h: move declaration of the_index from cache.h merge.h: move declarations for merge.c from cache.h diff.h: move declaration for global in diff.c from cache.h preload-index.h: move declarations for preload-index.c from elsewhere sparse-index.h: move declarations for sparse-index.c from cache.h name-hash.h: move declarations for name-hash.c from cache.h run-command.h: move declarations for run-command.c from cache.h ...
2023-06-29config: pass kvi to die_bad_number()Glen Choo
Plumb "struct key_value_info" through all code paths that end in die_bad_number(), which lets us remove the helper functions that read analogous values from "struct config_reader". As a result, nothing reads config_reader.config_kvi any more, so remove that too. In config.c, this requires changing the signature of git_configset_get_value() to 'return' "kvi" in an out parameter so that git_configset_get_<type>() can pass it to git_config_<type>(). Only numeric types will use "kvi", so for non-numeric types (e.g. git_configset_get_string()), pass NULL to indicate that the out parameter isn't needed. Outside of config.c, config callbacks now need to pass "ctx->kvi" to any of the git_config_<type>() functions that parse a config string into a number type. Included is a .cocci patch to make that refactor. The only exceptional case is builtin/config.c, where git_config_<type>() is called outside of a config callback (namely, on user-provided input), so config source information has never been available. In this case, die_bad_number() defaults to a generic, but perfectly descriptive message. Let's provide a safe, non-NULL for "kvi" anyway, but make sure not to change the message. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29config: add ctx arg to config_fn_tGlen Choo
Add a new "const struct config_context *ctx" arg to config_fn_t to hold additional information about the config iteration operation. config_context has a "struct key_value_info kvi" member that holds metadata about the config source being read (e.g. what kind of config source it is, the filename, etc). In this series, we're only interested in .kvi, so we could have just used "struct key_value_info" as an arg, but config_context makes it possible to add/adjust members in the future without changing the config_fn_t signature. We could also consider other ways of organizing the args (e.g. moving the config name and value into config_context or key_value_info), but in my experiments, the incremental benefit doesn't justify the added complexity (e.g. a config_fn_t will sometimes invoke another config_fn_t but with a different config value). In subsequent commits, the .kvi member will replace the global "struct config_reader" in config.c, making config iteration a global-free operation. It requires much more work for the machinery to provide meaningful values of .kvi, so for now, merely change the signature and call sites, pass NULL as a placeholder value, and don't rely on the arg in any meaningful way. Most of the changes are performed by contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every config_fn_t: - Modifies the signature to accept "const struct config_context *ctx" - Passes "ctx" to any inner config_fn_t, if needed - Adds UNUSED attributes to "ctx", if needed Most config_fn_t instances are easily identified by seeing if they are called by the various config functions. Most of the remaining ones are manually named in the .cocci patch. Manual cleanups are still needed, but the majority of it is trivial; it's either adjusting config_fn_t that the .cocci patch didn't catch, or adding forward declarations of "struct config_context ctx" to make the signatures make sense. The non-trivial changes are in cases where we are invoking a config_fn_t outside of config machinery, and we now need to decide what value of "ctx" to pass. These cases are: - trace2/tr2_cfg.c:tr2_cfg_set_fl() This is indirectly called by git_config_set() so that the trace2 machinery can notice the new config values and update its settings using the tr2 config parsing function, i.e. tr2_cfg_cb(). - builtin/checkout.c:checkout_main() This calls git_xmerge_config() as a shorthand for parsing a CLI arg. This might be worth refactoring away in the future, since git_xmerge_config() can call git_default_config(), which can do much more than just parsing. Handle them by creating a KVI_INIT macro that initializes "struct key_value_info" to a reasonable default, and use that to construct the "ctx" arg. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-23Merge branch 'ds/disable-replace-refs'Junio C Hamano
Introduce a mechanism to disable replace refs globally and per repository. * ds/disable-replace-refs: repository: create read_replace_refs setting replace-objects: create wrapper around setting repository: create disable_replace_refs()
2023-06-21object-store-ll.h: split this header out of object-store.hElijah Newren
The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h. After this patch: $ git grep -h include..object-store | sort | uniq -c 2 #include "object-store.h" 129 #include "object-store-ll.h" Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12repository: create disable_replace_refs()Derrick Stolee
Several builtins depend on being able to disable the replace references so we actually operate on each object individually. These currently do so by directly mutating the 'read_replace_refs' global. A future change will move this global into a different place, so it will be necessary to change all of these lines. However, we can simplify that transition by abstracting the purpose of these global assignments with a method call. We will need to keep this read_replace_refs global forever, as we want to make sure that we never use replace refs throughout the life of the process if this method is called. Future changes may present a repository-scoped version of the variable to represent that repository's core.useReplaceRefs config value, but a zero-valued read_replace_refs will always override such a setting. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-07attr: teach "--attr-source=<tree>" global option to "git"John Cai
Earlier, 47cfc9bd (attr: add flag `--source` to work with tree-ish, 2023-01-14) taught "git check-attr" the "--source=<tree>" option to allow it to read attribute files from a tree-ish, but did so only for the command. Just like "check-attr" users wanted a way to use attributes from a tree-ish and not from the working tree files, users of other commands (like "git diff") would benefit from the same. Undo most of the UI change the commit made, while keeping the internal logic to read attributes from a given tree-ish. Expose the internal logic via a new "--attr-source=<tree>" command line option given to "git", so that it can be used with any git command that runs as part of the main git process. Additionally, add an environment variable GIT_ATTR_SOURCE that is set when --attr-source is passed in, so that subprocesses use the same value for the attributes source tree. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-28Merge branch 'tb/pack-revindex-on-disk'Junio C Hamano
The on-disk reverse index that allows mapping from the pack offset to the object name for the object stored at the offset has been enabled by default. * tb/pack-revindex-on-disk: t: invert `GIT_TEST_WRITE_REV_INDEX` config: enable `pack.writeReverseIndex` by default pack-revindex: introduce `pack.readReverseIndex` pack-revindex: introduce GIT_TEST_REV_INDEX_DIE_ON_DISK pack-revindex: make `load_pack_revindex` take a repository t5325: mark as leak-free pack-write.c: plug a leak in stage_tmp_packfiles()
2023-04-25Merge branch 'ps/fix-geom-repack-with-alternates'Junio C Hamano
Geometric repacking ("git repack --geometric=<n>") in a repository that borrows from an alternate object database had various corner case bugs, which have been corrected. * ps/fix-geom-repack-with-alternates: repack: disable writing bitmaps when doing a local repack repack: honor `-l` when calculating pack geometry t/helper: allow chmtime to print verbosely without modifying mtime pack-objects: extend test coverage of `--stdin-packs` with alternates pack-objects: fix error when same packfile is included and excluded pack-objects: fix error when packing same pack twice pack-objects: split out `--stdin-packs` tests into separate file repack: fix generating multi-pack-index with only non-local packs repack: fix trying to use preferred pack in alternates midx: fix segfault with no packs and invalid preferred pack
2023-04-25Merge branch 'en/header-split-cache-h'Junio C Hamano
Header clean-up. * en/header-split-cache-h: (24 commits) protocol.h: move definition of DEFAULT_GIT_PORT from cache.h mailmap, quote: move declarations of global vars to correct unit treewide: reduce includes of cache.h in other headers treewide: remove double forward declaration of read_in_full cache.h: remove unnecessary includes treewide: remove cache.h inclusion due to pager.h changes pager.h: move declarations for pager.c functions from cache.h treewide: remove cache.h inclusion due to editor.h changes editor: move editor-related functions and declarations into common file treewide: remove cache.h inclusion due to object.h changes object.h: move some inline functions and defines from cache.h treewide: remove cache.h inclusion due to object-file.h changes object-file.h: move declarations for object-file.c functions from cache.h treewide: remove cache.h inclusion due to git-zlib changes git-zlib: move declarations for git-zlib functions from cache.h treewide: remove cache.h inclusion due to object-name.h changes object-name.h: move declarations for object-name.c functions from cache.h treewide: remove unnecessary cache.h inclusion treewide: be explicit about dependence on mem-pool.h treewide: be explicit about dependence on oid-array.h ...
2023-04-14pack-objects: fix error when same packfile is included and excludedPatrick Steinhardt
When passing the same packfile both as included and excluded via the `--stdin-packs` option, then we will return an error because the excluded packfile cannot be found. This is because we will only set the `util` pointer for the included packfile list if it was found, so that we later die when we notice that it's in fact not set for the excluded packfile list. Fix this bug by always setting the `util` pointer for both the included and excluded list entries. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14pack-objects: fix error when packing same pack twicePatrick Steinhardt
When passed the same packfile twice via `--stdin-packs` we return an error that the packfile supposedly was not found. This is because when reading packs into the list of included or excluded packfiles, we will happily re-add packfiles even if they are part of the lists already. And while the list can now contain duplicates, we will only set the `util` pointer of the first list entry to the `packed_git` structure. We notice that at a later point when checking that all list entries have their `util` pointer set and die with an error. While this is kind of a nonsensical request, this scenario can be hit when doing geometric repacks. When a repository is connected to an alternate object directory and both have the exact same packfile then both would get added to the geometric sequence. And when we then decide to perform the repack, we will invoke git-pack-objects(1) with the same packfile twice. Fix this bug by removing any duplicates from both the included and excluded packs. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-13t: invert `GIT_TEST_WRITE_REV_INDEX`Taylor Blau
Back in e8c58f894b (t: support GIT_TEST_WRITE_REV_INDEX, 2021-01-25), we added a test knob to conditionally enable writing a ".rev" file when indexing a pack. At the time, this was used to ensure that the test suite worked even when ".rev" files were written, which served as a stress-test for the on-disk reverse index implementation. Now that reading from on-disk ".rev" files is enabled by default, the test knob `GIT_TEST_WRITE_REV_INDEX` no longer has any meaning. We could get rid of the option entirely, but there would be no convenient way to test Git when ".rev" files *aren't* in place. Instead of getting rid of the option, invert its meaning to instead disable writing ".rev" files, thereby running the test suite in a mode where the reverse index is generated from scratch. This ensures that, when GIT_TEST_NO_WRITE_REV_INDEX is set to some spelling of "true", we are still running and exercising Git's behavior when forced to generate reverse indexes from scratch. Do so by setting it in the linux-TEST-vars CI run to ensure that we are maintaining good coverage of this now-legacy code. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-13config: enable `pack.writeReverseIndex` by defaultTaylor Blau
Back in e37d0b8730 (builtin/index-pack.c: write reverse indexes, 2021-01-25), Git learned how to read and write a pack's reverse index from a file instead of in-memory. A pack's reverse index is a mapping from pack position (that is, the order that objects appear together in a ".pack") to their position in lexical order (that is, the order that objects are listed in an ".idx" file). Reverse indexes are consulted often during pack-objects, as well as during auxiliary operations that require mapping between pack offsets, pack order, and index index. They are useful in GitHub's infrastructure, where we have seen a dramatic increase in performance when writing ".rev" files[1]. In particular: - an ~80% reduction in the time it takes to serve fetches on a popular repository, Homebrew/homebrew-core. - a ~60% reduction in the peak memory usage to serve fetches on that same repository. - a collective savings of ~35% in CPU time across all pack-objects invocations serving fetches across all repositories in a single datacenter. Reverse indexes are also beneficial to end-users as well as forges. For example, the time it takes to generate a pack containing the objects for the 10 most recent commits in linux.git (representing a typical push) is significantly faster when on-disk reverse indexes are available: $ { git rev-parse HEAD && printf '^' && git rev-parse HEAD~10 } >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} pack-objects --delta-base-offset --revs --stdout <in >/dev/null' Benchmark 1: git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 543.0 ms ± 20.3 ms [User: 616.2 ms, System: 58.8 ms] Range (min … max): 521.0 ms … 577.9 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 245.0 ms ± 11.4 ms [User: 335.6 ms, System: 31.3 ms] Range (min … max): 226.0 ms … 259.6 ms 13 runs Summary 'git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null' ran 2.22 ± 0.13 times faster than 'git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null' The same is true of writing a pack containing the objects for the 30 most-recent commits: $ { git rev-parse HEAD && printf '^' && git rev-parse HEAD~30 } >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} pack-objects --delta-base-offset --revs --stdout <in >/dev/null' Benchmark 1: git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 866.5 ms ± 16.2 ms [User: 1414.5 ms, System: 97.0 ms] Range (min … max): 839.3 ms … 886.9 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 581.6 ms ± 10.2 ms [User: 1181.7 ms, System: 62.6 ms] Range (min … max): 567.5 ms … 599.3 ms 10 runs Summary 'git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null' ran 1.49 ± 0.04 times faster than 'git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null' ...and savings on trivial operations like computing the on-disk size of a single (packed) object are even more dramatic: $ git rev-parse HEAD >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --batch-check="%(objectsize:disk)" <in' Benchmark 1: git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in Time (mean ± σ): 305.8 ms ± 11.4 ms [User: 264.2 ms, System: 41.4 ms] Range (min … max): 290.3 ms … 331.1 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in Time (mean ± σ): 4.0 ms ± 0.3 ms [User: 1.7 ms, System: 2.3 ms] Range (min … max): 1.6 ms … 4.6 ms 1155 runs Summary 'git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in' ran 76.96 ± 6.25 times faster than 'git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in' In the more than two years since e37d0b8730 was merged, Git's implementation of on-disk reverse indexes has been thoroughly tested, both from users enabling `pack.writeReverseIndexes`, and from GitHub's deployment of the feature. The latter has been running without incident for more than two years. This patch changes Git's behavior to write on-disk reverse indexes by default when indexing a pack, which should make the above operations faster for everybody's Git installation after a repack. (The previous commit explains some potential drawbacks of using on-disk reverse indexes in certain limited circumstances, that essentially boil down to a trade-off between time to generate, and time to access. For those limited cases, the `pack.readReverseIndex` escape hatch can be used). [1]: https://github.blog/2021-04-29-scaling-monorepo-maintenance/#reverse-indexes Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11object-file.h: move declarations for object-file.c functions from cache.hElijah Newren
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-06Merge branch 'en/header-split-cleanup'Junio C Hamano
Split key function and data structure definitions out of cache.h to new header files and adjust the users. * en/header-split-cleanup: csum-file.h: remove unnecessary inclusion of cache.h write-or-die.h: move declarations for write-or-die.c functions from cache.h treewide: remove cache.h inclusion due to setup.h changes setup.h: move declarations for setup.c functions from cache.h treewide: remove cache.h inclusion due to environment.h changes environment.h: move declarations for environment.c functions from cache.h treewide: remove unnecessary includes of cache.h wrapper.h: move declarations for wrapper.c functions from cache.h path.h: move function declarations for path.c functions from cache.h cache.h: remove expand_user_path() abspath.h: move absolute path functions from cache.h environment: move comment_line_char from cache.h treewide: remove unnecessary cache.h inclusion from several sources treewide: remove unnecessary inclusion of gettext.h treewide: be explicit about dependence on gettext.h treewide: remove unnecessary cache.h inclusion from a few headers
2023-04-06Merge branch 'ab/remove-implicit-use-of-the-repository'Junio C Hamano
Code clean-up around the use of the_repository. * ab/remove-implicit-use-of-the-repository: libs: use "struct repository *" argument, not "the_repository" post-cocci: adjust comments for recent repo_* migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-04Merge branch 'ab/remove-implicit-use-of-the-repository' into ↵Junio C Hamano
en/header-split-cache-h * ab/remove-implicit-use-of-the-repository: libs: use "struct repository *" argument, not "the_repository" post-cocci: adjust comments for recent repo_* migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"
2023-03-30Merge branch 'sg/parse-options-h-users'Junio C Hamano
Code clean-up to include and/or uninclude parse-options.h file as needed. * sg/parse-options-h-users: treewide: remove unnecessary inclusions of parse-options.h from headers treewide: include parse-options.h in source files
2023-03-28cocci: apply the "promisor-remote.h" part of "the_repository.pending"Ævar Arnfjörð Bjarmason
Apply the part of "the_repository.pending.cocci" pertaining to "promisor-remote.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28cocci: apply the "object-store.h" part of "the_repository.pending"Ævar Arnfjörð Bjarmason
Apply the part of "the_repository.pending.cocci" pertaining to "object-store.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21environment.h: move declarations for environment.c functions from cache.hElijah Newren
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21wrapper.h: move declarations for wrapper.c functions from cache.hElijah Newren
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21treewide: be explicit about dependence on gettext.hElijah Newren
Dozens of files made use of gettext functions, without explicitly including gettext.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include gettext.h if they are using it. However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an include of gettext.h, it was left out to avoid conflicting with an in-flight topic. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20treewide: include parse-options.h in source filesSZEDER Gábor
The builtins 'ls-remote', 'pack-objects', 'receive-pack', 'reflog' and 'send-pack' use parse_options(), but their source files don't directly include 'parse-options.h'. Furthermore, the source files 'diagnose.c', 'list-objects-filter-options.c', 'remote.c' and 'send-pack.c' define option parsing callback functions, while 'revision.c' defines an option parsing helper function, and thus need access to various fields in 'struct option' and 'struct parse_opt_ctx_t', but they don't directly include 'parse-options.h' either. They all can still be built, of course, because they include one of the header files that does include 'parse-options.h' (though unnecessarily, see the next commit). Add those missing includes to these files, as our general rule is that "a C file must directly include the header files that declare the functions and the types it uses". Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-18Merge branch 'jk/unused-post-2.39-part2'Junio C Hamano
More work towards -Wunused. * jk/unused-post-2.39-part2: (21 commits) help: mark unused parameter in git_unknown_cmd_config() run_processes_parallel: mark unused callback parameters userformat_want_item(): mark unused parameter for_each_commit_graft(): mark unused callback parameter rewrite_parents(): mark unused callback parameter fetch-pack: mark unused parameter in callback function notes: mark unused callback parameters prio-queue: mark unused parameters in comparison functions for_each_object: mark unused callback parameters list-objects: mark unused callback parameters mark unused parameters in signal handlers run-command: mark error routine parameters as unused mark "pointless" data pointers in callbacks ref-filter: mark unused callback parameters http-backend: mark unused parameters in virtual functions http-backend: mark argc/argv unused object-name: mark unused parameters in disambiguate callbacks serve: mark unused parameters in virtual functions serve: use repository pointer to get config ls-refs: drop config caching ...
2023-02-24for_each_object: mark unused callback parametersJeff King
The for_each_{loose,packed}_object interface uses callback functions, but not every callback needs all of the parameters. Mark the unused ones to satisfy -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24list-objects: mark unused callback parametersJeff King
Our graph-traversal functions take callbacks for showing commits and objects, but not all callbacks need each parameter. Likewise for the similar traverse_bitmap_commit_list(), which has a different interface but serves the same purpose. And the include_check mechanism, which passes along a void pointer which is not always used. Mark the unused ones to to make -Wunused-parameter happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24replace-object.h: move read_replace_refs declaration from cache.h to hereElijah Newren
Adjust several files to be more explicit about their dependency on replace-objects to accommodate this change. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24cache.h: remove dependence on hex.h; make other files include it explicitlyElijah Newren
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24alloc.h: move ALLOC_GROW() functions from cache.hElijah Newren
This allows us to replace includes of cache.h with includes of the much smaller alloc.h in many places. It does mean that we also need to add includes of alloc.h in a number of C files. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-16Merge branch 'rs/size-t-fixes'Junio C Hamano
Type fixes. * rs/size-t-fixes: pack-objects: use strcspn(3) in name_cmp_len() read-cache: use size_t for {base,df}_name_compare()
2023-02-10Merge branch 'ew/free-island-marks'Junio C Hamano
"git pack-objects" learned to release delta-island bitmap data when it is done using it, saving peak heap memory usage. * ew/free-island-marks: delta-islands: free island_marks and bitmaps
2023-02-07pack-objects: use strcspn(3) in name_cmp_len()René Scharfe
Call strcspn(3) to find the length of a string terminated by NUL, NL or slash instead of open-coding it. Adopt its return type, size_t, to support strings of arbitrary length. Use that type in callers as well for variables and function parameters that receive the return value. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-04delta-islands: free island_marks and bitmapsEric Wong
On my mirror of linux.git forkgroup with 780 islands, this saves nearly 4G of heap memory in pack-objects. This savings only benefits delta island users of pack bitmaps, as the process would otherwise be exiting anyways. However, there's probably not many delta island users, but the majority of delta island users would also be pack bitmaps users. Signed-off-by: Eric Wong <e@80x24.org> Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-14attr: add flag `--source` to work with tree-ishKarthik Nayak
The contents of the .gitattributes files may evolve over time, but "git check-attr" always checks attributes against them in the working tree and/or in the index. It may be beneficial to optionally allow the users to check attributes taken from a commit other than HEAD against paths. Add a new flag `--source` which will allow users to check the attributes against a commit (actually any tree-ish would do). When the user uses this flag, we go through the stack of .gitattributes files but instead of checking the current working tree and/or in the index, we check the blobs from the provided tree-ish object. This allows the command to also be used in bare repositories. Since we use a tree-ish object, the user can pass "--source HEAD:subdirectory" and all the attributes will be looked up as if subdirectory was the root directory of the repository. We cannot simply use the `<rev>:<path>` syntax without the `--source` flag, similar to how it is used in `git show` because any non-flag parameter before `--` is treated as an attribute and any parameter after `--` is treated as a pathname. The change involves creating a new function `read_attr_from_blob`, which given the path reads the blob for the path against the provided source and parses the attributes line by line. This function is plugged into `read_attr()` function wherein we go through the stack of attributes files. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Toon Claes <toon@iotcl.com> Co-authored-by: toon@iotcl.com Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30pack-objects: simplify --filter handlingRené Scharfe
pack-objects uses OPT_PARSE_LIST_OBJECTS_FILTER_INIT() to initialize the a rev_info struct lazily before populating its filter member using the --filter option values. It tracks whether the initialization is needed using the .have_revs member of the callback data. There is a better way: Use a stand-alone list_objects_filter_options struct and build a rev_info struct with its .filter member after option parsing. This allows using the simpler OPT_PARSE_LIST_OBJECTS_FILTER() and getting rid of the extra callback mechanism. Even simpler would be using a struct rev_info as before 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't leak, 2022-03-28), but that would expose a memory leak caused by repo_init_revisions() followed by release_revisions() without a setup_revisions() call in between. Using list_objects_filter_options also allows pushing the rev_info struct into get_object_list(), where it arguably belongs. Either way, this is all left for later. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>