From de41d03e1c7ab73174716c99b8eaf7ff5884d6bb Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Wed, 12 Jul 2023 19:37:27 -0400 Subject: packfile.c: prevent overflow in `nth_packed_object_id()` MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In 37fec86a83 (packfile: abstract away hash constant values, 2018-05-02), `nth_packed_object_id()` started using the variable `the_hash_algo->rawsz` instead of a fixed constant when trying to compute an offset into the ".idx" file for some object position. This can lead to surprising truncation when looking for an object towards the end of a large enough pack, like the following: (gdb) p hashsz $1 = 20 (gdb) p n $2 = 215043814 (gdb) p hashsz * n $3 = 5908984 , which is a debugger session broken on a known-bad call to the `nth_packed_object_id()` function. This behavior predates 37fec86a83, and is original to the v2 index format, via: 74e34e1fca (sha1_file.c: learn about index version 2, 2007-04-09). This is due to §6.4.4.1 of the C99 standard, which states that an untyped integer constant will take the first type in which the value can be accurately represented, among `int`, `long int`, and `long long int`. Since 20 can be represented as an `int`, and `n` is a 32-bit unsigned integer, the resulting computation is defined by §6.3.1.8, and the (signed) integer value representing `n` is converted to an unsigned type, meaning that `20 * n` (for `n` having type `uint32_t`) is equivalent to a multiplication between two unsigned 32-bit integers. When multiplying a sufficiently large `n`, the resulting value can exceed 2^32-1, wrapping around and producing an invalid result. Let's follow the example in f86f769550e (compute pack .idx byte offsets using size_t, 2020-11-13) and replace this computation with `st_mult()`, which will ensure that the computation is done using 64-bits. While here, guard the corresponding computation for packs with v1 indexes, too. Though the likelihood of seeing a bug there is much smaller, since (a) v1 indexes are generated far less frequently than v2 indexes, and (b) they all correspond to packs no larger than 2 GiB, so having enough objects to trigger this overflow is unlikely if not impossible. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- packfile.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'packfile.c') diff --git a/packfile.c b/packfile.c index fd083c86e0..5ee67de569 100644 --- a/packfile.c +++ b/packfile.c @@ -1920,10 +1920,10 @@ int nth_packed_object_id(struct object_id *oid, return -1; index += 4 * 256; if (p->index_version == 1) { - oidread(oid, index + (hashsz + 4) * n + 4); + oidread(oid, index + st_add(st_mult(hashsz + 4, n), 4)); } else { index += 8; - oidread(oid, index + hashsz * n); + oidread(oid, index + st_mult(hashsz, n)); } return 0; } -- cgit v1.2.3 From 42be681b33ef73be056fb99e3c63c6e9b9c2e7ef Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Thu, 13 Jul 2023 20:54:54 -0400 Subject: packfile.c: prevent overflow in `load_idx()` Prevent an overflow when locating a pack's CRC offset when the number of packed items is greater than 2^32-1/hashsz by guarding the computation with an `st_mult()`. Note that to avoid truncating the result, the `crc_offset` member must itself become a `size_t`. The only usage of this variable (besides the assignment in `load_idx()`) is in `read_v2_anomalous_offsets()` in the index-pack code. There we use the `crc_offset` as a pointer offset, so we are already equipped to handle the type change. Helped-by: Phillip Wood Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- packfile.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'packfile.c') diff --git a/packfile.c b/packfile.c index 5ee67de569..efe4a22c63 100644 --- a/packfile.c +++ b/packfile.c @@ -186,7 +186,7 @@ int load_idx(const char *path, const unsigned int hashsz, void *idx_map, */ (sizeof(off_t) <= 4)) return error("pack too large for current definition of off_t in %s", path); - p->crc_offset = 8 + 4 * 256 + nr * hashsz; + p->crc_offset = st_add(8 + 4 * 256, st_mult(nr, hashsz)); } p->index_version = version; -- cgit v1.2.3 From a519abca02eeca7dce864717b9664c62a124e1c0 Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Wed, 12 Jul 2023 19:37:32 -0400 Subject: packfile.c: use checked arithmetic in `nth_packed_object_offset()` In a similar spirit as the previous commits, ensure that we use `st_add()` or `st_mult()` when computing values that may overflow the 32-bit unsigned limit. Note that in each of these instances, we prevent 32-bit overflow already since we have explicit casts to `size_t`. So this code is OK as-is, but let's clarify it by using the `st_xyz()` helpers to make it obvious that we are performing the relevant computations using 64 bits. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- packfile.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) (limited to 'packfile.c') diff --git a/packfile.c b/packfile.c index efe4a22c63..70837b0d26 100644 --- a/packfile.c +++ b/packfile.c @@ -1948,14 +1948,15 @@ off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n) const unsigned int hashsz = the_hash_algo->rawsz; index += 4 * 256; if (p->index_version == 1) { - return ntohl(*((uint32_t *)(index + (hashsz + 4) * (size_t)n))); + return ntohl(*((uint32_t *)(index + st_mult(hashsz + 4, n)))); } else { uint32_t off; - index += 8 + (size_t)p->num_objects * (hashsz + 4); - off = ntohl(*((uint32_t *)(index + 4 * n))); + index += st_add(8, st_mult(p->num_objects, hashsz + 4)); + off = ntohl(*((uint32_t *)(index + st_mult(4, n)))); if (!(off & 0x80000000)) return off; - index += (size_t)p->num_objects * 4 + (off & 0x7fffffff) * 8; + index += st_add(st_mult(p->num_objects, 4), + st_mult(off & 0x7fffffff, 8)); check_pack_index_ptr(p, index); return get_be64(index); } -- cgit v1.2.3