git.kernel.org/pub/scm/git/git.git
author     Ævar Arnfjörð Bjarmason <avarab@gmail.com>  2021-10-01 12:16:53 +0300
committer  Junio C Hamano <gitster@pobox.com>  2021-10-02 01:06:01 +0300
commit     96e41f58fe1a5aeadf2bf1c1850c53a1c1144bbc
tree       2c032ccef40de9f7d85e6063a36bafaa01c8d5ce
parent     31deb28f5e0c85e8bd556ba135e5f0e0926bad7a
fsck: report invalid object type-path combinations

Improve the error that's emitted in cases where we find a loose object
we parse, but which isn't at the location we expect it to be.

Before this change we'd prefix the error with a not-a-OID derived from
the path at which the object was found, due to an emergent behavior in
how we'd end up with an "OID" in these codepaths.

Now we'll instead say what object we hashed, and what path it was
found at. Before this patch series e.g.:

    $ git hash-object --stdin -w -t blob </dev/null
    e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ mv objects/e6/ objects/e7

Would emit ("[...]" used to abbreviate the OIDs):

    $ git fsck
    error: hash mismatch for ./objects/e7/9d[...] (expected e79d[...])
    error: e79d[...]: object corrupt or missing: ./objects/e7/9d[...]

Now we'll instead emit:

    error: e69d[...]: hash-path mismatch, found at: ./objects/e7/9d[...]

Furthermore, we'll do the right thing when the object type and its
location are bad. I.e. this case:

    $ git hash-object --stdin -w -t garbage --literally </dev/null
    8315a83d2acc4c174aed59430f9a9c4ed926440f
    $ mv objects/83 objects/84

As noted in earlier commits we'd simply die early in those cases,
until preceding commits fixed the hard die on invalid object type:

    $ git fsck
    fatal: invalid object type

Now we'll instead emit sensible error messages:

    $ git fsck
    error: 8315[...]: hash-path mismatch, found at: ./objects/84/15[...]
    error: 8315[...]: object is of unknown type 'garbage': ./objects/84/15[...]

In both fsck.c and object-file.c we're using null_oid as a sentinel
value for checking whether we got far enough to be certain that the
issue was indeed this OID mismatch.

We need to add the "object corrupt or missing" special-case to deal
with cases where read_loose_object() will return an error before
completing check_object_signature(), e.g. if we have an error in
unpack_loose_rest() because we find garbage after the valid gzip
content:

    $ git hash-object --stdin -w -t blob </dev/null
    e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ chmod 755 objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ echo garbage >>objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ git fsck
    error: garbage at end of loose object 'e69d[...]'
    error: unable to unpack contents of ./objects/e6/9d[...]
    error: e69d[...]: object corrupt or missing: ./objects/e6/9d[...]

There is currently some weird messaging in the edge case when the two
are combined, i.e. because we're not explicitly passing along an error
state about this specific scenario from check_stream_oid() via
read_loose_object() we'll end up printing the null OID if an object is
of an unknown type *and* it can't be unpacked by zlib, e.g.:

    $ git hash-object --stdin -w -t garbage --literally </dev/null
    8315a83d2acc4c174aed59430f9a9c4ed926440f
    $ chmod 755 objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    $ echo garbage >>objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    $ /usr/bin/git fsck
    fatal: invalid object type
    $ ~/g/git/git fsck
    error: garbage at end of loose object '8315a83d2acc4c174aed59430f9a9c4ed926440f'
    error: unable to unpack contents of ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    error: 8315a83d2acc4c174aed59430f9a9c4ed926440f: object corrupt or missing: ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    error: 0000000000000000000000000000000000000000: object is of unknown type 'garbage': ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    [...]

I think it's OK to leave that for future improvements, which would
involve enum-ifying more error state as we've done with "enum
unpack_loose_header_result" in preceding commits. In these
increasingly obscure cases the worst that can happen is that we'll get
slightly nonsensical or inapplicable error messages.

There are other such potential edge cases, all of which might produce
some confusing messaging, but still be handled correctly as far as
passing along errors goes. E.g. if check_object_signature() returns
and oideq(real_oid, null_oid()) is true, which could happen if it
returns -1 due to the read_istream() call having failed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
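
The reporting pattern described above (an optional out-parameter for the computed hash, with the null OID as a "did not get far enough" sentinel) can be sketched in isolation. This is only an illustrative model, not git's code: every toy_* name is hypothetical, the 4-byte FNV-1a "hash" stands in for the repository's real hash algorithm, and a NULL buffer stands in for any failure that happens before hashing.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for git's struct object_id: a toy 4-byte "hash". */
#define TOY_RAWSZ 4
struct toy_oid { unsigned char hash[TOY_RAWSZ]; };

/* All-zero sentinel, playing the role of null_oid(). */
static const struct toy_oid toy_null_oid = { { 0 } };

static int toy_oideq(const struct toy_oid *a, const struct toy_oid *b)
{
	return !memcmp(a->hash, b->hash, TOY_RAWSZ);
}

/* Toy 32-bit FNV-1a; an assumption for the sketch, NOT SHA-1 or SHA-256. */
static void toy_hash(const void *buf, size_t len, struct toy_oid *out)
{
	const unsigned char *p = buf;
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	for (i = 0; i < TOY_RAWSZ; i++)
		out->hash[i] = (h >> (8 * i)) & 0xff;
}

/*
 * Same shape as the patched check_object_signature(): the computed hash
 * goes to *real_oidp when the caller wants it, else to a local temporary.
 * Returns 0 on match and -1 on mismatch.
 */
static int toy_check_signature(const struct toy_oid *expected,
			       const void *buf, size_t len,
			       struct toy_oid *real_oidp)
{
	struct toy_oid tmp;
	struct toy_oid *real_oid = real_oidp ? real_oidp : &tmp;

	toy_hash(buf, len, real_oid);
	return toy_oideq(expected, real_oid) ? 0 : -1;
}

enum toy_report {
	TOY_OK,
	TOY_HASH_PATH_MISMATCH,	/* "hash-path mismatch, found at: ..." */
	TOY_CORRUPT_OR_MISSING	/* "object corrupt or missing: ..." */
};

/*
 * Simplified model of the decision in fsck_loose(): *real_oid starts as
 * the null sentinel and is only overwritten once hashing has happened.
 * A NULL buffer models a failure before that point.
 */
static enum toy_report toy_classify(const struct toy_oid *path_oid,
				    const void *contents, size_t len,
				    struct toy_oid *real_oid)
{
	*real_oid = toy_null_oid;
	if (!contents)
		return TOY_CORRUPT_OR_MISSING;
	if (!toy_check_signature(path_oid, contents, len, real_oid))
		return TOY_OK;
	return TOY_HASH_PATH_MISMATCH;
}
```

In this model a caller that gets TOY_HASH_PATH_MISMATCH would print *real_oid (what was hashed) next to the path, while TOY_CORRUPT_OR_MISSING leaves *real_oid as the sentinel, so the caller falls back to the OID derived from the path. Callers that do not need the computed hash pass NULL, as the other call sites in this patch do.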

-rw-r--r--  builtin/fast-export.c   2
-rw-r--r--  builtin/fsck.c         15
-rw-r--r--  builtin/index-pack.c    2
-rw-r--r--  builtin/mktag.c         3
-rw-r--r--  cache.h                 3
-rw-r--r--  object-file.c          21
-rw-r--r--  object-store.h          1
-rw-r--r--  object.c                4
-rw-r--r--  pack-check.c            3
-rwxr-xr-x  t/t1006-cat-file.sh     2
-rwxr-xr-x  t/t1450-fsck.sh         8
11 files changed, 38 insertions, 26 deletions

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 3c20f164f0..48a3b6a7f8 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -312,7 +312,7 @@ static void export_blob(const struct object_id *oid)
if (!buf)
die("could not read blob %s", oid_to_hex(oid));
if (check_object_signature(the_repository, oid, buf, size,
- type_name(type)) < 0)
+ type_name(type), NULL) < 0)
die("oid mismatch in blob %s", oid_to_hex(oid));
object = parse_object_buffer(the_repository, oid, type,
size, buf, &eaten);
diff --git a/builtin/fsck.c b/builtin/fsck.c
index f47b9234ed..1a023914a7 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -607,6 +607,7 @@ static int fsck_loose(const struct object_id *oid, const char *path, void *data)
void *contents;
int eaten;
struct object_info oi = OBJECT_INFO_INIT;
+ struct object_id real_oid = *null_oid();
int err = 0;
strbuf_reset(&cb_data->obj_type);
@@ -614,12 +615,18 @@ static int fsck_loose(const struct object_id *oid, const char *path, void *data)
oi.sizep = &size;
oi.typep = &type;
- if (read_loose_object(path, oid, &contents, &oi) < 0)
- err = error(_("%s: object corrupt or missing: %s"),
- oid_to_hex(oid), path);
+ if (read_loose_object(path, oid, &real_oid, &contents, &oi) < 0) {
+ if (contents && !oideq(&real_oid, oid))
+ err = error(_("%s: hash-path mismatch, found at: %s"),
+ oid_to_hex(&real_oid), path);
+ else
+ err = error(_("%s: object corrupt or missing: %s"),
+ oid_to_hex(oid), path);
+ }
if (type != OBJ_NONE && type < 0)
err = error(_("%s: object is of unknown type '%s': %s"),
- oid_to_hex(oid), cb_data->obj_type.buf, path);
+ oid_to_hex(&real_oid), cb_data->obj_type.buf,
+ path);
if (err < 0) {
errors_found |= ERROR_OBJECT;
return 0; /* keep checking other objects */
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 3fbc5d7077..bf860b6555 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1421,7 +1421,7 @@ static void fix_unresolved_deltas(struct hashfile *f)
if (check_object_signature(the_repository, &d->oid,
data, size,
- type_name(type)))
+ type_name(type), NULL))
die(_("local object %s is corrupt"), oid_to_hex(&d->oid));
/*
diff --git a/builtin/mktag.c b/builtin/mktag.c
index dddcccdd36..3b2dbbb37e 100644
--- a/builtin/mktag.c
+++ b/builtin/mktag.c
@@ -62,7 +62,8 @@ static int verify_object_in_tag(struct object_id *tagged_oid, int *tagged_type)
repl = lookup_replace_object(the_repository, tagged_oid);
ret = check_object_signature(the_repository, repl,
- buffer, size, type_name(*tagged_type));
+ buffer, size, type_name(*tagged_type),
+ NULL);
free(buffer);
return ret;
diff --git a/cache.h b/cache.h
index 1181304f3f..4c0901f6e1 100644
--- a/cache.h
+++ b/cache.h
@@ -1344,7 +1344,8 @@ struct object_info;
int parse_loose_header(const char *hdr, struct object_info *oi);
int check_object_signature(struct repository *r, const struct object_id *oid,
- void *buf, unsigned long size, const char *type);
+ void *buf, unsigned long size, const char *type,
+ struct object_id *real_oidp);
int finalize_object_file(const char *tmpfile, const char *filename);
diff --git a/object-file.c b/object-file.c
index dd80d4b161..4c258703a0 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1039,9 +1039,11 @@ void *xmmap(void *start, size_t length,
* the streaming interface and rehash it to do the same.
*/
int check_object_signature(struct repository *r, const struct object_id *oid,
- void *map, unsigned long size, const char *type)
+ void *map, unsigned long size, const char *type,
+ struct object_id *real_oidp)
{
- struct object_id real_oid;
+ struct object_id tmp;
+ struct object_id *real_oid = real_oidp ? real_oidp : &tmp;
enum object_type obj_type;
struct git_istream *st;
git_hash_ctx c;
@@ -1049,8 +1051,8 @@ int check_object_signature(struct repository *r, const struct object_id *oid,
int hdrlen;
if (map) {
- hash_object_file(r->hash_algo, map, size, type, &real_oid);
- return !oideq(oid, &real_oid) ? -1 : 0;
+ hash_object_file(r->hash_algo, map, size, type, real_oid);
+ return !oideq(oid, real_oid) ? -1 : 0;
}
st = open_istream(r, oid, &obj_type, &size, NULL);
@@ -1075,9 +1077,9 @@ int check_object_signature(struct repository *r, const struct object_id *oid,
break;
r->hash_algo->update_fn(&c, buf, readlen);
}
- r->hash_algo->final_oid_fn(&real_oid, &c);
+ r->hash_algo->final_oid_fn(real_oid, &c);
close_istream(st);
- return !oideq(oid, &real_oid) ? -1 : 0;
+ return !oideq(oid, real_oid) ? -1 : 0;
}
int git_open_cloexec(const char *name, int flags)
@@ -2520,6 +2522,7 @@ static int check_stream_oid(git_zstream *stream,
int read_loose_object(const char *path,
const struct object_id *expected_oid,
+ struct object_id *real_oid,
void **contents,
struct object_info *oi)
{
@@ -2530,8 +2533,6 @@ int read_loose_object(const char *path,
char hdr[MAX_HEADER_LEN];
unsigned long *size = oi->sizep;
- *contents = NULL;
-
map = map_loose_object_1(the_repository, path, NULL, &mapsize);
if (!map) {
error_errno(_("unable to mmap %s"), path);
@@ -2561,9 +2562,7 @@ int read_loose_object(const char *path,
goto out;
}
if (check_object_signature(the_repository, expected_oid,
- *contents, *size, oi->type_name->buf)) {
- error(_("hash mismatch for %s (expected %s)"), path,
- oid_to_hex(expected_oid));
+ *contents, *size, oi->type_name->buf, real_oid)) {
free(*contents);
goto out;
}
diff --git a/object-store.h b/object-store.h
index 3eb597a82a..6b9ffcffb2 100644
--- a/object-store.h
+++ b/object-store.h
@@ -244,6 +244,7 @@ int force_object_loose(const struct object_id *oid, time_t mtime);
*/
int read_loose_object(const char *path,
const struct object_id *expected_oid,
+ struct object_id *real_oid,
void **contents,
struct object_info *oi);
diff --git a/object.c b/object.c
index 14188453c5..5467ead328 100644
--- a/object.c
+++ b/object.c
@@ -261,7 +261,7 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
if ((obj && obj->type == OBJ_BLOB && repo_has_object_file(r, oid)) ||
(!obj && repo_has_object_file(r, oid) &&
oid_object_info(r, oid, NULL) == OBJ_BLOB)) {
- if (check_object_signature(r, repl, NULL, 0, NULL) < 0) {
+ if (check_object_signature(r, repl, NULL, 0, NULL, NULL) < 0) {
error(_("hash mismatch %s"), oid_to_hex(oid));
return NULL;
}
@@ -272,7 +272,7 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
buffer = repo_read_object_file(r, oid, &type, &size);
if (buffer) {
if (check_object_signature(r, repl, buffer, size,
- type_name(type)) < 0) {
+ type_name(type), NULL) < 0) {
free(buffer);
error(_("hash mismatch %s"), oid_to_hex(repl));
return NULL;
diff --git a/pack-check.c b/pack-check.c
index 4b089fe8ec..e6aa4442c9 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -142,7 +142,8 @@ static int verify_packfile(struct repository *r,
err = error("cannot unpack %s from %s at offset %"PRIuMAX"",
oid_to_hex(&oid), p->pack_name,
(uintmax_t)entries[i].offset);
- else if (check_object_signature(r, &oid, data, size, type_name(type)))
+ else if (check_object_signature(r, &oid, data, size,
+ type_name(type), NULL))
err = error("packed %s from %s is corrupt",
oid_to_hex(&oid), p->pack_name);
else if (fn) {
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 4b55adf06a..fe302f2818 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -512,7 +512,7 @@ test_expect_success 'cat-file -t and -s on corrupt loose object' '
# Swap the two to corrupt the repository
mv -f "$other_path" "$empty_path" &&
test_must_fail git fsck 2>err.fsck &&
- grep "hash mismatch" err.fsck &&
+ grep "hash-path mismatch" err.fsck &&
# confirm that cat-file is reading the new swapped-in
# blob...
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index faf0e98847..6337236fd8 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -54,6 +54,7 @@ test_expect_success 'object with hash mismatch' '
cd hash-mismatch &&
oid=$(echo blob | git hash-object -w --stdin) &&
+ oldoid=$oid &&
old=$(test_oid_to_path "$oid") &&
new=$(dirname $old)/$(test_oid ff_2) &&
oid="$(dirname $new)$(basename $new)" &&
@@ -65,7 +66,7 @@ test_expect_success 'object with hash mismatch' '
git update-ref refs/heads/bogus $cmt &&
test_must_fail git fsck 2>out &&
- grep "$oid.*corrupt" out
+ grep "$oldoid: hash-path mismatch, found at: .*$new" out
)
'
@@ -75,6 +76,7 @@ test_expect_success 'object with hash and type mismatch' '
cd hash-type-mismatch &&
oid=$(echo blob | git hash-object -w --stdin -t garbage --literally) &&
+ oldoid=$oid &&
old=$(test_oid_to_path "$oid") &&
new=$(dirname $old)/$(test_oid ff_2) &&
oid="$(dirname $new)$(basename $new)" &&
@@ -87,8 +89,8 @@ test_expect_success 'object with hash and type mismatch' '
test_must_fail git fsck 2>out &&
- grep "^error: hash mismatch for " out &&
- grep "^error: $oid: object is of unknown type '"'"'garbage'"'"'" out
+ grep "^error: $oldoid: hash-path mismatch, found at: .*$new" out &&
+ grep "^error: $oldoid: object is of unknown type '"'"'garbage'"'"'" out
)
'