utf8: fix truncated string lengths in `utf8_strnwidth()`

The `utf8_strnwidth()` function accepts an optional string length as an input parameter. This parameter can be set to `-1`, in which case we call `strlen()` on the input, or to a positive integer that indicates a precomputed length, which callers typically compute by calling `strlen()` themselves.

The input parameter is an `int`, whereas `strlen()` returns a `size_t`. Converting a `size_t` that cannot be represented by the `int` is implementation-defined behaviour. In the general case the value wraps around and becomes a negative string size, which the function has no well-defined way to handle.

Fix this by accepting a `size_t` instead of an `int` as string length. While this takes away the ability of callers to simply pass in `-1` as string length, it is trivial to convert them to pass in `strlen()` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
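The hazard described above can be illustrated with a minimal sketch. It is not code from `utf8.c`: the names `length_fits_in_int()`, `checked_int_length()` and `count_bytes()` are hypothetical and exist only for this illustration. The sketch checks whether a `size_t` length survives conversion to `int`, and shows a loop bounded by a `size_t` length, which is the shape the patched function takes.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if a size_t length can be
 * represented by an int without changing its value, else 0.
 */
int length_fits_in_int(size_t len)
{
	return len <= (size_t)INT_MAX;
}

/*
 * Hypothetical checked conversion: an int-based API would need a
 * guard like this to avoid the implementation-defined conversion.
 */
int checked_int_length(size_t len)
{
	assert(length_fits_in_int(len));
	return (int)len;
}

/*
 * Hypothetical stand-in for a loop bounded by a size_t length.
 * With a size_t bound there is no negative length to worry about.
 */
size_t count_bytes(const char *string, size_t len)
{
	const char *end = string + len;
	size_t n = 0;

	while (string < end) {
		string++;
		n++;
	}
	return n;
}
```

A caller that previously passed `-1` would pass `strlen(s)` instead, for example `count_bytes(s, strlen(s))`, so no conversion from `size_t` to `int` happens at the call site.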
author: Patrick Steinhardt <ps@pks.im> 2022-12-01 17:46:53 +0300
committer: Junio C Hamano <gitster@pobox.com> 2022-12-09 08:26:21 +0300
commit: 522cc87fdc25449222a5894a428eebf4b8d5eaa9 (patch)
tree: 05b0e6b3fc523e14d7eb612abe30520681313a26 /utf8.c
parent: 48050c42c73c28b0c001d63d11dffac7e116847b (diff)
1 files changed, 3 insertions, 5 deletions
diff --git a/utf8.c b/utf8.c
index 5b39361ada..504e517c34 100644
--- a/utf8.c
+++ b/utf8.c
@@ -206,13 +206,11 @@ int utf8_width(const char **start, size_t *remainder_p)
  * string, assuming that the string is utf8.  Returns strlen() instead
  * if the string does not look like a valid utf8 string.
  */
-int utf8_strnwidth(const char *string, int len, int skip_ansi)
+int utf8_strnwidth(const char *string, size_t len, int skip_ansi)
 {
 	int width = 0;
 	const char *orig = string;
 
-	if (len == -1)
-		len = strlen(string);
 	while (string && string < orig + len) {
 		int skip;
 		while (skip_ansi &&
@@ -225,7 +223,7 @@ int utf8_strnwidth(const char *string, int len, int skip_ansi)
 
 int utf8_strwidth(const char *string)
 {
-	return utf8_strnwidth(string, -1, 0);
+	return utf8_strnwidth(string, strlen(string), 0);
 }
 
 int is_utf8(const char *text)
@@ -791,7 +789,7 @@ int skip_utf8_bom(char **text, size_t len)
 void strbuf_utf8_align(struct strbuf *buf, align_type position, unsigned int width,
 		       const char *s)
 {
-	int slen = strlen(s);
+	size_t slen = strlen(s);
 	int display_len = utf8_strnwidth(s, slen, 0);
 	int utf8_compensation = slen - display_len;