From fd680bc5586ab7c846e03e181e033dbc36cc7d5d Mon Sep 17 00:00:00 2001
From: Jeff King
Date: Fri, 27 Aug 2021 14:30:15 -0400
Subject: logmsg_reencode(): warn when iconv() fails

If the user asks for a pretty-printed commit to be converted (either
explicitly with --encoding=foo, or implicitly because the commit is
non-utf8 and we want to convert it), we pass it through iconv(). If that
fails, we fall back to showing the input verbatim, but don't tell the
user that the output may be bogus.

Let's add a warning to do so, along with a mention in the documentation
for --encoding. Two things to note about the implementation:

  - we could produce the warning closer to the call to iconv() in
    reencode_string_len(), which would let us relay the value of errno.
    But this is not actually very helpful. reencode_string_len() does
    not know we are operating on a commit, and indeed does not know that
    the caller won't produce an error of its own. And the errno values
    from iconv() are seldom helpful (iconv_open() only ever produces
    EINVAL; perhaps EILSEQ from iconv() might be illuminating, but it
    can also return EINVAL for incomplete sequences).

  - if the reason for the failure is that the output charset is not
    supported, then the user will see this warning for every commit we
    try to display. That might be ugly and overwhelming, but on the
    other hand it is making it clear that every one of them has not been
    converted (and the likely outcome anyway is to re-try the command
    with a supported output encoding).

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano
---
 Documentation/pretty-options.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

(limited to 'Documentation')

diff --git a/Documentation/pretty-options.txt b/Documentation/pretty-options.txt
index 27ddaf84a1..42b227bc40 100644
--- a/Documentation/pretty-options.txt
+++ b/Documentation/pretty-options.txt
@@ -40,7 +40,9 @@ people using 80-column terminals.
 	defaults to UTF-8. Note that if an object claims to be encoded
 	in `X` and we are outputting in `X`, we will output the object
 	verbatim; this means that invalid sequences in the original
-	commit may be copied to the output.
+	commit may be copied to the output. Likewise, if iconv(3) fails
+	to convert the commit, we will output the original object
+	verbatim, along with a warning.
 
 --expand-tabs=::
 --expand-tabs::
-- 
cgit v1.2.3


From 1e93770888d3e71422f9f8defab216f1ebf977c3 Mon Sep 17 00:00:00 2001
From: Jeff King
Date: Fri, 27 Aug 2021 14:32:06 -0400
Subject: docs: use "character encoding" to refer to commit-object encoding

The word "encoding" can mean a lot of things (e.g., base64 or
quoted-printable encoding in emails, HTML entities, URL encoding, and so
on). The documentation for i18n.commitEncoding and
i18n.logOutputEncoding uses the phrase "character encoding" to make this
more clear.

Let's use that phrase in other places to make it clear what kind of
encoding we are talking about. This patch covers the gui.encoding
option, as well as the --encoding option for git-log, etc (in this
latter case, I word-smithed the sentence a little at the same time).
That, coupled with the mention of iconv in the --encoding description,
should make this more clear.

The other spot I looked at is the working-tree-encoding section of
gitattributes(5). But it gives specific examples of encodings that I
think make the meaning pretty clear already.

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano
---
 Documentation/config/gui.txt     | 2 +-
 Documentation/pretty-options.txt | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/config/gui.txt b/Documentation/config/gui.txt
index d30831a130..0c087fd8c9 100644
--- a/Documentation/config/gui.txt
+++ b/Documentation/config/gui.txt
@@ -11,7 +11,7 @@ gui.displayUntracked::
 	in the file list. The default is "true".
 
 gui.encoding::
-	Specifies the default encoding to use for displaying of
+	Specifies the default character encoding to use for displaying of
 	file contents in linkgit:git-gui[1] and linkgit:gitk[1].
 	It can be overridden by setting the 'encoding' attribute
 	for relevant files (see linkgit:gitattributes[5]).
diff --git a/Documentation/pretty-options.txt b/Documentation/pretty-options.txt
index 42b227bc40..b3af850608 100644
--- a/Documentation/pretty-options.txt
+++ b/Documentation/pretty-options.txt
@@ -33,7 +33,7 @@ people using 80-column terminals.
 	used together.
 
 --encoding=::
-	The commit objects record the encoding used for the log message
+	Commit objects record the character encoding used for the log message
 	in their encoding header; this option can be used to tell the
 	command to re-code the commit log message in the encoding
 	preferred by the user.  For non plumbing commands this
-- 
cgit v1.2.3
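
Reader's note on the first patch: the fallback that its commit message and the --encoding text describe (same-encoding passthrough, and verbatim output plus a warning when conversion fails) can be illustrated with a rough Python analogy. This is only a sketch, not git's C implementation: the function name `reencode_or_verbatim`, the warning text, and the use of Python codecs in place of iconv(3) are all assumptions made for illustration.

```python
import warnings


def reencode_or_verbatim(raw: bytes, from_enc: str, to_enc: str) -> bytes:
    """Hypothetical helper: convert raw from from_enc to to_enc.

    On any conversion failure the input is returned unchanged and a
    warning is issued, so the caller still gets something to display.
    """
    # Claimed and requested encodings match: pass the bytes through
    # untouched, so invalid sequences in the input reach the output.
    if from_enc.lower() == to_enc.lower():
        return raw
    try:
        return raw.decode(from_enc).encode(to_enc)
    except (UnicodeError, LookupError):
        # UnicodeError: bytes invalid in from_enc, or not representable
        # in to_enc. LookupError: an encoding name Python does not know.
        # The message wording here is illustrative, not git's.
        warnings.warn(f"unable to reencode message from {from_enc} to {to_enc}")
        return raw


if __name__ == "__main__":
    # Successful conversion from Latin-1 to UTF-8.
    print(reencode_or_verbatim("café".encode("latin-1"), "latin-1", "utf-8"))
    # Unsupported output charset: warns and returns the input verbatim.
    print(reencode_or_verbatim(b"abc", "utf-8", "no-such-charset"))
```

As in the second implementation note of that commit message, an unsupported output charset in this sketch makes every call fail the same way, so a caller looping over many messages would hit the warning path for each one.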