From 386076ec92c702104cb15bc23e4521dac10c7c2d Mon Sep 17 00:00:00 2001
From: Johannes Sixt
Date: Sun, 24 Oct 2021 11:56:43 +0200
Subject: userdiff-cpp: back out the digit-separators in numbers

The implementation of digit-separating single-quotes introduced a
note-worthy regression: the change of a character literal with a
digit would splice the digit and the closing single-quote. For
example, the change from 'a' to '2' is now tokenized as
'[-a'-]{+2'+} instead of '[-a-]{+2+}'.

The options to fix the regression are:

- Tighten the regular expression such that the single-quote can only
  occur between digits (that would match the official syntax).

- Remove support for digit separators.

I chose to remove support, because

- I have not seen a lot of code make use of digit separators.

- If code does use digit separators, then the numbers are typically
  long. If a change in one of the segments occurs, it is actually
  better visible if only that segment is highlighted as the word
  that changed instead of the whole long number.

This choice does introduce another minor regression, though, which
is highlighted in the test case: when a change occurs in the second
or later segment of a hexadecimal number where the segment begins
with a digit, but also has letters, the segment is mistaken as
consisting of a number and an identifier. I can live with that.

Signed-off-by: Johannes Sixt
Signed-off-by: Junio C Hamano
---
 userdiff.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

(limited to 'userdiff.c')

diff --git a/userdiff.c b/userdiff.c
index 7b143ef36b..8578cb0d12 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -67,11 +67,11 @@ PATTERNS("cpp",
 	 /* identifiers and keywords */
 	 "[a-zA-Z_][a-zA-Z0-9_]*"
 	 /* decimal and octal integers as well as floatingpoint numbers */
-	 "|[0-9][0-9.']*([Ee][-+]?[0-9]+)?[fFlLuU]*"
+	 "|[0-9][0-9.]*([Ee][-+]?[0-9]+)?[fFlLuU]*"
 	 /* hexadecimal and binary integers */
-	 "|0[xXbB][0-9a-fA-F']+[lLuU]*"
+	 "|0[xXbB][0-9a-fA-F]+[lLuU]*"
 	 /* floatingpoint numbers that begin with a decimal point */
-	 "|\\.[0-9][0-9']*([Ee][-+]?[0-9]+)?[fFlL]?"
+	 "|\\.[0-9][0-9]*([Ee][-+]?[0-9]+)?[fFlL]?"
 	 "|[-+*/<>%&^|=!]=|--|\\+\\+|<<=?|>>=?|&&|\\|\\||::|->\\*?|\\.\\*|<=>"),
 PATTERNS("csharp",
 	 /* Keywords */
-- 
cgit v1.2.3