From a43739031092f951674d783ad3bdcbd603281851 Mon Sep 17 00:00:00 2001
From: Atharva Raykar
Date: Thu, 8 Apr 2021 14:44:43 +0530
Subject: userdiff: add support for Scheme

Add a diff driver for Scheme-like languages which recognizes top level
and local `define` forms, whether it is a function definition, binding,
syntax definition or a user-defined `define-xyzzy` form. Also supports
R6RS `library` forms, `module` forms along with class and struct
declarations used in Racket (PLT Scheme).

Alternate "def" syntax such as those in Gerbil Scheme are also
supported, like defstruct, defsyntax and so on.

The rationale for picking `define` forms for the hunk headers is because
it is usually the only significant form for defining the structure of
the program, and it is a common pattern for schemers to have local
function definitions to hide their visibility, so it is not only the top
level `define`'s that are of interest. Schemers also extend the language
with macros to provide their own define forms (for example, something
like a `define-test-suite`) which is also captured in the hunk header.

Since it is common practice to extend syntax with variants of a form
like `module+`, `class*` etc, those have been supported as well.

The word regex is a best-effort attempt to conform to R7RS[1] valid
identifiers, symbols and numbers.

[1] https://small.r7rs.org/attachment/r7rs.pdf (section 2.1)

Signed-off-by: Atharva Raykar
Signed-off-by: Junio C Hamano
---
 userdiff.c | 9 +++++++++
 1 file changed, 9 insertions(+)

(limited to 'userdiff.c')

diff --git a/userdiff.c b/userdiff.c
index 3f81a2261c..3897317aff 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -191,6 +191,15 @@ PATTERNS("rust",
 	 "[a-zA-Z_][a-zA-Z0-9_]*"
 	 "|[0-9][0-9_a-fA-Fiosuxz]*(\\.([0-9]*[eE][+-]?)?[0-9_fF]*)?"
 	 "|[-+*\\/<>%&^|=!:]=|<<=?|>>=?|&&|\\|\\||->|=>|\\.{2}=|\\.{3}|::"),
+PATTERNS("scheme",
+	 "^[\t ]*(\\(((define|def(struct|syntax|class|method|rules|record|proto|alias)?)[-*/ \t]|(library|module|struct|class)[*+ \t]).*)$",
+	 /*
+	  * R7RS valid identifiers include any sequence enclosed
+	  * within vertical lines having no backslashes
+	  */
+	 "\\|([^\\\\]*)\\|"
+	 /* All other words should be delimited by spaces or parentheses */
+	 "|([^][)(}{[ \t])+"),
 PATTERNS("bibtex", "(@[a-zA-Z]{1,}[ \t]*\\{{0,1}[ \t]*[^ \t\"@',\\#}{~%]*).*$",
 	 "[={}\"]|[^={}\" \t]+"),
 PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$",
-- 
cgit v1.2.3