Welcome to mirror list, hosted at ThFree Co, Russian Federation.

git.kernel.org/pub/scm/git/git.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
path: root/ci
diff options
context:
space:
mode:
authorJunio C Hamano <gitster@pobox.com>2021-12-11 01:35:06 +0300
committerJunio C Hamano <gitster@pobox.com>2021-12-11 01:35:06 +0300
commitbd16b3c39f373944d5ae0c8c9bb93d5d4a5a9524 (patch)
tree0045810462c348517e2885818aa8e7d4d1c1af45 /ci
parent3d2dce168f1c004e746d18d0b2dfc55bc8af455a (diff)
parent0e7696c64db78f698c40686d4869e2a8d0ab2696 (diff)
Merge branch 'js/ci-no-directional-formatting'
CI has been taught to catch some Unicode directional formatting sequence that can be used in certain mischief. * js/ci-no-directional-formatting: ci: disallow directional formatting
Diffstat (limited to 'ci')
-rwxr-xr-xci/check-directional-formatting.bash27
1 files changed, 27 insertions, 0 deletions
diff --git a/ci/check-directional-formatting.bash b/ci/check-directional-formatting.bash
new file mode 100755
index 0000000000..e6211b141a
--- /dev/null
+++ b/ci/check-directional-formatting.bash
@@ -0,0 +1,27 @@
+#!/bin/bash
+
+# This script verifies that the non-binary files tracked in the Git index do
+# not contain any Unicode directional formatting: such formatting could be used
+# to deceive reviewers into interpreting code differently from the compiler.
+# This is intended to run on an Ubuntu agent in a GitHub workflow.
+#
+# To allow translated messages to introduce such directional formatting in the
+# future, we exclude the `.po` files from this validation.
+#
+# Neither GNU grep nor `git grep` (not even with `-P`) handle `\u` as a way to
+# specify UTF-8.
+#
+# To work around that, we use `printf` to produce the pattern as a byte
+# sequence, and then feed that to `git grep` as a byte sequence (setting
+# `LC_CTYPE` to make sure that the arguments are interpreted as intended).
+#
+# Note: we need to use Bash here because its `printf` interprets `\uNNNN` as
+# UTF-8 code points, as desired. Running this script through Ubuntu's `dash`,
+# for example, would use a `printf` that does not understand that syntax.
+
+# U+202a..U+2a2e: LRE, RLE, PDF, LRO and RLO
+# U+2066..U+2069: LRI, RLI, FSI and PDI
+regex='(\u202a|\u202b|\u202c|\u202d|\u202e|\u2066|\u2067|\u2068|\u2069)'
+
+! LC_CTYPE=C git grep -El "$(LC_CTYPE=C.UTF-8 printf "$regex")" \
+ -- ':(exclude,attr:binary)' ':(exclude)*.po'