Welcome to mirror list, hosted at ThFree Co, Russian Federation.

git.kernel.org/pub/scm/git/git.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-04-05Merge branch 'pw/wildmatch-fixes'Junio C Hamano
The wildmatch library code unlearns exponential behaviour it acquired some time ago since it was borrowed from rsync. * pw/wildmatch-fixes: t3070: make chain lint tester happy wildmatch: hide internal return values wildmatch: avoid undefined behavior wildmatch: fix exponential behavior
2023-03-20wildmatch: hide internal return valuesPhillip Wood
WM_ABORT_ALL and WM_ABORT_TO_STARSTAR are used internally to limit backtracking when a match fails, they are not of interest to the caller and so should not be public. Suggested-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20wildmatch: avoid undefined behaviorPhillip Wood
The code changed in this commit is designed to check if the pattern starts with "**/" or contains "/**/" (see 3a078dec33 (wildmatch: fix "**" special case, 2013-01-01)). Unfortunately when the pattern begins with "**/" `prev_p = p - 2` is evaluated when `p` points to the second "*" and so the subtraction is undefined according to section 6.5.6 of the C standard because the result does not point within the same object as `p`. Fix this by avoiding the subtraction unless it is well defined. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20wildmatch: fix exponential behaviorPhillip Wood
When dowild() cannot match a '*' or '/**/' wildcard then it must return WM_ABORT_TO_STARSTAR or WM_ABORT_ALL respectively. Failure to observe this results in unnecessary backtracking and the time taken for a failed match increases exponentially with the number of wildcards in the pattern [1]. Unfortunately in some instances dowild() returns WM_NOMATCH for a failed match resulting in long match times for patterns containing multiple wildcards as can be seen in the following benchmark. (Note that the timings in the Benchmark 1 are really measuring the time to execute test-tool rather than the time to match the pattern) Benchmark 1: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a" Time (mean ± σ): 22.8 ms ± 1.7 ms [User: 12.1 ms, System: 10.6 ms] Range (min … max): 19.4 ms … 26.9 ms 113 runs Warning: Ignoring non-zero exit code. Benchmark 2: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a*a*a*a*a*a*a*a*a" Time (mean ± σ): 5.244 s ± 0.228 s [User: 5.229 s, System: 0.010 s] Range (min … max): 4.969 s … 5.707 s 10 runs Warning: Ignoring non-zero exit code. Summary 't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a"' ran 230.37 ± 20.04 times faster than 't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a*a*a*a*a*a*a*a*a"' The security implications are limited as it only affects operations that are potentially DoS vectors. For example by creating a blob containing such a pattern a malicious user can exploit this behavior to use large amounts of CPU time on a remote server by pushing the blob and then creating a new clone with --filter=sparse:oid. However this filter type is usually disabled as it is known to consume large amounts of CPU time even without this bug. The WM_MATCH changed in the first hunk of this patch comes from the original implementation imported from rsync in 5230f605e1 (Import wildmatch from rsync, 2012-10-15). Compared to the others converted here it is fairly harmless as it only triggers at the end of the pattern and so will only cause a single unnecessary backtrack. The others introduced by 6f1a31f0aa (wildmatch: advance faster in <asterisk> + <literal> patterns, 2013-01-01) and 46983441ae (wildmatch: make a special case for "*/" with FNM_PATHNAME, 2013-01-01) are more pernicious and will cause exponential behavior. A new test is added to protect against future regressions. [1] https://research.swtch.com/glob Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24hex.h: move some hex-related declarations from cache.hElijah Newren
hex.c contains code for hex-related functions, but for some reason these functions were declared in the catch-all cache.h. Move the function declarations into a hex.h header instead. This also allows us to remove includes of cache.h from a few C files. For now, we make cache.h include hex.h, so that it is easier to review the direct changes being made by this patch. In the next patch, we will remove that, and add the necessary direct '#include "hex.h"' in the hundreds of C files that need it. Note that reviewing the header changes in this commit might be simplified via git log --no-walk -p --color-moved $COMMIT -- '*.h'` In particular, it highlights the simple movement of code in .h files rather nicely. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-02tree-wide: apply equals-null.cocciJunio C Hamano
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-29wildmatch: change behavior of "foo**bar" in WM_PATHNAME modeNguyễn Thái Ngọc Duy
In WM_PATHNAME mode (or FNM_PATHNAME), '*' does not match '/' and '**' can but only in three patterns: - '**/' matches zero or more leading directories - '/**/' matches zero or more directories in between - '/**' matches zero or more trailing directories/files When '**' is present but not in one of these patterns, the current behavior is consider the pattern invalid and stop matching. In other words, 'foo**bar' never matches anything, whatever you throw at it. This behavior is arguably a bit confusing partly because we can't really tell the user their pattern is invalid so that they can fix it. So instead, tolerate it and make '**' act like two regular '*'s (which is essentially the same as a single asterisk). This behavior seems more predictable. Noticed-by: dana <dana@dana.is> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-24wildmatch: remove unused wildopts parameterÆvar Arnfjörð Bjarmason
Remove the unused wildopts placeholder struct from being passed to all wildmatch() invocations, or rather remove all the boilerplate NULL parameters. This parameter was added back in commit 9b3497cab9 ("wildmatch: rename constants and update prototype", 2013-01-01) as a placeholder for future use. Over 4 years later nothing has made use of it, let's just remove it. It can be added in the future if we find some reason to start using such a parameter. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-06typofix: assorted typofixes in comments, documentation and messagesLi Peng
Many instances of duplicate words (e.g. "the the path") and a few typoes are fixed, originally in multiple patches. wildmatch: fix duplicate words of "the" t: fix duplicate words of "output" transport-helper: fix duplicate words of "read" Git.pm: fix duplicate words of "return" path: fix duplicate words of "look" pack-protocol.txt: fix duplicate words of "the" precompose-utf8: fix typo of "sequences" split-index: fix typo worktree.c: fix typo remote-ext: fix typo utf8: fix duplicate words of "the" git-cvsserver: fix duplicate words Signed-off-by: Li Peng <lip@dtdream.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-03wildmatch: properly fold case everywhereAnthony Ramine
Case folding is not done correctly when matching against the [:upper:] character class and uppercased character ranges (e.g. A-Z). Specifically, an uppercase letter fails to match against any of them when case folding is requested because plain characters in the pattern and the whole string are preemptively lowercased to handle the base case fast. That optimization is kept and ISLOWER() is used in the [:upper:] case when case folding is requested, while matching against a character range is retried with toupper() if the character was lowercase, as the bounds of the range itself cannot be modified (in a case-insensitive context, [A-_] is not equivalent to [a-_]). Signed-off-by: Anthony Ramine <n.oxyde@gmail.com> Reviewed-by: Duy Nguyen <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-02wildmatch: advance faster in <asterisk> + <literal> patternsNguyễn Thái Ngọc Duy
Normally when we match "*X" on "abcX", we call dowild("X", "abcX"), dowild("X", "bcX"), dowild("X", "cX") and dowild("X", "X"). Only the last call may have a chance of matching. By skipping the text before "X", we can eliminate the first three useless calls. compat, '*/*/*' on linux-2.6.git file list 2000 times, before: wildmatch 7s 985049us fnmatch 2s 735541us or 34.26% faster and after: wildmatch 4s 492549us fnmatch 0s 888263us or 19.77% slower Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-02wildmatch: make a special case for "*/" with FNM_PATHNAMENguyễn Thái Ngọc Duy
Normally we need recursion for "*". In this case we know that it matches everything until "/" so we can skip the recursion. glibc, '*/*/*' on linux-2.6.git file list 2000 times before: wildmatch 8s 74513us fnmatch 1s 97042us or 13.59% faster after: wildmatch 3s 521862us fnmatch 3s 488616us or 99.06% slower Same test with compat/fnmatch: wildmatch 8s 110763us fnmatch 2s 980845us or 36.75% faster wildmatch 3s 522156us fnmatch 1s 544487us or 43.85% slower Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-02wildmatch: support "no FNM_PATHNAME" modeNguyễn Thái Ngọc Duy
So far, wildmatch() has always honoured directory boundary and there was no way to turn it off. Make it behave more like fnmatch() by requiring all callers that want the FNM_PATHNAME behaviour to pass that in the equivalent flag WM_PATHNAME. Callers that do not specify WM_PATHNAME will get wildcards like ? and * in their patterns matched against '/', just like not passing FNM_PATHNAME to fnmatch(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-02wildmatch: make dowild() take arbitrary flagsNguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-02wildmatch: rename constants and update prototypeNguyễn Thái Ngọc Duy
- All exported constants now have a prefix WM_ - Do not rely on FNM_* constants, use the WM_ counterparts - Remove TRUE and FALSE to follow Git's coding style - While at it, turn flags type from int to unsigned int - Add an (unused yet) argument to carry extra information so that we don't have to change the prototype again later when we need to pass other stuff to wildmatch Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-02wildmatch: replace variable 'special' with better named onesNguyễn Thái Ngọc Duy
'special' is too generic and is used for two different purposes. Replace it with 'match_slash' to indicate "**" pattern and 'negated' for "[!...]" and "[^...]". Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-02wildmatch: fix "**" special caseNguyễn Thái Ngọc Duy
"**" is adjusted to only be effective when surrounded by slashes, in 40bbee0 (wildmatch: adjust "**" behavior - 2012-10-15). Except that the commit did it wrong: 1. when it checks for "the preceding slash unless ** is at the beginning", it compares to wrong pointer. It should have compared to the beginning of the pattern, not the text. 2. prev_p points to the character before "**", not the first "*". The correct comparison must be "prev_p < pattern" or "prev_p + 1 == pattern", not "prev_p == pattern". 3. The pattern must be surrounded by slashes unless it's at the beginning or the end of the pattern. We do two checks: one for the preceding slash and one the trailing slash. Both checks must be met. The use of "||" is wrong. This patch fixes all above. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-16wildmatch: make /**/ match zero or more directoriesNguyễn Thái Ngọc Duy
"foo/**/bar" matches "foo/x/bar", "foo/x/y/bar"... but not "foo/bar". We make a special case, when foo/**/ is detected (and "foo/" part is already matched), try matching "bar" with the rest of the string. "Match one or more directories" semantics can be easily achieved using "foo/*/**/bar". This also makes "**/foo" match "foo" in addition to "x/foo", "x/y/foo".. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-16wildmatch: adjust "**" behaviorNguyễn Thái Ngọc Duy
Standard wildmatch() sees consecutive asterisks as "*" that can also match slashes. But that may be hard to explain to users as "abc/**/def" can match "abcdef", "abcxyzdef", "abc/def", "abc/x/def", "abc/x/y/def"... This patch changes wildmatch so that users can do - "**/def" -> all paths ending with file/directory 'def' - "abc/**" - equivalent to "/abc/" - "abc/**/def" -> "abc/x/def", "abc/x/y/def"... - otherwise consider the pattern malformed if "**" is found Basically the magic of "**" only remains if it's wrapped around by slashes. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-16wildmatch: fix case-insensitive matchingNguyễn Thái Ngọc Duy
dowild() does case insensitive matching by lower-casing the text. That means lower case letters in patterns imply case-insensitive matching, but upper case means exact matching. We do not want that subtlety. Lower case pattern too so iwildmatch() always does what we expect it to do. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-16wildmatch: remove static variable force_lower_caseNguyễn Thái Ngọc Duy
One place less to worry about thread safety. Also combine wildmatch and iwildmatch into one. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-16wildmatch: make wildmatch's return value compatible with fnmatchNguyễn Thái Ngọc Duy
wildmatch returns non-zero if matched, zero otherwise. This patch makes it return zero if matches, non-zero otherwise, like fnmatch(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-16Integrate wildmatch to gitNguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-16wildmatch: follow Git's coding conventionNguyễn Thái Ngọc Duy
wildmatch's coding style is pretty close to Git's except the use of 4 space indentation instead of 8. This patch should produce empty diff with "git diff -b" Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-16wildmatch: remove unnecessary functionsNguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-16Import wildmatch from rsyncNguyễn Thái Ngọc Duy
These files are from rsync.git commit f92f5b166e3019db42bc7fe1aa2f1a9178cd215d, which was the last commit before rsync turned GPL-3. All files are imported as-is and no-op. Adaptation is done in a separate patch. rsync.git -> git.git lib/wildmatch.[ch] wildmatch.[ch] wildtest.txt t/t3070/wildtest.txt Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>