diff options
author | Wilco Dijkstra <Wilco.Dijkstra@arm.com> | 2018-12-31 21:01:52 +0300 |
---|---|---|
committer | Eric Blake <eblake@redhat.com> | 2019-01-01 18:44:59 +0300 |
commit | 353ebae30465606bd8b15bea1194f7a2881903fb (patch) | |
tree | 7ecab382557ac2042d03a91fe3e1ead3b51e94bd /newlib/libc/string/str-two-way.h | |
parent | 572687310059534b2da9428ca19df992509c8a5d (diff) |
Improve performance of memmem
This patch significantly improves performance of memmem using a novel
modified Horspool algorithm. Needles up to size 256 use a bad-character
table indexed by hashed pairs of characters to quickly skip past mismatches.
Long needles use a self-adapting filtering step to avoid comparing the whole
needle repeatedly.
By limiting the needle length to 256, the shift table only requires 8 bits
per entry, lowering preprocessing overhead and minimizing cache effects.
This limit also implies worst-case performance is linear.
Small needles up to size 2 use a dedicated linear search. Very long needles
use the Two-Way algorithm (to avoid increasing stack size inlining is now disabled).
The performance gain is 6.6 times on English text on AArch64 using random
needles with average size 8 (this is even faster than the recently improved strstr
algorithm, so I'll update that in the near future).
The size-optimized memmem has also been rewritten from scratch to get a
2.7x performance gain.
Tested against GLIBC testsuite and randomized tests.
Message-Id: <DB5PR08MB1030649D051FA8532A4512C883B20@DB5PR08MB1030.eurprd08.prod.outlook.com>
Diffstat (limited to 'newlib/libc/string/str-two-way.h')
-rw-r--r-- | newlib/libc/string/str-two-way.h | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/newlib/libc/string/str-two-way.h b/newlib/libc/string/str-two-way.h index 416f9d29f..90345a8de 100644 --- a/newlib/libc/string/str-two-way.h +++ b/newlib/libc/string/str-two-way.h @@ -31,6 +31,7 @@ #include <limits.h> #include <stdint.h> +#include <_ansi.h> /* We use the Two-Way string matching algorithm, which guarantees linear complexity with constant space. Additionally, for long @@ -288,7 +289,7 @@ two_way_short_needle (const unsigned char *haystack, size_t haystack_len, If AVAILABLE modifies HAYSTACK_LEN (as in strstr), then at most 3 * HAYSTACK_LEN - NEEDLE_LEN comparisons occur in searching, and sublinear performance is not possible. */ -static RETURN_TYPE +_NOINLINE_STATIC RETURN_TYPE two_way_long_needle (const unsigned char *haystack, size_t haystack_len, const unsigned char *needle, size_t needle_len) { |