author | alvations <alvations@gmail.com> | 2019-10-01 00:27:06 +0300 |
---|---|---|
committer | GitHub <noreply@github.com> | 2019-10-01 00:27:06 +0300 |
commit | 555829a771cd897bb807f495a95737953a7ca9a3 (patch) | |
tree | 0b211aeb4ab0b0da765150e53955ea17781b6ffd | |
parent | 01a8ec41e835b5e9b1b7f7b82a8d49769a354d6d (diff) |
Undoing 05788925812f0d3265e355565cbb1701a0ad7510
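The undone commit prevents a sentence break after an upper-case abbreviation that is followed by a full stop, such as between "IBM." and "IBM is" in the text below. E.g.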
> The restructuring of IBM was essential to enable it organisationally to take up the responsibilities entrusted in the role with the recent changes in the policy and legislations, revised charter of function of IBM and the new activities and initiatives undertaken by IBM. IBM is also engaged in handholding the States for auction of mineral blocks for greater transparency in allocation of mineral concessions.
-rwxr-xr-x | scripts/ems/support/split-sentences.perl | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/scripts/ems/support/split-sentences.perl b/scripts/ems/support/split-sentences.perl
index 2c2319a12..f3494bc88 100755
--- a/scripts/ems/support/split-sentences.perl
+++ b/scripts/ems/support/split-sentences.perl
@@ -193,7 +193,7 @@ sub preprocess {
       my $starting_punct = $2;
       if ($prefix && $NONBREAKING_PREFIX{$prefix} && $NONBREAKING_PREFIX{$prefix} == 1 && !$starting_punct) {
         # Not breaking;
-      } elsif ($words[$i] =~ /(\.?)[\p{IsUpper}\-]+(\.+)$/) {
+      } elsif ($words[$i] =~ /(\.)[\p{IsUpper}\-]+(\.+)$/) {
         # Not breaking - upper case acronym
       } elsif($words[$i+1] =~ /^([ ]*[\'\"\(\[\¿\¡\p{IsPi}]*[ ]*[\p{IsUpper}0-9])/) {
         # The next word has a bunch of initial quotes, maybe a
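To see why the single `?` matters, the two acronym patterns from the diff can be compared on a few tokens. The following is only a rough Python sketch, not the script itself: it assumes ASCII input and substitutes `[A-Z]` for Perl's `\p{IsUpper}` (Python's built-in `re` has no Unicode property classes), and the names `OPTIONAL_DOT`, `REQUIRED_DOT` and `looks_like_acronym` are made up for illustration.

```python
import re

# ASCII-only approximations of the two Perl patterns in the diff.
# Assumption: [A-Z] stands in for \p{IsUpper}.
OPTIONAL_DOT = re.compile(r"(\.?)[A-Z\-]+(\.+)$")  # the pattern being undone
REQUIRED_DOT = re.compile(r"(\.)[A-Z\-]+(\.+)$")   # the pattern this commit restores


def looks_like_acronym(pattern, word):
    """Return True if `word` matches the 'upper case acronym' pattern,
    i.e. the token after which the branch in the diff would not break."""
    # Perl's `=~` is an unanchored match, so use search() rather than match().
    return pattern.search(word) is not None


if __name__ == "__main__":
    for word in ["IBM.", "U.S.", "I.B.M.", "etc."]:
        print(
            word,
            "optional dot:", looks_like_acronym(OPTIONAL_DOT, word),
            "required dot:", looks_like_acronym(REQUIRED_DOT, word),
        )
```

Under these assumptions, `IBM.` matches only the optional-dot pattern, while `U.S.` and `I.B.M.` match both. That is consistent with the commit message: with the leading dot optional, any all-caps word before a full stop is taken for an acronym, so the break between "IBM." and "IBM is" is suppressed.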