github.com/stanfordnlp/stanza.git
author: John Bauer <horatio@gmail.com> 2022-10-28 19:14:00 +0300
committer: John Bauer <horatio@gmail.com> 2022-10-28 19:14:00 +0300
commit: b19a106559594f85a8c8519552ec3a1c205ca844
tree: 204fc4aa224d950cc64313846c03d99958f9da7f
parent: ba5c8c7fa3f001a034f7095468a10c0f920c2dac
Accept a single file for wiki processing in selftrain_wiki.py
-rw-r--r-- stanza/utils/datasets/constituency/selftrain_wiki.py 3
1 files changed, 3 insertions, 0 deletions
diff --git a/stanza/utils/datasets/constituency/selftrain_wiki.py b/stanza/utils/datasets/constituency/selftrain_wiki.py
index 6c2604fe..01692da4 100644
--- a/stanza/utils/datasets/constituency/selftrain_wiki.py
+++ b/stanza/utils/datasets/constituency/selftrain_wiki.py
@@ -50,6 +50,9 @@ def list_wikipedia_files(input_dir):
     Recursively traverse the directory, then sort
     """
+    if not os.path.isdir(input_dir) and os.path.split(input_dir)[1].startswith("wiki_"):
+        return [input_dir]
+
     wiki_files = []
     recursive_files = deque()
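
The effect of the new early return can be illustrated with a minimal, self-contained sketch. Only the two added lines and the two initializations after them come from the diff above. The traversal loop below is an assumption reconstructed from the docstring ("Recursively traverse the directory, then sort"), not the actual body of the function in stanza.

```python
import os
from collections import deque


def list_wikipedia_files(input_dir):
    """Return the wiki_* files under input_dir, sorted.

    If input_dir is itself a wiki_* file rather than a directory,
    return it as a one-element list.
    """
    # From the commit: a non-directory path whose basename starts
    # with "wiki_" is accepted as a single input file.
    if not os.path.isdir(input_dir) and os.path.split(input_dir)[1].startswith("wiki_"):
        return [input_dir]

    # ASSUMPTION: the rest of this function is not shown in the diff.
    # This is a plausible breadth-first traversal for illustration only.
    wiki_files = []
    recursive_files = deque([input_dir])
    while recursive_files:
        path = recursive_files.popleft()
        if os.path.isdir(path):
            # Queue every entry of the directory for inspection.
            for name in os.listdir(path):
                recursive_files.append(os.path.join(path, name))
        elif os.path.split(path)[1].startswith("wiki_"):
            wiki_files.append(path)
    return sorted(wiki_files)
```

Note that the shortcut checks only the basename prefix and that the path is not a directory. It does not check that the file exists, so a mistyped wiki_* path would be returned unchanged and fail later when it is opened.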