github.com/stanfordnlp/stanza.git
Age | Commit message | Author
2021-07-28 | i [hi-ner-cleaned] | anwesham-lab
2021-07-28 | Update .gitignore | anwesham-lab
2021-07-27 | Add vlsp pos dataset option for VLSP WS task (#772) | vythaihn
2021-07-25 | Add some explanation to the logging output for the NER scores | John Bauer
2021-07-25 | Add processing for it_fbk. Uses the .tsv file they sent us and their recomme... | John Bauer
2021-07-25 | Add the ability for the ner model to upscale basic (no B- or I-) tagging -> B... | John Bauer
2021-07-25 | Add a processing step for NHCLT datasets. Currently Afrikaans is the most us... | John Bauer
2021-07-25 | Make the matrix more readable when there are a ton of categories | John Bauer
2021-07-25 | Format ints differently from floats in the confusion matrix | John Bauer
2021-07-25 | Add a confusion matrix over tokens to the output of the ner_tagger | John Bauer
2021-07-25 | Add a flag for finetuning from a different load name from the save_name | John Bauer
2021-07-25 | If given an empty list, simply return an empty list when sort is called. Fix... | John Bauer
2021-07-24 | Merge pull request #766 from stanfordnlp/thai_lst20_redo | John Bauer
2021-07-23 | Add a test of empty text for the pipeline | John Bauer
2021-07-23 | Add indentation to the json rather than saving it in one large dump | John Bauer
2021-07-23 | Fix command line for hindi datasets | John Bauer
2021-07-23 | Process gz files as well as .txt and .txt.xz in the charlm | John Bauer
2021-07-22 | Adjust orchid preparation script to always include spaces after sentences | John Bauer
2021-07-22 | Add a test which checks that the orchid results are consistent | John Bauer
2021-07-22 | Add a longer test for a couple different variations on processing text | John Bauer
2021-07-22 | Add an option to split clauses into sentences if a space is between clauses | John Bauer
2021-07-22 | Add more notes on how the tokenization boundaries are determined | John Bauer
2021-07-22 | Add an option to add spaces after the sentence ends (which is actually more c... | John Bauer
2021-07-22 | Add a lot of notes on how the characters are expected to line up in the test | John Bauer
2021-07-21 | Attempt to add a helpful error explaining where it looked for LST20 | John Bauer
2021-07-21 | Add a tiny test for part of the LST20 preparation | John Bauer
2021-07-21 | Make the retokenization an option for the lst20 dataset | John Bauer
2021-07-21 | Use pythainlp to resplit lst20 sentences as well | John Bauer
2021-07-21 | Refactor some of the processing code which uses pythainlp | John Bauer
2021-07-21 | Revert "Adjust the newpar title" | John Bauer
2021-07-20 | Standardize the final short_name of the hindi ner dataset regardless of which... | John Bauer
2021-07-20 | Add a few extra cases to treebank_to_short_name so that calling on an already... | John Bauer
2021-07-19 | Add some more command lines to the prepare_ner_dataset.py doc | John Bauer
2021-07-19 | Improve prepare_ner_dataset doc | John Bauer
2021-07-19 | Merge pull request #765 from stanfordnlp/thai | John Bauer
2021-07-19 | Refactor some to make it easier to test the lst20 script | John Bauer
2021-07-19 | Don't make new text files for datasets which already produced text files | John Bauer
2021-07-19 | Integrate lst20 into the tokenization script | John Bauer
2021-07-19 | Move process_lst20 to tokenization | John Bauer
2021-07-19 | Add a script which converts the LST20 dataset for tokenization | John Bauer
2021-07-19 | Test updates based on changes to the underlying data, which changed the resul... | John Bauer
2021-07-19 | Test updates based on changes to the underlying data, which changed the resul... | John Bauer
2021-07-17 | Merge pull request #764 from stanfordnlp/move_orchid | John Bauer
2021-07-17 | Problem with space separation | John Bauer
2021-07-17 | Add NewPar to new paragraphs. | John Bauer
2021-07-17 | Connect BEST to the conversion script | John Bauer
2021-07-17 | Add th_orchid to the prepare_tokenizer_treebank script | John Bauer
2021-07-17 | Move thai orchid & best tokenization to the tokenization specific directory | John Bauer
2021-07-16 | Merge pull request #763 from stanfordnlp/vlsp_tokenizer | John Bauer
2021-07-16 | Adjust the newpar title | John Bauer