github.com/stanfordnlp/stanza.git

Age	Commit message	Author
2021-08-23	Update trainer.py (branch: wordinput-sentsegmenter)	Gordon
2021-08-23	Update utils.py	Gordon
2021-08-23	Update vocab.py	Gordon
2021-08-23	Update model.py	Gordon
2021-08-23	Update data.py	Gordon
2021-08-12	Fix some whitespace	John Bauer
2021-08-10	includes TEST_100K as test set for BEST eval	Gordon
2021-08-10	includes TEST_100K as test set for BEST eval	Gordon
2021-08-10	Updated BEST to include TEST_100K	Gordon
2021-08-10	Open/close files in a context to guarantee handles are closed	John Bauer
2021-08-10	Dictionary redo (#776)	vythaihn
2021-08-04	Create thai_syllable_dict_generator.py	Gordon
2021-08-03	Also check if the test set has tags not present in the tagger or if the train...	John Bauer
2021-08-03	Add a test to see if any tags are in the dev set but not the train set	John Bauer
2021-08-03	Change file not found to an error	John Bauer
2021-08-02	Add two new NER models to the resources	John Bauer
2021-07-31	This test was backwards, causing a bunch of stray java processes when a conte...	John Bauer
2021-07-31	skip langid tests until resources set up	J38
2021-07-29	Update word embedding to the dimension in the file when creating a new model	John Bauer
2021-07-28	Double check that the length of the processors list is 2 when adding just tok...	John Bauer
2021-07-28	Add mwt if tokenize is passed without MWT (#777)	David Riff
2021-07-27	Add vlsp pos dataset option for VLSP WS task (#772)	vythaihn
2021-07-25	Add some explanation to the logging output for the NER scores	John Bauer
2021-07-25	Add processing for it_fbk. Uses the .tsv file they sent us and their recomme...	John Bauer
2021-07-25	Add the ability for the ner model to upscale basic (no B- or I-) tagging -> B...	John Bauer
2021-07-25	Add a processing step for NHCLT datasets. Currently Afrikaans is the most us...	John Bauer
2021-07-25	Make the matrix more readable when there are a ton of categories	John Bauer
2021-07-25	Format ints differently from floats in the confusion matrix	John Bauer
2021-07-25	Add a confusion matrix over tokens to the output of the ner_tagger	John Bauer
2021-07-25	Add a flag for finetuning from a different load name from the save_name	John Bauer
2021-07-25	If given an empty list, simply return an empty list when sort is called. Fix...	John Bauer
2021-07-24	Merge pull request #766 from stanfordnlp/thai_lst20_redo	John Bauer
2021-07-23	Add a test of empty text for the pipeline	John Bauer
2021-07-23	Add indentation to the json rather than saving it in one large dump	John Bauer
2021-07-23	Fix command line for hindi datasets	John Bauer
2021-07-23	Process gz files as well as .txt and .txt.xz in the charlm	John Bauer
2021-07-22	Adjust orchid preparation script to always include spaces after sentences	John Bauer
2021-07-22	Add a test which checks that the orchid results are consistent	John Bauer
2021-07-22	Add a longer test for a couple different variations on processing text	John Bauer
2021-07-22	Add an option to split clauses into sentences if a space is between clauses	John Bauer
2021-07-22	Add more notes on how the tokenization boundaries are determined	John Bauer
2021-07-22	Add an option to add spaces after the sentence ends (which is actually more c...	John Bauer
2021-07-22	Add a lot of notes on how the characters are expected to line up in the test	John Bauer
2021-07-21	Attempt to add a helpful error explaining where it looked for LST20	John Bauer
2021-07-21	Add a tiny test for part of the LST20 preparation	John Bauer
2021-07-21	Make the retokenization an option for the lst20 dataset	John Bauer
2021-07-21	Use pythainlp to resplit lst20 sentences as well	John Bauer
2021-07-21	Refactor some of the processing code which uses pythainlp	John Bauer
2021-07-21	Revert "Adjust the newpar title"	John Bauer
2021-07-20	Standardize the final short_name of the hindi ner dataset regardless of which...	John Bauer