github.com/stanfordnlp/stanza.git

Age	Commit message	Author
2022-09-06	Sort subfolders so that results are reproducible	John Bauer

2022-09-06	Import either lxml or ElementTree. ElementTree is slower, but doesn't require an additional module dependency	John Bauer
2022-09-06	NAMESPACES -> NAMESPACE, replace all xpath with findall	John Bauer

2022-09-06	Mostly pl_ner conversion test - tests the conversion of the XML so far	John Bauer

2022-09-06	Separate out a smaller piece of the extraction function in convert_nkjp	John Bauer

2022-09-04	Replace click with argparse in the Polish NER, rather than adding a new library dependency	John Bauer
2022-09-04	Remove global variable usage by passing it around everywhere instead	John Bauer

2022-09-04	NER Polish (#1110)	Karol Saputa
	* Add NER dataset for Polish Co-authored-by: ryszardtuora <ryszardtuora@gmail.com> Co-authored-by: Karol Saputa <ksaputa@gputrain.dariah.ipipan.waw.pl> This PR adds Polish NER dataset #1070
2022-09-02	Remove some unnecessary list creation. Rather than using shutil, read then write sentences so that we can later manipulate the sentences as needed in write_sentences	John Bauer
2022-09-02	Add some more notes on bilstm size experiments in the classifier	John Bauer

2022-09-01	Update compose_ete_results.py to allow multiple input files	John Bauer

2022-09-01	no enhanced dependencies	John Bauer

2022-09-01	Integrate the newer eval.py from udtools in place of the previously existing conll18 version	John Bauer
2022-09-01	More informative errors if the data can't be found	John Bauer

2022-09-01	Update Hebrew default to a combined model	John Bauer

2022-09-01	Add the capacity to build he_combined models from UD_Hebrew-IAHLTwiki and a fork of HTB. Addresses #1109 [hebrew_combined]	John Bauer
2022-09-01	Allow for a list of NER models in the processors argument, similar to the list of NER models in the package argument when creating a pipeline. From @mpenalver in #928	John Bauer
2022-09-01	Output the class chosen if choosing an xpos factory from scratch	John Bauer

2022-09-01	A couple more experiment notes	John Bauer

2022-08-31	Don't save optimizers for the non-checkpoints (and fix a save bug for the end of epoch save)	John Bauer
2022-08-31	Make saved models smaller in the classifier test. Will hopefully save disk space and time	John Bauer
2022-08-31	notes on the madgrad LR experiments	John Bauer

2022-08-31	Update a couple defaults based on recent experiments	John Bauer

2022-08-31	Save the best score when training a model so that future training from a checkpoint knows when to save a better model	John Bauer
2022-08-31	Update to 0.0005 - less likely to go completely bad	John Bauer

2022-08-31	Oops, correct a few uses of model in the classifier main program	John Bauer

2022-08-31	Save checkpoints with epochs_trained+1 at the end of an epoch (otherwise the epoch will not be incremented properly when reloading)	John Bauer
2022-08-31	Add a checkpoint mechanism to sentiment	John Bauer
	pass checkpoint_file to train_model in the unittest, but TODO: need to add tests for checkpointing
2022-08-31	Simplify the load mechanism in classifier Trainer so that the load() call loads the pretrain, charlm, etc	John Bauer
2022-08-31	Refactor a Trainer object out of the classifier.py main program. In addition to the model, this saves and loads the optimizer and the number of epochs trained. Purpose: to make it so that it is easy to checkpoint model training the same way the charlm is checkpointed [sentiment_trainer]	John Bauer
2022-08-31	Refactor a bunch of data manipulation methods to data.py	John Bauer

2022-08-31	This should be a RuntimeError, not a generic Exception	John Bauer

2022-08-31	Refactor building an optimizer	John Bauer

2022-08-31	Move the loss function to the device w/o a call to cuda()	John Bauer

2022-08-31	Rearrange all classifier args to be part of classifier.py	John Bauer

2022-08-31	Refactor building the argparse for classifier	John Bauer

2022-08-31	recommend madgrad for default optimizer	John Bauer

2022-08-30	Add support for elmoformanylangs to sentiment	John Bauer
	Includes a matrix trained to connect the 3 layers of elmo instead of using the default averaging. Also, a projection from elmo dim to a lower dimension (although this was less useful). Add a comment on how the sentiment processor doesn't load Elmo. Actually, in general this integration is unlikely to be used for much, but there's also no specific reason to throw this code away.
2022-08-30	More detailed log lines when saving a new best model in the classifier	John Bauer

2022-08-30	Add madgrad as an option, although none of the parameters have been CVed for sentiment yet	John Bauer
2022-08-27	move the HiNER dataset to under hindi/	John Bauer

2022-08-27	log weighted F1, specifically for comparison to the HiNER dataset	John Bauer

2022-08-27	Add a weighted f1 to the confusion matrix calculations	John Bauer

2022-08-27	simple test of macro f1 result in confusion matrix	John Bauer

2022-08-27	Flesh out the errors a bit & add a line for '...' when we integrate the latest version	John Bauer
2022-08-26	Can bring the it_vit test back to life now that corenlp 4.5.0 is out	John Bauer

2022-08-26	Grad clipping in the constituency parser. Not a clear benefit yet, so just leaving this as an option.	John Bauer
2022-08-26	Restart training with the pieces inside a constituency model rather than using the dataset. Works better for smaller items... need to fix larger dataset as well	John Bauer
2022-08-26	Fix run_ner.py after the earlier refactoring	John Bauer

2022-08-25	Oops, need to increment this line_idx as well	John Bauer