github.com/stanfordnlp/stanza.git
Age | Commit message | Author
2022-09-06 | Sort subfolders so that results are reproducible | John Bauer
2022-09-06 | Import either lxml or ElementTree. ElementTree is slower, but doesn't require an additional module dependency | John Bauer
2022-09-06 | NAMESPACES -> NAMESPACE, replace all xpath with findall | John Bauer
2022-09-06 | Mostly pl_ner conversion test - tests the conversion of the XML so far | John Bauer
2022-09-06 | Separate out a smaller piece of the extraction function in convert_nkjp | John Bauer
2022-09-04 | Replace click with argparse in the Polish NER, rather than adding a new library dependency | John Bauer
2022-09-04 | Remove global variable usage by passing it around everywhere instead | John Bauer
2022-09-04 | NER Polish (#1110) | Karol Saputa
    * Add NER dataset for Polish
    Co-authored-by: ryszardtuora <ryszardtuora@gmail.com>
    Co-authored-by: Karol Saputa <ksaputa@gputrain.dariah.ipipan.waw.pl>
    This PR adds Polish NER dataset #1070
2022-09-02 | Remove some unnecessary list creation. Rather than using shutil, read then write sentences so that we can later manipulate the sentences as needed in write_sentences | John Bauer
2022-09-02 | Add some more notes on bilstm size experiments in the classifier | John Bauer
2022-09-01 | Update compose_ete_results.py to allow multiple input files | John Bauer
2022-09-01 | no enhanced dependencies | John Bauer
2022-09-01 | Integrate the newer eval.py from udtools in place of the previously existing conll18 version | John Bauer
2022-09-01 | More informative errors if the data can't be found | John Bauer
2022-09-01 | Update Hebrew default to a combined model | John Bauer
2022-09-01 | Add the capacity to build he_combined models from UD_Hebrew-IAHLTwiki and a fork of HTB. Addresses #1109 [ref: hebrew_combined] | John Bauer
2022-09-01 | Allow for a list of NER models in the processors argument, similar to the list of NER models in the package argument when creating a pipeline. From @mpenalver in #928 | John Bauer
2022-09-01 | Output the class chosen if choosing an xpos factory from scratch | John Bauer
2022-09-01 | A couple more experiment notes | John Bauer
2022-08-31 | Don't save optimizers for the non-checkpoints (and fix a save bug for the end of epoch save) | John Bauer
2022-08-31 | Make saved models smaller in the classifier test. Will hopefully save disk space and time | John Bauer
2022-08-31 | notes on the madgrad LR experiments | John Bauer
2022-08-31 | Update a couple defaults based on recent experiments | John Bauer
2022-08-31 | Save the best score when training a model so that future training from a checkpoint knows when to save a better model | John Bauer
2022-08-31 | Update to 0.0005 - less likely to go completely bad | John Bauer
2022-08-31 | Oops, correct a few uses of model in the classifier main program | John Bauer
2022-08-31 | Save checkpoints with epochs_trained+1 at the end of an epoch (otherwise the epoch will not be incremented properly when reloading) | John Bauer
2022-08-31 | Add a checkpoint mechanism to sentiment | John Bauer
    pass checkpoint_file to train_model in the unittest, but TODO: need to add tests for checkpointing
2022-08-31 | Simplify the load mechanism in classifier Trainer so that the load() call loads the pretrain, charlm, etc | John Bauer
2022-08-31 | Refactor a Trainer object out of the classifier.py main program. In addition to the model, this saves and loads the optimizer and the number of epochs trained. Purpose: to make it so that it is easy to checkpoint model training the same way the charlm is checkpointed [ref: sentiment_trainer] | John Bauer
2022-08-31 | Refactor a bunch of data manipulation methods to data.py | John Bauer
2022-08-31 | This should be a RuntimeError, not a generic Exception | John Bauer
2022-08-31 | Refactor building an optimizer | John Bauer
2022-08-31 | Move the loss function to the device w/o a call to cuda() | John Bauer
2022-08-31 | Rearrange all classifier args to be part of classifier.py | John Bauer
2022-08-31 | Refactor building the argparse for classifier | John Bauer
2022-08-31 | recommend madgrad for default optimizer | John Bauer
2022-08-30 | Add support for elmoformanylangs to sentiment | John Bauer
    Includes a matrix trained to connect the 3 layers of elmo instead of using the default averaging
    Also, a projection from elmo dim to a lower dimension (although this was less useful)
    Add a comment on how the sentiment processor doesn't load Elmo. Actually, in general this integration is unlikely to be used for much, but there's also no specific reason to throw this code away.
2022-08-30 | More details log lines when saving a new best model in the classifier | John Bauer
2022-08-30 | Add madgrad as an option, although none of the parameters have been CVed for sentiment yet | John Bauer
2022-08-27 | move the HiNER dataset to under hindi/ | John Bauer
2022-08-27 | log weighted F1, specifically for comparison to the HiNER dataset | John Bauer
2022-08-27 | Add a weighted f1 to the confusion matrix calculations | John Bauer
2022-08-27 | simple test of macro f1 result in confusion matrix | John Bauer
2022-08-27 | Flesh out the errors a bit & add a line for '...' when we integrate the latest version | John Bauer
2022-08-26 | Can bring the it_vit test back to life now that corenlp 4.5.0 is out | John Bauer
2022-08-26 | Grad clipping in the constituency parser. Not a clear benefit yet, so just leaving this as an option. | John Bauer
2022-08-26 | Restart training with the pieces inside a constituency model rather than using the dataset. Works better for smaller items... need to fix larger dataset as well | John Bauer
2022-08-26 | Fix run_ner.py after the earlier refactoring | John Bauer
2022-08-25 | Oops, need to increment this line_idx as well | John Bauer