Commit messages
unnecessarily long items get dinged
very small parameters to make it time & memory efficient
check that loaded model params are the same, and that results are the same
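The save/load round-trip check described above could be sketched like this. Everything here is a hypothetical illustration (the `params_match` helper and the dict-of-floats representation are assumptions, not the repo's actual test code):

```python
def params_match(params_a, params_b, tol=1e-6):
    """Check that two parameter dicts (name -> list of floats) have the
    same keys and numerically equal values.

    A round-trip test can save a model, load it back, and assert that
    the reloaded parameters match the originals before also comparing
    model outputs on a small input.
    """
    if params_a.keys() != params_b.keys():
        return False
    return all(
        len(params_a[k]) == len(params_b[k])
        and all(abs(x - y) <= tol for x, y in zip(params_a[k], params_b[k]))
        for k in params_a
    )
```

Using deliberately tiny parameters, as the commit notes, keeps such a test fast and memory-light.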
Split up large batches of train data to hopefully avoid OOM errors
Move criterion to the model - will allow for using it for scoring
Add a score method using the criterion
Add a load method
score a single sentence as if it were a list of sentences
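The batch-splitting idea above can be sketched as follows. The `split_batch` name and signature are assumptions for illustration, not the repo's actual code:

```python
def split_batch(items, max_batch_size):
    """Yield successive chunks of at most max_batch_size items.

    Splitting an oversized batch into smaller pieces before running the
    model is a common way to avoid out-of-memory errors, at the cost of
    a few extra forward passes.
    """
    for start in range(0, len(items), max_batch_size):
        yield items[start:start + max_batch_size]

# Example: a "batch" of 10 sentences processed at most 4 at a time
chunks = list(split_batch(list(range(10)), 4))
```

Scoring a single sentence can then reuse the same path by wrapping it in a one-element list.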
sort by length
Batchify the sentences without flattening them
(the ppl is no longer useful with the new data)
Pass in padding masks
Add an argparser for the configuration
Refactor a bunch of methods
Predict the test loss on the parsed test set... not happy
process a dev pred file too
Save model (no config yet)
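The sort-by-length, batchify-without-flattening, and padding-mask steps above can be sketched together. This is a minimal illustration with assumed names (`batchify`, the `"<pad>"` token, masks that are True at padding positions), not the parser's actual implementation:

```python
def batchify(sentences, batch_size, pad="<pad>"):
    """Sort sentences by length, group them into batches, and build
    padding masks (True where a position is padding).

    Sorting by length keeps similarly sized sentences together, which
    minimizes the padding each batch needs; keeping each sentence as
    its own list avoids flattening the batch into one long sequence.
    """
    ordered = sorted(sentences, key=len)
    batches = []
    for i in range(0, len(ordered), batch_size):
        batch = ordered[i:i + batch_size]
        width = max(len(s) for s in batch)
        padded = [s + [pad] * (width - len(s)) for s in batch]
        masks = [[j >= len(s) for j in range(width)] for s in batch]
        batches.append((padded, masks))
    return batches
```

The masks would then be passed to the model so attention and loss computations can ignore padded positions.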
saving easier
because that was originally a jupyter notebook
https://pytorch.org/tutorials/beginner/transformer_tutorial.html
since those are likely to be useless
future changes
path if the lattn_partitioned default changes
even if that means some overlaps)
add a test of exactly that functionality
silver trees, since adding a silver treebank makes an epoch take twice as long
Mostly untested. Includes an unfinished test of the silver data
to generalize selftrain.py to use Ensemble as well
relevant (especially IT)
save on memory
train for vlsp22
run_constituency and/or refactor some of the methods in Ensemble
Error check the mixture of models
Defaults currently set to English
Unfortunately, the lengths are causing a problem when just adding scores for the purposes of a reranker
However, this seems to work for ensembling multiple models together
add a keep_scores flag to parse_sentences etc
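Combining models by summing their scores, as described above, might look like this minimal sketch. The `best_candidate` helper is hypothetical, and it assumes the per-model scores are comparable log-probabilities:

```python
def best_candidate(candidate_scores):
    """Pick the candidate whose summed score across models is highest.

    candidate_scores maps each candidate parse to a list of log-space
    scores, one per model in the ensemble; adding log probabilities
    corresponds to multiplying the underlying probabilities.
    """
    return max(candidate_scores, key=lambda c: sum(candidate_scores[c]))
```

A `keep_scores` flag on the parsing methods would make these per-candidate scores available to such a combiner.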