Age | Commit message | Author |
2022-10-29 | refactor predict dir,file,format args so they can be used elsewhere if needed | John Bauer |
2022-10-29 | Refactor an unnecessary duplication of arguments | John Bauer |
2022-10-29 | Add functionality to turn a tokenized text file into a file of parse trees | John Bauer |
2022-10-29 | ignore em dashes in Wikipedia, as that seems to be lists | John Bauer |
2022-10-29 | Add a useful doc on how to build batches from tagged words | John Bauer |
2022-10-29 | Use reasonable defaults for EN and VI ensembles. Can add other languages as ... | John Bauer |
2022-10-29 | Oops, logger was missing in the retagging.py module | John Bauer |
2022-10-29 | A script for tokenizing a Wikipedia file and writing it out | John Bauer |
2022-10-28 | Accept a single file for wiki processing in selftrain_wiki.py | John Bauer |
2022-10-28 | fix bug in the lt/gt finding (it can start a line). use FoundationCache to s... | John Bauer |
2022-10-28 | Add a rotation to make N non-overlapping dev sets with the remainder being tr... | John Bauer |
2022-10-28 | Simplify reading & writing loop. Will make it easier to 'rotate' the dev set | John Bauer |
2022-10-28 | Add a prototype of model ensembling. Better would be to integrate it with ru... | John Bauer |
2022-10-28 | Refactor the retagging args & pipeline creation into a separate module | John Bauer |
2022-10-28 | Keep scores when parsing a block of sentences. | John Bauer |
2022-10-28 | fix a comment | John Bauer |
2022-10-27 | Order the pretrains so that resource files are made with a consistent md5sum | John Bauer |
2022-10-27 | Add an extraction for bartpho | John Bauer |
2022-10-27 | Sort files when converting VLSP22 so that output is the same across platforms | John Bauer |
2022-10-27 | Adjust attention masks for vi phobert | John Bauer |
2022-10-27 | add a line for using phobert on the vlsp tagger dataset | John Bauer |
2022-10-26 | Existing POS models with bert would be broken when looking at the args, so in... | John Bauer |
2022-10-26 | Fix retag package when using relabel_tags | John Bauer |
2022-10-26 | mix N layers of transformer when adding them to the POS inputs | John Bauer |
2022-10-26 | Mark this test with travis - not sure that is still relevant | John Bauer |
2022-10-25 | Add a --predict_format option which will allow the user to specify how to wri... | John Bauer |
2022-10-25 | By default, use _ to separate spaces when converting parse trees to LM | John Bauer |
2022-10-24 | Return 2 values if the dev/test set is empty | John Bauer |
2022-10-24 | Sort VLSP filenames to avoid cross-platform weirdness | John Bauer |
2022-10-24 | batch -> training_batch or current_batch as relevant | John Bauer |
2022-10-24 | Add a hopefully useful doc | John Bauer |
2022-10-23 | Add a trivial parse_tagged_words test | John Bauer |
2022-10-23 | By default, turn off pattn & lattn (at least until we figure out how to extra... | John Bauer |
2022-10-23 | Notes on default bert model. More hidden layers for VI bert by default | John Bauer |
2022-10-22 | Add a tool which connects to the ProcessMorphologyRequest in CoreNLP (not rel... | John Bauer |
2022-10-22 | Update corenlp.proto with definitions that will connect to the Morphology ann... | John Bauer |
2022-10-22 | Process a constituency treebank into a POS dataset. Note that spacing, lemma... | John Bauer |
2022-10-22 | Print trees in the output format VLSP expects for the bakeoff | John Bauer |
2022-10-22 | Fully specify format to avoid warning, test the basic output as well | John Bauer |
2022-10-22 | Option to convert LBKT to -LRB-, RBKT to -RRB- | John Bauer |
2022-10-21 | Clean up use of full_results vs treebank | John Bauer |
2022-10-21 | fix --num_generate bug | John Bauer |
2022-10-21 | Apparently this one detach().cpu() call speeds up the program by 5% | John Bauer |
2022-10-21 | Add an option to the vlsp22 processing to load from a different path | John Bauer |
2022-10-21 | Reorganize prepare_con_dataset to use the same function header for all functi... | John Bauer |
2022-10-21 | Count a couple other stray tags as errors - they are all now fixed in the 202... | John Bauer |
2022-10-20 | A couple upgrades to the IT silver script and the silver scripts in general -... | John Bauer |
2022-10-20 | Disallow a bunch more constituents - although now the 2022 dataset is mostly ... | John Bauer |
2022-10-20 | Fix a couple trees which were weirdly labeled | John Bauer |
2022-10-20 | Don't process .zip files when doing vlsp-22 | John Bauer |