Age | Commit message | Author |
2022-11-03 | Add a flag to control how many tags to use when labeling shift transitions [con_shift_tags2] | John Bauer |
2022-11-03 | Move convert_trees back to transition_sequence | John Bauer |
2022-11-03 | Add a tag label to a Shift | John Bauer |
2022-11-03 | Tag dropout - turn X% of tags into UNK on each training loop | John Bauer |
2022-11-03 | Throw out wiki docs of length 2 as well when building a silver dataset | John Bauer |
2022-11-03 | Add min_len and max_len args to tokenize_wiki.py. Skip one line wiki docs, s... | John Bauer |
2022-11-02 | Fix format error in log line | John Bauer |
2022-11-01 | slice in a more generic manner when copying model. makes it easier to make f... | John Bauer |
2022-11-01 | Set this option in the partitioned test so that it still tests this code path... | John Bauer |
2022-11-01 | lattn_partitioned == False should affect the input proj dimension as well | John Bauer |
2022-11-01 | Add an argument for partitioning / not partitioning lattn | John Bauer |
2022-11-01 | Oops, this was incorrect | John Bauer |
2022-11-01 | Log some stats after all models are created for training (move the log line) | John Bauer |
2022-11-01 | Use some words from the silver dataset (currently |gold| words are added, eve... | John Bauer |
2022-10-31 | Add a suffix argument to the renormalize script | John Bauer |
2022-10-31 | Script to renormalize Vietnamese diacritics | John Bauer |
2022-10-30 | Add a separate argument for --silver_epoch_size, just in case people want that | John Bauer |
2022-10-30 | Add notes on silver words for the delta embedding | John Bauer |
2022-10-30 | Since we just ran into a bug where checkpoints were not correctly loaded, add... | John Bauer |
2022-10-30 | update comment | John Bauer |
2022-10-30 | Track how many batches a model gets trained for. Backdoor test for the silve... | John Bauer |
2022-10-30 | Rough draft of using silver trees. | John Bauer |
2022-10-29 | Move uses_xpos() to the model itself, add it to Ensemble. Will make it easier t... | John Bauer |
2022-10-29 | Try smaller chunks for the parse_text. One giant chunk ran out of GPU | John Bauer |
2022-10-29 | Add a couple hopefully helpful log lines to the parse_text operation | John Bauer |
2022-10-29 | Connect model ensembles to the predict_text functionality | John Bauer |
2022-10-29 | oops, model was supposed to be set to eval() when run in predict_text mode | John Bauer |
2022-10-29 | refactor predict dir,file,format args so they can be used elsewhere if needed | John Bauer |
2022-10-29 | Refactor an unnecessary duplication of arguments | John Bauer |
2022-10-29 | Add functionality to turn a tokenized text file into a file of parse trees | John Bauer |
2022-10-29 | ignore em dashes in Wikipedia, as that seems to be lists | John Bauer |
2022-10-29 | Add a useful doc on how to build batches from tagged words | John Bauer |
2022-10-29 | Use reasonable defaults for EN and VI ensembles. Can add other languages as ... | John Bauer |
2022-10-29 | Oops, logger was missing in the retagging.py module | John Bauer |
2022-10-29 | A script for tokenizing a Wikipedia file and writing it out | John Bauer |
2022-10-28 | Accept a single file for wiki processing in selftrain_wiki.py | John Bauer |
2022-10-28 | fix bug in the lt/gt finding (it can start a line). use FoundationCache to s... | John Bauer |
2022-10-28 | Add a rotation to make N non-overlapping dev sets with the remainder being tr... | John Bauer |
2022-10-28 | Simplify reading & writing loop. Will make it easier to 'rotate' the dev set | John Bauer |
2022-10-28 | Add a prototype of model ensembling. Better would be to integrate it with ru... | John Bauer |
2022-10-28 | Refactor the retagging args & pipeline creation into a separate module | John Bauer |
2022-10-28 | Keep scores when parsing a block of sentences. | John Bauer |
2022-10-28 | fix a comment | John Bauer |
2022-10-27 | Order the pretrains so that resource files are made with a consistent md5sum | John Bauer |
2022-10-27 | Add an extraction for bartpho | John Bauer |
2022-10-27 | Sort files when converting VLSP22 so that output is the same across platforms | John Bauer |
2022-10-27 | Adjust attention masks for vi phobert | John Bauer |
2022-10-27 | add a line for using phobert on the vlsp tagger dataset | John Bauer |
2022-10-26 | Existing POS models with bert would be broken when looking at the args, so in... | John Bauer |
2022-10-26 | Fix retag package when using relabel_tags | John Bauer |