Commit messages
them take up less GPU memory, even if the cleanup isn't reliable
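A minimal sketch of the kind of GPU cleanup this appears to describe, assuming PyTorch; the helper name and the move-to-CPU-then-empty-cache approach are assumptions, not the actual change, and as the message says the cleanup is not guaranteed to free everything:

    import gc
    import torch

    def move_off_gpu(model):
        # Move parameters to CPU so the model takes up less GPU memory,
        # then release whatever the CUDA caching allocator will give back.
        # Neither step reliably frees all of the memory.
        model = model.cpu()
        gc.collect()
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        return model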
likely to run out of GPU memory. Not sure this is the correct approach
pair where the model is not in the dictionary
retrained yet
when the constituency_parser has --predict_file turned on. Allows for easy checking of what happens when multiple models are mixed together.
scored trees or trees with no score (another option would be to attach the score directly to a tree)
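A sketch of how a caller might accept both scored trees and plain trees, as the message discusses; the ScoredTree container and the normalize_trees name are illustrations, not the structure actually used in the code:

    from collections import namedtuple

    # Hypothetical container pairing a tree with its score
    ScoredTree = namedtuple("ScoredTree", ["tree", "score"])

    def normalize_trees(items):
        # Accept either bare trees or (tree, score) pairs and return
        # ScoredTree objects, with score=None when no score was provided.
        normalized = []
        for item in items:
            if isinstance(item, ScoredTree):
                normalized.append(item)
            elif isinstance(item, tuple) and len(item) == 2:
                normalized.append(ScoredTree(*item))
            else:
                normalized.append(ScoredTree(item, None))
        return normalized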
variable comes from
Build the constituency optimizer using knowledge of how far you are in the training process - multistage part 1 gets Adadelta, for example
Test that a multistage training process builds the correct optimizers, including when reloading
When continuing training from a checkpoint, use the existing epochs_trained
Restart epochs count when doing a finetune
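A rough sketch of the idea, not the actual Stanza code: the stage boundary, the non-first-stage optimizer, and the checkpoint keys are assumptions. It shows the optimizer being chosen from how far training has progressed, and epochs_trained being reused from a checkpoint or reset for a finetune:

    import torch

    def build_optimizer(model, epochs_trained, stage_one_epochs=20):
        # Multistage part 1 gets Adadelta, as in the commit message;
        # the later-stage choice of AdamW is only an assumption here.
        if epochs_trained < stage_one_epochs:
            return torch.optim.Adadelta(model.parameters())
        return torch.optim.AdamW(model.parameters(), lr=2e-4)

    def resume_training(model, checkpoint_path, finetune=False):
        checkpoint = torch.load(checkpoint_path)
        model.load_state_dict(checkpoint["model"])
        # Continuing from a checkpoint keeps the saved epochs_trained;
        # a finetune restarts the epoch count at zero.
        epochs_trained = 0 if finetune else checkpoint.get("epochs_trained", 0)
        optimizer = build_optimizer(model, epochs_trained)
        return model, optimizer, epochs_trained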
same pretrains multiple times in the constituency
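A hypothetical cache showing one way to avoid reading the same pretrain more than once; the names and the caching approach are illustrative, not the actual implementation:

    _PRETRAIN_CACHE = {}

    def load_pretrain_cached(filename, load_fn):
        # Read a pretrained embedding file from disk only the first time
        # it is requested; later requests share the same object.
        if filename not in _PRETRAIN_CACHE:
            _PRETRAIN_CACHE[filename] = load_fn(filename)
        return _PRETRAIN_CACHE[filename]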
Can also be applied to other similar datasets
Read sentences & use the tokenization module to align the tokens with the original text
Randomly split the sentences
Write out the sentences and prepare their labels
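A rough sketch of the random-split step only, assuming an 80/10/10 split and a fixed seed; reading the sentences, aligning them via the tokenization module, and preparing the labels are not shown:

    import random

    def split_sentences(sentences, train_frac=0.8, dev_frac=0.1, seed=1234):
        # Shuffle a copy so the original order is left alone, then cut
        # the shuffled list into train / dev / test portions.
        shuffled = list(sentences)
        random.Random(seed).shuffle(shuffled)
        n_train = int(len(shuffled) * train_frac)
        n_dev = int(len(shuffled) * dev_frac)
        return (shuffled[:n_train],
                shuffled[n_train:n_train + n_dev],
                shuffled[n_train + n_dev:])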
missing) to a list of conllu lines. Needed for processing conllu files with eval.py if a dataset doesn't have deps
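A sketch of filling in a missing DEPS column so eval.py can read the file; copying HEAD:DEPREL into DEPS and the function name are assumptions about what the change does:

    def add_fake_deps(conllu_lines):
        # CoNLL-U columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
        fixed = []
        for line in conllu_lines:
            pieces = line.split("\t")
            if len(pieces) == 10 and pieces[8] == "_" and pieces[6] != "_":
                # Build DEPS from the basic dependency: HEAD:DEPREL
                pieces[8] = "%s:%s" % (pieces[6], pieces[7])
            fixed.append("\t".join(pieces))
        return fixed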
Move the read/write conllu functions to a common folder so they can be used elsewhere
Move the MWT_RE etc as well
Move prepare_treebank_labels to common (and rename it)
Move convert_conllu_to_txt as well
Refactor a tokenizer_conllu_name function
some sentences fixed in UD, some updates to the constituency treebank
descriptive. Apparently does not impact tokenization time
require an additional module dependency