github.com/stanfordnlp/stanza.git
Age | Commit message | Author
2022-05-30 | Script to remove unmodified pieces of pretrain from NER model [ner_wv] | John Bauer
2022-05-30 | Notes on which embeddings are used for which NER, in the form of a map of def... | John Bauer
2022-05-30 | Ignore unknown embedding words based on a switch (not sure this is useful or ... | John Bauer
2022-05-30 | Separate delta embedding from base NER embedding | John Bauer
2022-05-30 | Update version number: removing 90% of the embedding from the NER models mean... | John Bauer
2022-05-30 | Use save_name to check if a model already exists | John Bauer
2022-05-29 | Try to generalize wikiner reading - currently the download format is a | John Bauer
2022-05-28 | Merge pull request #1031 from stanfordnlp/refactor_dataloader | John Bauer
2022-05-28 | Simplify - can use torch tensors directly rather than first creating np arrays | John Bauer
2022-05-28 | Pytorch dataloader | John Bauer
2022-05-28 | Start to refactor pieces of the tokenizer dataset into pieces so we can make ... | John Bauer
2022-05-28 | Set default JA NER to GSD | John Bauer
2022-05-27 | Merge pull request #1038 from stanfordnlp/ja_ner | John Bauer
2022-05-27 | Convert the Megagon ja_gsd | John Bauer
2022-05-27 | Generalize the conll -> iob conversion a bit | John Bauer
2022-05-26 | Add arguments for epsilon and beta2 to initializing an AdamW optimizer | John Bauer
2022-05-26 | Add the ability to read a single file from a zipfile | John Bauer
2022-05-26 | Corner case: when limiting word vectors to pretty close to the length of vect... | John Bauer
2022-05-25 | Add the ability to read .gz files to the pretrain conversion | John Bauer
2022-05-24 | Adjust tab stops | John Bauer
2022-05-23 | do not install transformers library by default; now support m1 macos | yuhui-zh15
2022-05-23 | Add a method to get the keys in a multivocab | John Bauer
2022-05-23 | basic __str__ for a Pipeline | John Bauer
2022-05-23 | germeval2014 looks like a more reliable dataset, so make that the default | John Bauer
2022-05-19 | numpy instead of torch is slightly faster in the small sentence regime, very ... | John Bauer
2022-05-17 | labels is unused in tokenizer predict | John Bauer
2022-05-17 | Remove unused import | John Bauer
2022-05-16 | Add the skip_newlines test to the file reading version of the tokenizer data ... | John Bauer
2022-05-16 | Abstract away labels() rather than having the eval code know the format of th... | John Bauer
2022-05-16 | more specific Exception type | John Bauer
2022-05-16 | Get rid of the input_data field - was only used for tests, and the tests don'... | John Bauer
2022-05-16 | For the MWT test, use the fake tokenizer files rather than putting in the fak... | John Bauer
2022-05-16 | Factor out a method to write the input to temp files in a tokenizer test | John Bauer
2022-05-16 | Add a tiny bit of doc | John Bauer
2022-05-16 | Merge pull request #1029 from stanfordnlp/refactor_tok | John Bauer
2022-05-16 | Run some basic tests on the dictionary in the ZH tokenizer | John Bauer
2022-05-16 | Rearrange - not necessary for this to be an inner function | John Bauer
2022-05-16 | whitespace | John Bauer
2022-05-16 | torch tensors instead of lists of numbers | John Bauer
2022-05-15 | Separate out the label creation - no need to make a fake string of 0s at runtime | John Bauer
2022-05-15 | Add a test which pokes the DataLoader object to make sure it is processing da... | John Bauer
2022-05-15 | Separate the next() functionality which advances an unfinished batch into a s... | John Bauer
2022-05-14 | whitespace update | John Bauer
2022-05-14 | use save_name for consistency | John Bauer
2022-05-14 | Use save_name to load if load_name is not set | John Bauer
2022-05-14 | zh -> zh-hans to match the names of other models | John Bauer
2022-05-14 | Use weighted_f1 by default to pick a best model | John Bauer
2022-05-14 | Oops, got the test file name mixed up with train | John Bauer
2022-05-14 | Add bert to sentiment training. Includes loading it in pipelines and at test... | John Bauer
2022-05-14 | Log average loss even when doing an interim analysis | John Bauer