github.com/marian-nmt/marian.git

Age | Commit message | Author
2021-07-09 | Merged PR 19685: Marianize LSH as operators for mmapping and use in Quicksand | Martin Junczys-Dowmunt
This PR turns the LSH index and search into a set of operators that live in the expression graph. This makes creation etc. thread-safe (one index per graph) and allows GPU versions to be implemented later. It also allows the LSH to be mmapped as a Marian parameter, since we now only need to turn the index into something that can be saved to disk using the existing tensors. This happens in marian_conv or the equivalent interface function in the Quicksand interface.
2021-03-02 | merge with internal master | Marcin Junczys-Dowmunt
2021-02-28 | Add graph operations documentation (#801) | Graeme
* Doxygen structure for expression graph operators
* Document arithmetic expression operations
* Document comparison expression operations
* Document exp/log and trig operations
* Add missing implementation for cos/tan
* Document expression manipulation operations
* Document misc math operations
* Overview of operators
* Document activation functions
* Document element-wise min/max
* Document debugging/checkpoint operators
* Document topk/argmin/argmax operations
* Document index-based operations
* Document reduction operations
* Document lambda expression operators
* Document product operations
* Document softmax, cross-entropy, unlikelihood operations
* Document dropout operations
* Document scalar product and weighted average operations
* Document layer normalization, highway and pooling operations
* Document shift expression operator
* Extra details on rules for adding specializations to .inc files
* Add SinNodeOp example for specialization documentation
* Additional details in tensor operator documentation
* Remove brief command from doxygen comments
* Prefer @ style doxygen functions to \
* Document n-ary function macros
* Enable .cu and .inc files in documentation
* Add a comment about ONNX mapping
* Remove empty lines in doxygen
* Update CHANGELOG
Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2021-01-28 | Merged PR 17337: fp16 support for training | Martin Junczys-Dowmunt
This PR refactors the training graph groups and optimizers to enable and simplify things for fp16 support. Deprecates old unused graph groups and fixes a couple of MPI issues.
2020-05-21 | Merged PR 12958: ONNX support | Frank Seide
This branch adds functionality to export ONNX models (with limitations).
2020-05-17 | Merged PR 12959: minor fixes from my old ONNX code | Frank Seide
These are minor comments/fixes I found when doing my ONNX prototype, would be good to get them out of the way
2020-05-15 | Merged PR 12874: Add topk operator and other small changes in preparation of LSH-based short-list replacement | Martin Junczys-Dowmunt
* Add tuple nodes via views and trickery
* Add `topk` operator, currently unused outside unit tests
* Add `abs` operator, currently unused outside unit tests
* Change return type of `Node::allocate()` to `void`. This used to return the number of allocated elements, but isn't really used anywhere. To avoid future confusion of elements and bytes, removed for now.
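For readers unfamiliar with a `topk` operation, the following is a minimal standalone sketch of what such an operator computes: the k largest values of a vector together with their original positions. It is not Marian's implementation or API; the name `topkSketch` and the use of plain `std::vector` in place of tensors are assumptions made for illustration only. The pair of results (values and indices) loosely mirrors why the same commit introduces tuple nodes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Marian: returns the k largest values of x
// and their original indices, ordered from largest to smallest.
std::pair<std::vector<float>, std::vector<std::size_t>>
topkSketch(const std::vector<float>& x, std::size_t k) {
  assert(k <= x.size());
  std::vector<std::size_t> idx(x.size());
  std::iota(idx.begin(), idx.end(), std::size_t{0});
  // Only the first k positions need to end up sorted.
  std::partial_sort(idx.begin(), idx.begin() + static_cast<std::ptrdiff_t>(k), idx.end(),
                    [&x](std::size_t a, std::size_t b) { return x[a] > x[b]; });
  idx.resize(k);
  std::vector<float> vals;
  vals.reserve(k);
  for (std::size_t i : idx) vals.push_back(x[i]);
  return {vals, idx};
}
```

Calling `topkSketch({0.1f, 0.7f, 0.3f, 0.9f}, 2)` yields the values 0.9 and 0.7 with indices 3 and 1.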
2020-01-11 | Merged PR 11103: Clear cache for RNN object between batches | Martin Junczys-Dowmunt
* Clears cache for RNN object in transformer, otherwise stale tensor might be kept around.
* Add missing `hash()` and `equal` functions everywhere.
* Fixes bug from deployment test.
2019-09-12 | towards fp16 inference | Marcin Junczys-Dowmunt
2019-09-10 | add initializers and more pieces of the type system | Marcin Junczys-Dowmunt
2019-09-07 | CPU-side compilation with new pointers and automatic vectorization | Marcin Junczys-Dowmunt
2019-06-21 | workaround for a segfault | Frank Seide
2019-04-30 | weird mode change back | Frank Seide
2019-04-30 | weird mode change | Frank Seide
2019-04-27 | weirdo change of access permissions | Frank Seide
2019-02-13 | merged with latest updates of fseide/commentbeamsearch | Frank Seide
2019-02-13 | bug fix: reshape() must verify that #elements does not change; bug fix: beam search must reshape first step correctly | Frank Seide
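The invariant named in this fix, that a reshape may rearrange dimensions but must keep the total element count, can be sketched as follows. This is a standalone illustration, not Marian's code: `ShapeSketch`, `elementCount` and `checkedReshape` are hypothetical names, and a shape is modelled as a plain vector of dimension sizes.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a tensor shape: one size per dimension.
using ShapeSketch = std::vector<std::size_t>;

// Total number of elements is the product of all dimension sizes.
std::size_t elementCount(const ShapeSketch& shape) {
  return std::accumulate(shape.begin(), shape.end(), std::size_t{1},
                         std::multiplies<std::size_t>());
}

// Returns the new shape, or throws if the reshape would change the element count.
ShapeSketch checkedReshape(const ShapeSketch& from, const ShapeSketch& to) {
  if (elementCount(from) != elementCount(to))
    throw std::invalid_argument("reshape() must not change the number of elements");
  return to;
}
```

Under these assumptions, reshaping {2, 3, 4} to {6, 4} is accepted (24 elements either way), while reshaping it to {5, 5} throws.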
2019-02-07 | merged from fseide/commentbeamsearch | Frank Seide
2019-01-27 | add gelu activation | Marcin Junczys-Dowmunt
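The commit message does not say which formulation of GELU was added, so the sketch below simply shows two standard ones: the exact form x * Phi(x), with Phi the standard normal CDF, and the common sigmoid-based approximation. The function names are hypothetical and the code works on scalars rather than on Marian tensors.

```cpp
#include <cmath>

// Exact GELU: x * Phi(x), where Phi(x) = 0.5 * (1 + erf(x / sqrt(2))).
inline float geluExact(float x) {
  return 0.5f * x * (1.0f + std::erf(x / std::sqrt(2.0f)));
}

// Widely used approximation: x * sigmoid(1.702 * x).
inline float geluSigmoidApprox(float x) {
  return x / (1.0f + std::exp(-1.702f * x));
}
```

The two agree closely over typical activation ranges; the approximation avoids the `erf` call at the cost of a small error.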
2019-01-25 | Merged PR 6177: new operator log(sum(exp(x))), and a few more | Frank Seide
This PR adds the `logsumexp()` reduction, that is, y = log(sum_j exp(x_j)). With this, `logsoftmax(z, ax)` can now be written as `z - logsumexp(z, ax)`. I need this for factored projections.
The PR merges the near-duplicates `sum()` and `mean()` into a single `ReduceNodeOpCode`, which, for good measure, I extended to also implement additional reductions. Since now we need additional reduction operations besides the sum, this PR changes the current `functional::Add()` operation into a `functional::Aggregate()` operation that takes a second `Functor` for the reduction operation. This made it straightforward to implement a whole range of reduction operations (the names are the same as Numpy):
* `sum()`
* `mean()`
* `std()`
* `var()`
* `min()`
* `max()`
* `logsumexp()`
I just noticed that I forgot the gradient for `prod()`. Operator tests have been added and pass. NOTE: There are no gradient tests. Please review the gradients carefully. I will test `logsumexp()` by replacing `logsoftmax` by the above formula in training.
Related work items: #98143
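The two ideas in this PR description, a reduction parameterized by a second functor and the identity logsoftmax(z) = z - logsumexp(z), can be illustrated with a small standalone sketch. It is not Marian's `functional::Aggregate()`: `aggregateSketch`, `logSumExp` and `logSoftmax` are hypothetical names, they reduce over a whole `std::vector` rather than along a tensor axis, and subtracting the maximum for numerical stability is a standard technique assumed here, not something the commit message states.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical stand-in for the idea behind an aggregate operation: apply an
// element-wise functor, then fold the results with a second (reduction) functor.
template <class Map, class Reduce>
float aggregateSketch(const std::vector<float>& x, Map map, Reduce reduce, float init) {
  float acc = init;
  for (float v : x) acc = reduce(acc, map(v));
  return acc;
}

// log(sum_j exp(x_j)), computed stably by subtracting the maximum first.
float logSumExp(const std::vector<float>& x) {
  assert(!x.empty());
  const float m = aggregateSketch(
      x, [](float v) { return v; },
      [](float a, float b) { return std::max(a, b); },
      -std::numeric_limits<float>::infinity());
  const float sum = aggregateSketch(
      x, [m](float v) { return std::exp(v - m); },
      [](float a, float b) { return a + b; }, 0.0f);
  return m + std::log(sum);
}

// logsoftmax(z) written as z - logsumexp(z).
std::vector<float> logSoftmax(const std::vector<float>& z) {
  const float lse = logSumExp(z);
  std::vector<float> out;
  out.reserve(z.size());
  for (float v : z) out.push_back(v - lse);
  return out;
}
```

In this sketch the max and the sum differ only in the functors passed to the same loop, which is the kind of reuse the PR describes for its reduction operators. Exponentiating the entries of `logSoftmax(z)` gives values that sum to 1.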
2019-01-23 | Merge branch 'fseide/indexops' into fseide/factoredembeddings | Frank Seide
2019-01-23 | changed index operations' parameter lists to match PyTorch parameter order (axis before arg) | Frank Seide
2019-01-21 | Merge branch 'fseide/indexops' of https://machinetranslation.visualstudio.com/DefaultCollection/Marian/_git/marian-dev into fseide/factoredembeddings | Frank Seide
2019-01-20 | bug fix: SliceViewNodeOp should forward value_type() correctly | Frank Seide
2019-01-20 | now routing rows() and cols() via index_select(), which then redistributes them to RowsNodeOp or ColsNodeOp; tests updated accordingly; bug fix: missed an axis normalization; bug fix: ReshapeNodeOp should pass on the value_type so as to allow reshaping IndexType tensors | Frank Seide
2019-01-19 | switched to memory-saving implementation of smoothing | Frank Seide
2019-01-19 | (fixed an indentation) | Frank Seide
2019-01-19 | bugbug: ReduceNodeOpCode::sumSqr should be meanSqr | Frank Seide
2019-01-19 | added gradients for std() and var() | Frank Seide
2019-01-19 | added gradients for min(), max(), and logsumexp() | Frank Seide
2019-01-19 | added tests for all reduction operators | Frank Seide
2019-01-19 | (minor bug fix) | Frank Seide
2019-01-19 | resolved a template ambiguity, still not compiling on gcc for now | Frank Seide
2019-01-19 | (towards GPU aggregator) | Frank Seide
2019-01-18 | first shot at extending Reduce() with a reduction functor (CPU only so far) | Frank Seide
2019-01-18 | towards unifying reduction operators | Frank Seide
2018-12-27 | generalized step() to narrow() and sliceView(), new class Slice; | Frank Seide
bug fix: SliceViewNodeOp should use correct size for memory piece; new operation stopGradient()
2018-12-13 | minibatch-size warmup (manually merged over from fseide/covbias); | Frank Seide
minibatches are now fed in GPU-sized chunks rather than a massive joint batch for all GPUs in the update; Adam hyper-parameter adjustment limited to learning rate, as momentum adjustment is counterproductive for MB scaling; log output now includes the last batch size; log output now shows current best for stalled validation metrics; bug fix: Adam optimizer should persist denominators; bug fix: Adam and Adagrad should use correct element size when persisting; min and max renamed to minimum and maximum, for consistency with other toolkits; pathie now compiles in manual VS Project
2018-11-29 | fix all warnings | Marcin Junczys-Dowmunt
2018-10-01 | get rid of masked softmax, just use logMask as in transformer.h | Marcin Junczys-Dowmunt
2018-09-30 | use more integer tensors | Marcin Junczys-Dowmunt
2018-09-30 | towards using integer tensors | Marcin Junczys-Dowmunt
2018-09-30 | Make Word uint32_t and introduce IndexType | Marcin Junczys-Dowmunt
2018-09-27 | Rename ax_ to axis_ | Roman Grundkiewicz
2018-09-16 | remove more keywords | Marcin Junczys-Dowmunt
2018-09-16 | get rid of keywords | Marcin Junczys-Dowmunt
2018-09-14 | get rid of boost::hash_combine | Marcin Junczys-Dowmunt
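The log does not show what replaced `boost::hash_combine`. One common way to drop the Boost dependency is a small local helper built on `std::hash` that uses the classic mixing formula, sketched below. The name `hashCombine` is hypothetical and this is an assumption about the general approach, not the code from this commit.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical Boost-free helper: folds the hash of `value` into `seed`
// using the classic hash_combine mixing step.
template <class T>
inline void hashCombine(std::size_t& seed, const T& value) {
  seed ^= std::hash<T>{}(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}
```

A node's hash can then be built by starting from a seed and calling `hashCombine` once per field; the result depends on the order in which the fields are combined.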
2018-08-28 | fix transpose operator | Marcin Junczys-Dowmunt
2018-08-26 | working guided alignment and alignment computation during translation | marcinj
2018-08-23 | added missing override specifiers | Frank Seide