author | Colin Cherry <colin.a.cherry@gmail.com> | 2012-05-29 21:38:57 +0400 |
---|---|---|
committer | Colin Cherry <colin.a.cherry@gmail.com> | 2012-05-29 21:38:57 +0400 |
commit | fd577d7a65cab923b9102d61873a032654d573a1 (patch) | |
tree | 24dddd8e7a412f29f2f55e8ecad0b6055f8530c0 /mert/MiraWeightVector.h | |
parent | 6d1165654caf8edc995a41a4c6c9666e65ebce96 (diff) |
Batch k-best MIRA is written and integrated into mert-moses.pl
Regression tests all check out, and kbmira seems to work fine
on a Hansard French->English task.
HypPackEnumerator class may be of interest to pro.cpp and future
optimizers, as it abstracts a lot of the boilerplate involved in
enumerating multiple k-best lists.
MiraWeightVector is not really mira-specific - just a weight vector
that enables efficient averaging. Could be useful to a perceptron
as well. Same goes for MiraFeatureVector.
Interaction with sparse features is written, but untested.
Diffstat (limited to 'mert/MiraWeightVector.h')
-rw-r--r-- | mert/MiraWeightVector.h | 106 |
1 files changed, 106 insertions, 0 deletions
diff --git a/mert/MiraWeightVector.h b/mert/MiraWeightVector.h
new file mode 100644
index 000000000..375858634
--- /dev/null
+++ b/mert/MiraWeightVector.h
@@ -0,0 +1,106 @@
+/*
+ * MiraWeightVector.h
+ * kbmira - k-best Batch MIRA
+ *
+ * A self-averaging weight-vector. Good for
+ * perceptron learning as well.
+ *
+ */
+
+#ifndef MERT_MIRA_WEIGHT_VECTOR_H
+#define MERT_MIRA_WEIGHT_VECTOR_H
+
+#include <vector>
+
+#include "MiraFeatureVector.h"
+
+using namespace std;
+
+class AvgWeightVector;
+
+class MiraWeightVector {
+public:
+  /**
+   * Constructor, initializes to the zero vector
+   */
+  MiraWeightVector();
+
+  /**
+   * Constructor with provided initial vector
+   * \param init Initial feature values
+   */
+  MiraWeightVector(const vector<ValType>& init);
+
+  /**
+   * Update a the model
+   * \param fv Feature vector to be added to the weights
+   * \param tau FV will be scaled by this value before update
+   */
+  void update(const MiraFeatureVector& fv, float tau);
+
+  /**
+   * Perform an empty update (affects averaging)
+   */
+  void tick();
+
+  /**
+   * Score a feature vector according to the model
+   * \param fv Feature vector to be scored
+   */
+  ValType score(const MiraFeatureVector& fv) const;
+
+  /**
+   * Squared norm of the weight vector
+   */
+  ValType sqrNorm() const;
+
+  /**
+   * Return an averaged view of this weight vector
+   */
+  AvgWeightVector avg();
+
+  friend class AvgWeightVector;
+
+private:
+  /**
+   * Updates a weight and lazily updates its total
+   */
+  void update(size_t index, ValType delta);
+
+  /**
+   * Make sure everyone's total is up-to-date
+   */
+  void fixTotals();
+
+  /**
+   * Helper to handle out-of-range weights
+   */
+  ValType weight(size_t index) const;
+
+  vector<ValType> m_weights;
+  vector<ValType> m_totals;
+  vector<size_t> m_lastUpdated;
+  size_t m_numUpdates;
+};
+
+/**
+ * Averaged view of a weight vector
+ */
+class AvgWeightVector {
+public:
+  AvgWeightVector(const MiraWeightVector& wv);
+  ValType score(const MiraFeatureVector& fv) const;
+  ValType weight(size_t index) const;
+  size_t size() const;
+private:
+  const MiraWeightVector& m_wv;
+};
+
+
+#endif // MERT_WEIGHT_VECTOR_H
+
+// --Emacs trickery--
+// Local Variables:
+// mode:c++
+// c-basic-offset:2
+// End:
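
The header above only declares the interface, so the averaging scheme itself is not visible in this diff. As a rough illustration of how members like `m_totals`, `m_lastUpdated` and `m_numUpdates` could support "efficient averaging", here is a minimal standalone sketch of lazy averaging. The class name `LazyAveragedWeights`, the `SparseVec` alias, the fixed vector size, and the exact bookkeeping are all assumptions made for this example; it is not the kbmira implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sparse feature vector: (index, value) pairs.
typedef std::vector<std::pair<std::size_t, double> > SparseVec;

// Illustrative only: keeps a running sum of each weight over all steps,
// but touches a weight's sum only when that weight changes.
class LazyAveragedWeights {
public:
  explicit LazyAveragedWeights(std::size_t n)
      : weights_(n, 0.0), totals_(n, 0.0), last_(n, 0), ticks_(0) {}

  // One update step: weights += tau * fv.
  void update(const SparseVec& fv, double tau) {
    ++ticks_;
    for (std::size_t k = 0; k < fv.size(); ++k) {
      bump(fv[k].first, tau * fv[k].second);
    }
  }

  // Empty step: no weight changes, but the average sees one more step.
  void tick() { ++ticks_; }

  double weight(std::size_t i) const {
    assert(i < weights_.size());
    return weights_[i];
  }

  // Mean of weight i over all steps so far (current value if no steps yet).
  double averaged(std::size_t i) const {
    assert(i < weights_.size());
    if (ticks_ == 0) return weights_[i];
    // Account for the steps since this weight was last touched.
    double total =
        totals_[i] + weights_[i] * static_cast<double>(ticks_ - last_[i]);
    return total / static_cast<double>(ticks_);
  }

private:
  void bump(std::size_t i, double delta) {
    assert(i < weights_.size());
    // Charge the old value for every step up to and including this one...
    totals_[i] += weights_[i] * static_cast<double>(ticks_ - last_[i]);
    last_[i] = ticks_;
    // ...then correct the current step so it sees the new value.
    weights_[i] += delta;
    totals_[i] += delta;
  }

  std::vector<double> weights_;     // current weights
  std::vector<double> totals_;      // sum of each weight over steps 1..last_[i]
  std::vector<std::size_t> last_;   // step through which totals_[i] is current
  std::size_t ticks_;               // number of steps taken
};
```

With this bookkeeping, an update touching a handful of features costs time proportional to those features rather than to the full weight vector, which is the usual reason for averaging lazily. An averaged "view" class in the spirit of `AvgWeightVector` would then only need to call something like `averaged(i)` on the underlying object.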