Repository: github.com/moses-smt/mosesdecoder.git
author: Colin Cherry <colin.a.cherry@gmail.com> 2012-05-29 21:38:57 +0400
committer: Colin Cherry <colin.a.cherry@gmail.com> 2012-05-29 21:38:57 +0400
commit: fd577d7a65cab923b9102d61873a032654d573a1
tree: 24dddd8e7a412f29f2f55e8ecad0b6055f8530c0 (path: mert/FeatureStats.cpp)
parent: 6d1165654caf8edc995a41a4c6c9666e65ebce96
Batch k-best MIRA is written and integrated into mert-moses.pl

All regression tests pass, and kbmira seems to work fine on a Hansard French->English task. The HypPackEnumerator class may be of interest to pro.cpp and to future optimizers, as it abstracts much of the boilerplate involved in enumerating multiple k-best lists. MiraWeightVector is not really MIRA-specific: it is just a weight vector that enables efficient averaging, so it could be useful to a perceptron as well. The same goes for MiraFeatureVector. Interaction with sparse features is written but untested.
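
The "efficient averaging" mentioned above is commonly done lazily: instead of adding every weight to a running total after every step, each weight is credited only when it is next touched. The following is a minimal sketch of that idea under stated assumptions. The class name AveragedWeights and its methods are hypothetical and are not taken from MiraWeightVector, whose actual interface is not shown in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of lazy weight averaging; not the MiraWeightVector
// class from this commit.
class AveragedWeights {
 public:
  explicit AveragedWeights(std::size_t n)
      : weights_(n, 0.0), totals_(n, 0.0), last_(n, 0), steps_(0) {}

  // Add delta to weight i. Before changing it, credit the old value for
  // every completed step it was in effect since it was last touched.
  void update(std::size_t i, double delta) {
    assert(last_.at(i) <= steps_);
    totals_.at(i) += weights_.at(i) * static_cast<double>(steps_ - last_.at(i));
    last_.at(i) = steps_;
    weights_.at(i) += delta;
  }

  // Mark the end of one training step.
  void tick() { ++steps_; }

  // Current (unaveraged) value of weight i.
  double current(std::size_t i) const { return weights_.at(i); }

  // Mean of the values weight i held at the end of each completed step.
  double average(std::size_t i) const {
    if (steps_ == 0) return weights_.at(i);
    const double pending =
        weights_.at(i) * static_cast<double>(steps_ - last_.at(i));
    return (totals_.at(i) + pending) / static_cast<double>(steps_);
  }

 private:
  std::vector<double> weights_;   // current weights
  std::vector<double> totals_;    // sum of per-step values, up to last_[i]
  std::vector<std::size_t> last_; // step count when totals_[i] was last synced
  std::size_t steps_;             // number of completed steps
};
```

With this layout an update costs time proportional to the number of features it touches rather than to the full dimensionality, which is presumably why the same structure would suit an averaged perceptron.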
Diffstat (limited to 'mert/FeatureStats.cpp')
-rw-r--r-- mert/FeatureStats.cpp 39
1 file changed, 39 insertions, 0 deletions
diff --git a/mert/FeatureStats.cpp b/mert/FeatureStats.cpp
index 5d7c5c7b4..2c6cdb88f 100644
--- a/mert/FeatureStats.cpp
+++ b/mert/FeatureStats.cpp
@@ -10,6 +10,8 @@
#include <fstream>
#include <cmath>
+#include <boost/functional/hash.hpp>
+
#include "Util.h"
using namespace std;
@@ -81,6 +83,43 @@ SparseVector operator-(const SparseVector& lhs, const SparseVector& rhs) {
return res;
}
+std::vector<std::size_t> SparseVector::feats() const {
+ std::vector<std::size_t> toRet;
+ for(fvector_t::const_iterator iter = m_fvector.begin();
+ iter!=m_fvector.end();
+ iter++) {
+ toRet.push_back(iter->first);
+ }
+ return toRet;
+}
+
+std::size_t SparseVector::encode(const std::string& name) {
+ name2id_t::const_iterator name2id_iter = m_name_to_id.find(name);
+ size_t id = 0;
+ if (name2id_iter == m_name_to_id.end()) {
+ id = m_id_to_name.size();
+ m_id_to_name.push_back(name);
+ m_name_to_id[name] = id;
+ } else {
+ id = name2id_iter->second;
+ }
+ return id;
+}
+
+std::string SparseVector::decode(std::size_t id) {
+ return m_id_to_name[id];
+}
+
+bool operator==(SparseVector const& item1, SparseVector const& item2) {
+ return item1.m_fvector==item2.m_fvector;
+}
+
+std::size_t hash_value(SparseVector const& item) {
+ boost::hash<SparseVector::fvector_t> hasher;
+ return hasher(item.m_fvector);
+}
+
+
FeatureStats::FeatureStats()
: m_available_size(kAvailableSize), m_entries(0),
m_array(new FeatureStatsType[m_available_size]) {}
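
The encode/decode pair added in this diff interns feature names as dense integer ids. The standalone sketch below illustrates the same pattern using only the standard library. FeatureNameTable is a hypothetical stand-in, not a class from Moses, and its use of at() for a bounds-checked decode is an assumption of this sketch (the decode in the diff indexes with operator[] and does not check the id).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the name/id tables that SparseVector::encode
// and SparseVector::decode operate on.
class FeatureNameTable {
 public:
  // Return the id for name, assigning the next free id on first sight.
  std::size_t encode(const std::string& name) {
    std::map<std::string, std::size_t>::const_iterator it =
        name_to_id_.find(name);
    if (it != name_to_id_.end()) return it->second;
    const std::size_t id = id_to_name_.size();
    id_to_name_.push_back(name);
    name_to_id_[name] = id;
    // Both tables must always describe the same set of names.
    assert(id_to_name_.size() == name_to_id_.size());
    return id;
  }

  // Return the name for id; at() throws std::out_of_range for unknown ids.
  const std::string& decode(std::size_t id) const {
    return id_to_name_.at(id);
  }

  std::size_t size() const { return id_to_name_.size(); }

 private:
  std::map<std::string, std::size_t> name_to_id_;
  std::vector<std::string> id_to_name_;
};
```

Encoding the same name twice returns the same id, and ids are handed out in order of first appearance, so decode(encode(name)) round-trips to name.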