- check that mert-moses.pl emits devset score after every iteration
  - correctly for whichever metric we are optimizing
  - even when using --pairwise-ranked (PRO)
  - this may make use of 'evaluator', soon to be added by Matous Machacek

- check that --pairwise-ranked is compatible with all optimization metrics

- Use better random generators in util/random.cc, e.g. boost::mt19937.
  - Support plugging in custom random generators.

  Pros:
  - In MERT, you might want to use random restarts to avoid local optima.
  - PRO samples candidate translation pairs from N-best lists, so the
    choice of random generator seems to be important.

  Cons:
  - This change will require us to re-create the truth results for the
    regression tests related to MERT and PRO, because the new random
    generator will produce different numbers than the current one.
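
A minimal sketch of the pluggable random generator item above, under stated assumptions. The names RandomSource, Mt19937Source and SamplePairs are hypothetical and are not the existing util/random.cc interface. std::mt19937 is used in place of boost::mt19937 only to keep the sketch free of a Boost dependency; both are Mersenne Twister engines. SamplePairs merely illustrates PRO-style sampling of candidate pairs from an N-best list and is not the actual PRO sampler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

// Hypothetical interface: callers depend on this, not on a concrete engine,
// so a different generator can be plugged in without touching MERT/PRO code.
class RandomSource {
 public:
  virtual ~RandomSource() {}
  // Returns a uniformly distributed index in [0, n). Requires n > 0.
  virtual std::size_t NextIndex(std::size_t n) = 0;
};

// Mersenne Twister backend (assumption: std::mt19937 stands in for
// boost::mt19937 here).
class Mt19937Source : public RandomSource {
 public:
  explicit Mt19937Source(std::uint32_t seed) : engine_(seed) {}

  std::size_t NextIndex(std::size_t n) override {
    assert(n > 0);
    std::uniform_int_distribution<std::size_t> dist(0, n - 1);
    return dist(engine_);
  }

 private:
  std::mt19937 engine_;
};

// Illustrative PRO-style sampling: draws `count` pairs of distinct candidate
// indices from an N-best list with `nbest_size` entries (needs >= 2 entries).
std::vector<std::pair<std::size_t, std::size_t> > SamplePairs(
    RandomSource& rng, std::size_t nbest_size, std::size_t count) {
  assert(nbest_size >= 2);
  std::vector<std::pair<std::size_t, std::size_t> > pairs;
  pairs.reserve(count);
  while (pairs.size() < count) {
    const std::size_t a = rng.NextIndex(nbest_size);
    const std::size_t b = rng.NextIndex(nbest_size);
    if (a != b) {  // reject pairs that compare a candidate with itself
      pairs.push_back(std::make_pair(a, b));
    }
  }
  return pairs;
}
```

Constructing two Mt19937Source objects with the same seed and passing each to SamplePairs yields the same pairs within one build, which is the property seeded regression tests rely on. Note that the standard does not fix the output of std::uniform_int_distribution across library implementations, so truth results produced this way may also differ between standard libraries.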