Num weight bits = 18
learning rate = 10
initial_t = 1
power_t = 0.5
creating cache_file = train-sets/wsj_small.dat.gz.cache
Reading from train-sets/wsj_small.dat.gz
num sources = 1
average    since      sequence     example  current label      current predicted   current  cur  cur  predic.  examples
loss       last        counter      weight  sequence prefix    sequence prefix     features pass pol     made    gener.
0.810811   0.810811          1   37.000000  [ 1 2 3 1 4 ]      [ 1 1 1 1 1 ]           1654    0    0       37         0
0.750000   0.666667          2   64.000000  [ 11 2 3 11 11 ]   [ 9 11 9 11 9 ]         1194    0    0      841        37
0.709677   0.620690          3   93.000000  [ 14 10 13 9 1 ]   [ 11 15 11 1 9 ]        1286    0    0     1465        64
0.713178   0.722222          4  129.000000  [ 3 4 6 3 1 ]      [ 11 11 2 3 11 ]        1608    0    0     2098        93
0.693750   0.612903          5  160.000000  [ 19 3 10 2 1 ]    [ 2 3 1 2 1 ]           1378    0    0     2903       129
0.658163   0.500000          6  196.000000  [ 19 2 22 4 3 ]    [ 19 2 11 11 11 ]       1608    0    0     3619       160
0.659574   0.666667          7  235.000000  [ 10 2 3 1 10 ]    [ 19 2 11 1 1 ]         1746    0    0     4428       196
0.594901   0.466102         12  353.000000  [ 5 12 11 11 21 ]  [ 11 12 29 21 21 ]      1102    0    0     7509       328
0.481534   0.367521         25  704.000000  [ 10 13 22 4 9 ]   [ 10 13 1 4 3 ]         1148    0    0    15642       678
0.399859   0.319328         57 1418.000000  [ 19 1 4 6 36 ]    [ 19 5 4 6 5 ]          2252    0    0    31220      1368

finished run
number of examples = 78
weighted example sum = 1932
weighted label sum = 0
average loss = 0.369
best constant = -0.0005179
total feature number = 85128