Num weight bits = 13
learning rate = 1
initial_t = 0
power_t = 0.5
using no cache
Reading datafile = train-sets/wiki1K.dat
num sources = 1
average    since       example  example  current  current  current
loss       last        counter   weight    label  predict features
10.149301  10.149301         1      1.0  unknown   0.0000      732
10.369812  10.590324         2      2.0  unknown   0.0000       27
10.325923  10.282033         4      4.0  unknown   0.0000       53
10.401762  10.477602         8      8.0  unknown   0.0000       60
10.356291  10.310820        16     16.0  unknown   0.0000       26
10.472940  10.589588        32     32.0  unknown   0.0000      125
10.474844  10.476749        64     64.0  unknown   0.0000      313
10.425304  10.375763       128    128.0  unknown   0.0000       50
10.005548  9.585792        256    256.0  unknown   0.0000       33
9.331692   8.657836        512    512.0  unknown   0.0000       26

finished run
number of examples = 1000
weighted example sum = 1000
weighted label sum = 0
average loss = 8.87286
best constant = -nan
total feature number = 86919