final_regressor = models/0021.model
Num weight bits = 16
learning rate = 10
initial_t = 1
power_t = 0.5
decay_learning_rate = 1
randomly initializing neural network output weights and hidden bias
creating cache_file = train-sets/3parity.cache
Reading from train-sets/3parity
num sources = 1
average    since         example     example  current  current  current
loss       last          counter      weight    label  predict features
1.435731   1.435731            3         3.0   1.0000  -1.0000        4
1.965783   2.495835            6         6.0   1.0000   0.8038        4
2.377791   2.872201           11        11.0   1.0000  -1.0000        4
2.396167   2.414544           22        22.0   1.0000   0.7269        4
2.680468   2.964769           44        44.0  -1.0000   0.9079        4
2.787991   2.898014           87        87.0  -1.0000   1.0000        4
2.829916   2.871841          174       174.0   1.0000   0.8361        4
2.747754   2.665593          348       348.0  -1.0000  -0.0307        4
2.313654   1.879553          696       696.0   1.0000  -0.7951        4
2.025584   1.737515         1392      1392.0   1.0000  -0.6059        4
1.454971   0.884358         2784      2784.0   1.0000   0.8282        4
1.066638   0.678304         5568      5568.0   1.0000   1.0000        4
0.550489   0.034248        11135     11135.0  -1.0000  -1.0000        4

finished run
number of examples = 16000
weighted example sum = 1.6e+004
weighted label sum = 0
average loss = 0.3831
best constant = -6.25e-005
total feature number = 64000