final_regressor = models/0021.model
Num weight bits = 16
learning rate = 10
initial_t = 1
power_t = 0.5
decay_learning_rate = 1
randomly initializing neural network output weights and hidden bias
creating cache_file = train-sets/3parity.cache
Reading from train-sets/3parity
num sources = 1
average    since         example     example  current  current  current
loss       last          counter      weight    label  predict features
1.435731   1.435731            3         3.0   1.0000  -1.0000        4
1.965783   2.495835            6         6.0   1.0000   0.8038        4
2.377791   2.872201           11        11.0   1.0000  -1.0000        4
2.396167   2.414544           22        22.0   1.0000   0.7269        4
2.680468   2.964769           44        44.0  -1.0000   0.9079        4
2.787991   2.898014           87        87.0  -1.0000   1.0000        4
2.829916   2.871841          174       174.0   1.0000   0.8361        4
2.747754   2.665593          348       348.0  -1.0000  -0.0307        4
2.313654   1.879553          696       696.0   1.0000  -0.7951        4
2.025584   1.737515         1392      1392.0   1.0000  -0.6059        4
1.454971   0.884358         2784      2784.0   1.0000   0.8282        4
1.066638   0.678304         5568      5568.0   1.0000   1.0000        4
0.550489   0.034248        11135     11135.0  -1.0000  -1.0000        4

finished run
number of examples = 16000
weighted example sum = 1.6e+004
weighted label sum = 0
average loss = 0.3831
best constant = -6.25e-005
total feature number = 64000