Generating 3-grams for all namespaces.
Generating 1-skips for all namespaces.
Num weight bits = 18
learning rate = 2.56e+06
initial_t = 128000
power_t = 1
decay_learning_rate = 1
final_regressor = models/0001.model
creating cache_file = train-sets/0001.dat.cache
Reading datafile = train-sets/0001.dat
num sources = 1
average    since         example     example  current  current  current
loss       last          counter      weight    label  predict features
1.000000   1.000000            1         1.0   1.0000   0.0000      290
1.000000   1.000000            2         2.0   0.0000   1.0000      608
0.500351   0.000703            4         4.0   0.0000   0.0000      794
0.399940   0.299528            8         8.0   0.0000   0.0000      860
0.415501   0.431061           16        16.0   1.0000   0.9107      128
0.453621   0.491742           32        32.0   0.0000   0.5372      176
0.451956   0.450291           64        64.0   0.0000   0.0000      350
0.428071   0.404187          128       128.0   1.0000   1.0000      620
0.311152   0.194233          256       256.0   0.0000   0.0000      410
0.187697   0.064242          512       512.0   0.0000   0.0000      278
0.093848   0.000000         1024      1024.0   1.0000   1.0000      170

finished run
number of examples per pass = 200
passes used = 8
weighted example sum = 1600
weighted label sum = 728
average loss = 0.060063
best constant = 1.0069
total feature number = 717536
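The header lines suggest a Vowpal Wabbit invocation roughly of the form `vw train-sets/0001.dat -c --ngram 3 --skips 1 -b 18 --initial_t 128000 --power_t 1 --passes 8 -f models/0001.model` (flags inferred from the log, not confirmed). The progress table itself has a simple internal consistency: lines are printed at doubling weighted example counts, and the "since last" column is the average loss over only the examples seen since the previous line. A minimal sketch of that relation, checked against the rows above (`since_last` is a hypothetical helper name, not part of VW):

```python
# Sketch: recover the "since last" column of a VW progress table from
# consecutive "average loss" entries. If avg_prev is the running average
# loss at weighted count w_prev and avg_now at w_now, the loss averaged
# over just the new examples is:
#   since_last = (avg_now * w_now - avg_prev * w_prev) / (w_now - w_prev)
# With doubling counts (w_now == 2 * w_prev) this is 2*avg_now - avg_prev.

rows = [  # (weighted example count, average loss) taken from the log above
    (2.0, 1.000000),
    (4.0, 0.500351),
    (8.0, 0.399940),
    (16.0, 0.415501),
]

def since_last(w_prev, avg_prev, w_now, avg_now):
    """Average loss over the examples between two progress lines."""
    return (avg_now * w_now - avg_prev * w_prev) / (w_now - w_prev)

for (w0, a0), (w1, a1) in zip(rows, rows[1:]):
    # Matches the log's "since last" column up to display rounding
    # (0.000703, 0.299528, 0.431061 for counts 4, 8, 16).
    print(f"{w1:6.1f}  {since_last(w0, a0, w1, a1):.6f}")
```

The same arithmetic explains the final summary: `average loss = 0.060063` is the total weighted loss divided by `weighted example sum = 1600`, accumulated over all 8 passes.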