
github.com/torch/nn.git
author    Soumith Chintala <soumith@gmail.com>    2017-05-21 20:48:19 +0300
committer GitHub <noreply@github.com>             2017-05-21 20:48:19 +0300
commit    78aac1a015ebba0655a7fdad8a4a09419b68da67 (patch)
tree      e534fc3f1f1192102fd4b5b25974abe6d4d7f9f2 /doc
parent    482537275df7fde77cc4dcc1d93de33cbfafde9f (diff)
Revert "Revert "ClassNLLCriterion supports missing targets"" (branch: revert-1217-revert-1215-ClassNLLCriterion-missing-target)
Diffstat (limited to 'doc')
-rw-r--r--  doc/criterion.md | 22
1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/doc/criterion.md b/doc/criterion.md
index 0883b24..a3e1b2e 100644
--- a/doc/criterion.md
+++ b/doc/criterion.md
@@ -95,10 +95,10 @@ criterion.sizeAverage = false
## ClassNLLCriterion ##
```lua
-criterion = nn.ClassNLLCriterion([weights])
+criterion = nn.ClassNLLCriterion([weights, sizeAverage, ignoreIndex])
```
-The negative log likelihood criterion. It is useful to train a classification problem with `n` classes.
+The negative log likelihood (NLL) criterion. It is useful to train a classification problem with `n` classes.
If provided, the optional argument `weights` should be a 1D `Tensor` assigning weight to each of the classes.
This is particularly useful when you have an unbalanced training set.
@@ -113,11 +113,21 @@ The loss can be described as:
loss(x, class) = -x[class]
```
-or in the case of the `weights` argument it is specified as follows:
+or in the case of the `weights` argument, it is specified as follows:
```lua
loss(x, class) = -weights[class] * x[class]
```
-Due to the behaviour of the backend code, it is necessary to set sizeAverage to false when calculating losses *in non-batch mode*.
+
+or in the case of the `ignoreIndex` argument:
+```
+loss(x, class) = class != ignoreIndex ? -weights[class] * x[class] : 0
+```
+
+The `ignoreIndex` argument (default -100) specifies a target value to be ignored.
+The corresponding `gradInput` for that target will be zero.
+When `sizeAverage=true` (the default), the `gradInput` and `output` are averaged over non-ignored targets.
+
+Due to the behaviour of the backend code, it is necessary to set `sizeAverage` to false when calculating losses *in non-batch mode*.
The following is a code fragment showing how to make a gradient step given an input `x`, a desired output `y` (an integer `1` to `n`, in this case `n = 2` classes), a network `mlp` and a learning rate `learningRate`:
@@ -133,7 +143,7 @@ function gradUpdate(mlp, x, y, learningRate)
end
```
-By default, the losses are averaged over observations for each minibatch. However, if the field `sizeAverage` is set to `false`, the losses are instead summed for each minibatch.
+By default, the losses are averaged over observations for each minibatch. However, if the argument `sizeAverage` is set to `false`, the losses are instead summed for each minibatch.
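The `weights`, `ignoreIndex`, and `sizeAverage` behaviour documented in this hunk can be sketched as a small standalone example. This is plain Python rather than Torch's Lua, purely for illustration; `class_nll` and its argument names are hypothetical stand-ins, not part of the nn API:

```python
# Illustrative reimplementation of the documented loss (not the nn API).
# log_probs: rows of log-probabilities; targets: 1-based class indices.
def class_nll(log_probs, targets, weights=None,
              size_average=True, ignore_index=-100):
    total, total_weight = 0.0, 0.0
    for row, cls in zip(log_probs, targets):
        if cls == ignore_index:
            continue  # ignored target: contributes zero loss and zero gradInput
        w = 1.0 if weights is None else weights[cls - 1]
        total += -w * row[cls - 1]  # loss(x, class) = -weights[class] * x[class]
        total_weight += w
    if size_average and total_weight > 0:
        return total / total_weight  # average over non-ignored targets
    return total
```

With `size_average=False` the per-sample losses are simply summed over the minibatch, matching the documented `sizeAverage=false` behaviour.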
<a name="nn.CrossEntropyCriterion"></a>
@@ -758,7 +768,7 @@ Sample example
tripleModel = nn.ParallelTable()
tripleModel:add(embeddingModel)
- tripleModel:add(embeddingModel:clone('weight', 'bias',
+ tripleModel:add(embeddingModel:clone('weight', 'bias',
'gradWeight', 'gradBias'))
tripleModel:add(embeddingModel:clone('weight', 'bias',
'gradWeight', 'gradBias'))
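The sharing idiom in the ParallelTable example above can be sketched in plain Python (the `Embedding`/`clone_shared` names are hypothetical, not the nn API): each clone is a distinct module object, but its parameters alias the original storage, so one parameter update is visible through every branch.

```python
# Hypothetical sketch of clone('weight', 'bias', 'gradWeight', 'gradBias'):
# the clone is a separate object whose parameters alias the original
# storage instead of copying it.
class Embedding:
    def __init__(self, weight):
        self.weight = weight            # list acts as shared storage

    def clone_shared(self):
        return Embedding(self.weight)   # share the storage, do not copy

base = Embedding([0.5, -0.25])
branch2 = base.clone_shared()
branch3 = base.clone_shared()

base.weight[0] += 0.1                   # e.g. one gradient step on base
# every branch observes the update, since all three alias the same storage
```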