github.com/torch/nn.git
author     nicholas-leonard <nick@nikopia.org>   2015-08-11 04:42:01 +0300
committer  nicholas-leonard <nick@nikopia.org>   2015-08-11 04:42:01 +0300
commit     9650d23e77032ebbd65fc60e50571498eb7263d6 (patch)
tree       94e82a28bf102b1301b27fe623b8afef8b64d947
parent     d72e7a6949f55694a62fb490726ef9f5758ea059 (diff)

doc readthedocs
-rw-r--r--   README.md            2
-rw-r--r--   doc/containers.md   25
-rwxr-xr-x   doc/convolution.md  51
-rwxr-xr-x   doc/criterion.md    42
-rw-r--r--   doc/index.md        23
-rwxr-xr-x   doc/module.md       48
-rw-r--r--   doc/overview.md     14
-rwxr-xr-x   doc/simple.md       61
-rwxr-xr-x   doc/table.md        35
-rw-r--r--   doc/training.md     14
-rwxr-xr-x   doc/transfer.md     32
-rw-r--r--   mkdocs.yml          18
12 files changed, 207 insertions, 158 deletions
diff --git a/README.md b/README.md
index 907be66..378a440 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
[![Build Status](https://travis-ci.org/torch/nn.svg?branch=master)](https://travis-ci.org/torch/nn)
-<a name="nn.dok"/>
+<a name="nn.dok"></a>
# Neural Network Package #
This package provides an easy and modular way to build and train simple or complex neural networks using [Torch](https://github.com/torch/torch7/blob/master/README.md):
diff --git a/doc/containers.md b/doc/containers.md
index d691f41..8d02ab9 100644
--- a/doc/containers.md
+++ b/doc/containers.md
@@ -1,6 +1,7 @@
-<a name="nn.Containers"/>
+<a name="nn.Containers"></a>
# Containers #
Complex neural networks are easily built using container classes:
+
* [Container](#nn.Container) : abstract class inherited by containers ;
* [Sequential](#nn.Sequential) : plugs layers in a feed-forward fully connected manner ;
* [Parallel](#nn.Parallel) : applies its `ith` child module to the `ith` slice of the input Tensor ;
@@ -9,7 +10,7 @@ Complex neural networks are easily built using container classes:
See also the [Table Containers](#nn.TableContainers) for manipulating tables of [Tensors](https://github.com/torch/torch7/blob/master/doc/tensor.md).
-<a name="nn.Container"/>
+<a name="nn.Container"></a>
## Container ##
This is an abstract [Module](module.md#nn.Module) class which declares methods defined in all containers.
@@ -17,19 +18,19 @@ It reimplements many of the Module methods such that calls are propagated to the
contained modules. For example, a call to [zeroGradParameters](module.md#nn.Module.zeroGradParameters)
will be propagated to all contained modules.
-<a name="nn.Container.add"/>
+<a name="nn.Container.add"></a>
### add(module) ###
Adds the given `module` to the container. The order is important
-<a name="nn.Container.get"/>
+<a name="nn.Container.get"></a>
### get(index) ###
Returns the contained modules at index `index`.
-<a name="nn.Container.size"/>
+<a name="nn.Container.size"></a>
### size() ###
Returns the number of contained modules.
-<a name="nn.Sequential"/>
+<a name="nn.Sequential"></a>
## Sequential ##
Sequential provides a means to plug layers together
@@ -51,7 +52,7 @@ which gives the output:
[torch.Tensor of dimension 1]
```
-<a name="nn.Sequential.remove"/>
+<a name="nn.Sequential.remove"></a>
### remove([index]) ###
Remove the module at the given `index`. If `index` is not specified, remove the last layer.
@@ -71,7 +72,7 @@ nn.Sequential {
```
-<a name="nn.Sequential.insert"/>
+<a name="nn.Sequential.insert"></a>
### insert(module, [index]) ###
Inserts the given `module` at the given `index`. If `index` is not specified, the incremented length of the sequence is used and so this is equivalent to use `add(module)`.
@@ -92,7 +93,7 @@ nn.Sequential {
-<a name="nn.Parallel"/>
+<a name="nn.Parallel"></a>
## Parallel ##
`module` = `Parallel(inputDimension,outputDimension)`
@@ -149,7 +150,7 @@ end
```
-<a name="nn.Concat"/>
+<a name="nn.Concat"></a>
## Concat ##
```lua
@@ -179,7 +180,7 @@ which gives the output:
[torch.Tensor of dimension 10]
```
-<a name="nn.DepthConcat"/>
+<a name="nn.DepthConcat"></a>
## DepthConcat ##
```lua
@@ -273,7 +274,7 @@ module output tensors non-`dim` sizes aren't all odd or even.
Such that in order to keep the mappings aligned, one need
only ensure that these be all odd (or even).
-<a name="nn.TableContainers"/>
+<a name="nn.TableContainers"></a>
## Table Containers ##
While the above containers are used for manipulating input [Tensors](https://github.com/torch/torch7/blob/master/doc/tensor.md), table containers are used for manipulating tables :
* [ConcatTable](table.md#nn.ConcatTable)
diff --git a/doc/convolution.md b/doc/convolution.md
index 8d9e77b..4f716c6 100755
--- a/doc/convolution.md
+++ b/doc/convolution.md
@@ -1,7 +1,8 @@
-<a name="nn.convlayers.dok"/>
+<a name="nn.convlayers.dok"></a>
# Convolutional layers #
A convolution is an integral that expresses the amount of overlap of one function `g` as it is shifted over another function `f`. It therefore "blends" one function with another. The neural network package supports convolution, pooling, subsampling and other relevant facilities. These are divided base on the dimensionality of the input and output [Tensors](https://github.com/torch/torch7/blob/master/doc/tensor.md#tensor):
+
* [Temporal Modules](#nn.TemporalModules) apply to sequences with a one-dimensional relationship
(e.g. sequences of words, phonemes and letters. Strings of some kind).
* [TemporalConvolution](#nn.TemporalConvolution) : a 1D convolution over an input sequence ;
@@ -25,7 +26,7 @@ a kernel for computing the weighted average in a neighborhood ;
* [VolumetricMaxPooling](#nn.VolumetricMaxPooling) : a 3D max-pooling operation over an input video.
* [VolumetricAveragePooling](#nn.VolumetricAveragePooling) : a 3D average-pooling operation over an input video.
-<a name="nn.TemporalModules"/>
+<a name="nn.TemporalModules"></a>
## Temporal Modules ##
Excluding an optional first batch dimension, temporal layers expect a 2D Tensor as input. The
first dimension is the number of frames in the sequence (e.g. `nInputFrame`), the last dimenstion
@@ -35,7 +36,7 @@ of dimensions, although the size of each dimension may change. These are commonl
Note: The [LookupTable](#nn.LookupTable) is special in that while it does output a temporal Tensor of size `nOutputFrame x outputFrameSize`,
its input is a 1D Tensor of indices of size `nIndices`. Again, this is excluding the option first batch dimension.
-<a name="nn.TemporalConvolution"/>
+<a name="nn.TemporalConvolution"></a>
## TemporalConvolution ##
```lua
@@ -121,7 +122,7 @@ which gives:
-0.63871422284166
```
-<a name="nn.TemporalMaxPooling"/>
+<a name="nn.TemporalMaxPooling"></a>
## TemporalMaxPooling ##
```lua
@@ -139,7 +140,7 @@ If the input sequence is a 2D tensor of dimension `nInputFrame x inputFrameSize`
nOutputFrame = (nInputFrame - kW) / dW + 1
```
-<a name="nn.TemporalSubSampling"/>
+<a name="nn.TemporalSubSampling"></a>
## TemporalSubSampling ##
```lua
@@ -175,7 +176,7 @@ The output value of the layer can be precisely described as:
output[i][t] = bias[i] + weight[i] * sum_{k=1}^kW input[i][dW*(t-1)+k)]
```
-<a name="nn.LookupTable"/>
+<a name="nn.LookupTable"></a>
## LookupTable ##
```lua
@@ -253,13 +254,13 @@ Outputs something like:
[torch.DoubleTensor of dimension 2x4x3]
```
-<a name="nn.SpatialModules"/>
+<a name="nn.SpatialModules"></a>
## Spatial Modules ##
Excluding and optional batch dimension, spatial layers expect a 3D Tensor as input. The
first dimension is the number of features (e.g. `frameSize`), the last two dimenstions
are spatial (e.g. `height x width`). These are commonly used for processing images.
-<a name="nn.SpatialConvolution"/>
+<a name="nn.SpatialConvolution"></a>
### SpatialConvolution ###
```lua
@@ -303,7 +304,7 @@ output[i][j][k] = bias[k]
```
-<a name="nn.SpatialConvolutionMap"/>
+<a name="nn.SpatialConvolutionMap"></a>
### SpatialConvolutionMap ###
```lua
@@ -317,7 +318,7 @@ connection table between input and output features. The
using a [full connection table](#nn.tables.full). One can specify
different types of connection tables.
-<a name="nn.tables.full"/>
+<a name="nn.tables.full"></a>
#### Full Connection Table ####
```lua
@@ -327,7 +328,7 @@ table = nn.tables.full(nin,nout)
This is a precomputed table that specifies connections between every
input and output node.
-<a name="nn.tables.onetoone"/>
+<a name="nn.tables.onetoone"></a>
#### One to One Connection Table ####
```lua
@@ -337,7 +338,7 @@ table = nn.tables.oneToOne(n)
This is a precomputed table that specifies a single connection to each
output node from corresponding input node.
-<a name="nn.tables.random"/>
+<a name="nn.tables.random"></a>
#### Random Connection Table ####
```lua
@@ -348,7 +349,7 @@ This table is randomly populated such that each output unit has
`nto` incoming connections. The algorihtm tries to assign uniform
number of outgoing connections to each input node if possible.
-<a name="nn.SpatialLPPooling"/>
+<a name="nn.SpatialLPPooling"></a>
### SpatialLPPooling ###
```lua
@@ -357,7 +358,7 @@ module = nn.SpatialLPPooling(nInputPlane, pnorm, kW, kH, [dW], [dH])
Computes the `p` norm in a convolutional manner on a set of 2D input planes.
-<a name="nn.SpatialMaxPooling"/>
+<a name="nn.SpatialMaxPooling"></a>
### SpatialMaxPooling ###
```lua
@@ -379,7 +380,7 @@ oheight = op((height + 2*padH - kH) / dH + 1)
`op` is a rounding operator. By default, it is `floor`. It can be changed
by calling `:ceil()` or `:floor()` methods.
-<a name="nn.SpatialAveragePooling"/>
+<a name="nn.SpatialAveragePooling"></a>
### SpatialAveragePooling ###
```lua
@@ -390,7 +391,7 @@ Applies 2D average-pooling operation in `kWxkH` regions by step size
`dWxdH` steps. The number of output features is equal to the number of
input planes.
-<a name="nn.SpatialAdaptiveMaxPooling"/>
+<a name="nn.SpatialAdaptiveMaxPooling"></a>
### SpatialAdaptiveMaxPooling ###
```lua
@@ -413,7 +414,7 @@ y_i_start = floor((i /oheight) * iheight)
y_i_end = ceil(((i+1)/oheight) * iheight)
```
-<a name="nn.SpatialSubSampling"/>
+<a name="nn.SpatialSubSampling"></a>
### SpatialSubSampling ###
```lua
@@ -454,7 +455,7 @@ output[i][j][k] = bias[k]
+ weight[k] sum_{s=1}^kW sum_{t=1}^kH input[dW*(i-1)+s)][dH*(j-1)+t][k]
```
-<a name="nn.SpatialUpSamplingNearest"/>
+<a name="nn.SpatialUpSamplingNearest"></a>
### SpatialUpSamplingNearest ###
```lua
@@ -475,7 +476,7 @@ output(u,v) = input(floor((u-1)/scale)+1, floor((v-1)/scale)+1)
Where `u` and `v` are index from 1 (as per lua convention). There are no learnable parameters.
-<a name="nn.SpatialZeroPadding"/>
+<a name="nn.SpatialZeroPadding"></a>
### SpatialZeroPadding ###
```lua
@@ -485,7 +486,7 @@ module = nn.SpatialZeroPadding(padLeft, padRight, padTop, padBottom)
Each feature map of a given input is padded with specified number of
zeros. If padding values are negative, then input is cropped.
-<a name="nn.SpatialSubtractiveNormalization"/>
+<a name="nn.SpatialSubtractiveNormalization"></a>
### SpatialSubtractiveNormalization ###
```lua
@@ -522,7 +523,7 @@ w2=image.display(processed)
```
![](image/lena.jpg)![](image/lenap.jpg)
-<a name="nn.SpatialBatchNormalization"/>
+<a name="nn.SpatialBatchNormalization"></a>
## SpatialBatchNormalization ##
`module` = `nn.SpatialBatchNormalization(N [,eps] [, momentum] [,affine])`
@@ -565,13 +566,13 @@ A = torch.randn(b, m, h, w)
C = model.forward(A) -- C will be of size `b x m x h x w`
```
-<a name="nn.VolumetricModules"/>
+<a name="nn.VolumetricModules"></a>
## Volumetric Modules ##
Excluding and optional batch dimension, volumetric layers expect a 4D Tensor as input. The
first dimension is the number of features (e.g. `frameSize`), the second is sequential (e.g. `time`) and the
last two dimenstions are spatial (e.g. `height x width`). These are commonly used for processing videos (sequences of images).
-<a name="nn.VolumetricConvolution"/>
+<a name="nn.VolumetricConvolution"></a>
### VolumetricConvolution ###
```lua
@@ -608,7 +609,7 @@ size `nOutputPlane x nInputPlane x kT x kH x kW`) and `self.bias` (Tensor of
size `nOutputPlane`). The corresponding gradients can be found in
`self.gradWeight` and `self.gradBias`.
-<a name="nn.VolumetricMaxPooling"/>
+<a name="nn.VolumetricMaxPooling"></a>
### VolumetricMaxPooling ###
```lua
@@ -619,7 +620,7 @@ Applies 3D max-pooling operation in `kTxkWxkH` regions by step size
`dTxdWxdH` steps. The number of output features is equal to the number of
input planes / dT.
-<a name="nn.VolumetricAveragePooling"/>
+<a name="nn.VolumetricAveragePooling"></a>
### VolumetricAveragePooling ###
```lua
diff --git a/doc/criterion.md b/doc/criterion.md
index 64e6d63..4f89338 100755
--- a/doc/criterion.md
+++ b/doc/criterion.md
@@ -1,4 +1,4 @@
-<a name="nn.Criterions"/>
+<a name="nn.Criterions"></a>
# Criterions #
[`Criterions`](#nn.Criterion) are helpful to train a neural network. Given an input and a
@@ -24,13 +24,13 @@ target, they compute a gradient according to a given loss function.
* [`ParallelCriterion`](#nn.ParallelCriterion) : a weighted sum of other criterions each applied to a different input and target;
* [`MarginRankingCriterion`](#nn.MarginRankingCriterion): ranks two inputs;
-<a name="nn.Criterion"/>
+<a name="nn.Criterion"></a>
## Criterion ##
This is an abstract class which declares methods defined in all criterions.
This class is [serializable](https://github.com/torch/torch7/blob/master/doc/file.md#serialization-methods).
-<a name="nn.Criterion.forward"/>
+<a name="nn.Criterion.forward"></a>
### [output] forward(input, target) ###
Given an `input` and a `target`, compute the loss function associated to the criterion and return the result.
@@ -41,7 +41,7 @@ The `output` returned should be a scalar in general.
The state variable [`self.output`](#nn.Criterion.output) should be updated after a call to `forward()`.
-<a name="nn.Criterion.backward"/>
+<a name="nn.Criterion.backward"></a>
### [gradInput] backward(input, target) ###
Given an `input` and a `target`, compute the gradients of the loss function associated to the criterion and return the result.
@@ -50,19 +50,19 @@ In general `input`, `target` and `gradInput` are [`Tensor`s](..:torch:tensor), b
The state variable [`self.gradInput`](#nn.Criterion.gradInput) should be updated after a call to `backward()`.
-<a name="nn.Criterion.output"/>
+<a name="nn.Criterion.output"></a>
### State variable: output ###
State variable which contains the result of the last [`forward(input, target)`](#nn.Criterion.forward) call.
-<a name="nn.Criterion.gradInput"/>
+<a name="nn.Criterion.gradInput"></a>
### State variable: gradInput ###
State variable which contains the result of the last [`backward(input, target)`](#nn.Criterion.backward) call.
-<a name="nn.AbsCriterion"/>
+<a name="nn.AbsCriterion"></a>
## AbsCriterion ##
```lua
@@ -85,7 +85,7 @@ criterion.sizeAverage = false
```
-<a name="nn.ClassNLLCriterion"/>
+<a name="nn.ClassNLLCriterion"></a>
## ClassNLLCriterion ##
```lua
@@ -128,7 +128,7 @@ end
```
-<a name="nn.CrossEntropyCriterion"/>
+<a name="nn.CrossEntropyCriterion"></a>
## CrossEntropyCriterion ##
```lua
@@ -157,7 +157,7 @@ loss(x, class) = weights[class] * (-x[class] + log(\sum_j exp(x[j])))
```
-<a name="nn.DistKLDivCriterion"/>
+<a name="nn.DistKLDivCriterion"></a>
## DistKLDivCriterion ##
```lua
@@ -177,7 +177,7 @@ loss(x, target) = \sum(target_i * (log(target_i) - x_i))
```
-<a name="nn.BCECriterion"/>
+<a name="nn.BCECriterion"></a>
## BCECriterion
```lua
@@ -193,7 +193,7 @@ loss(t, o) = -(t * log(o) + (1 - t) * log(1 - o))
This is used for measuring the error of a reconstruction in for example an auto-encoder.
-<a name="nn.MarginCriterion"/>
+<a name="nn.MarginCriterion"></a>
## MarginCriterion ##
```lua
@@ -256,7 +256,7 @@ gives the output:
i.e. the mlp successfully separates the two data points such that they both have a `margin` of `1`, and hence a loss of `0`.
-<a name="nn.MultiMarginCriterion"/>
+<a name="nn.MultiMarginCriterion"></a>
## MultiMarginCriterion ##
```lua
@@ -281,7 +281,7 @@ mlp:add(nn.MulConstant(-1)) -- distance to similarity
```
-<a name="nn.MultiLabelMarginCriterion"/>
+<a name="nn.MultiLabelMarginCriterion"></a>
## MultiLabelMarginCriterion ##
```lua
@@ -309,7 +309,7 @@ criterion:forward(input, target)
```
-<a name="nn.MSECriterion"/>
+<a name="nn.MSECriterion"></a>
## MSECriterion ##
```lua
@@ -333,7 +333,7 @@ criterion.sizeAverage = false
```
-<a name="nn.MultiCriterion"/>
+<a name="nn.MultiCriterion"></a>
## MultiCriterion ##
```lua
@@ -360,7 +360,7 @@ mc = nn.MultiCriterion():add(nll, 0.5):add(nll2)
output = mc:forward(input, target)
```
-<a name="nn.ParallelCriterion"/>
+<a name="nn.ParallelCriterion"></a>
## ParallelCriterion ##
```lua
@@ -390,7 +390,7 @@ output = pc:forward(input, target)
```
-<a name="nn.HingeEmbeddingCriterion"/>
+<a name="nn.HingeEmbeddingCriterion"></a>
## HingeEmbeddingCriterion ##
```lua
@@ -469,7 +469,7 @@ end
```
-<a name="nn.L1HingeEmbeddingCriterion"/>
+<a name="nn.L1HingeEmbeddingCriterion"></a>
## L1HingeEmbeddingCriterion ##
```lua
@@ -486,7 +486,7 @@ loss(x, y) = ⎨
The `margin` has a default value of `1`, or can be set in the constructor.
-<a name="nn.CosineEmbeddingCriterion"/>
+<a name="nn.CosineEmbeddingCriterion"></a>
## CosineEmbeddingCriterion ##
```lua
@@ -508,7 +508,7 @@ loss(x, y) = ⎨
```
-<a name="nn.MarginRankingCriterion"/>
+<a name="nn.MarginRankingCriterion"></a>
## MarginRankingCriterion ##
```lua
diff --git a/doc/index.md b/doc/index.md
new file mode 100644
index 0000000..5c36166
--- /dev/null
+++ b/doc/index.md
@@ -0,0 +1,23 @@
+[![Build Status](https://travis-ci.org/torch/nn.svg?branch=master)](https://travis-ci.org/torch/nn)
+<a name="nn.dok"></a>
+# Neural Network Package #
+
+This package provides an easy and modular way to build and train simple or complex neural networks using [Torch](https://github.com/torch/torch7/blob/master/README.md):
+
+ * Modules are the bricks used to build neural networks. Each are themselves neural networks, but can be combined with other networks using containers to create complex neural networks:
+ * [Module](module.md#nn.Module) : abstract class inherited by all modules;
+ * [Containers](containers.md#nn.Containers) : container classes like [Sequential](containers.md#nn.Sequential), [Parallel](containers.md#nn.Parallel) and [Concat](containers.md#nn.Concat);
+ * [Transfer functions](transfer.md#nn.transfer.dok) : non-linear functions like [Tanh](transfer.md#nn.Tanh) and [Sigmoid](transfer.md#nn.Sigmoid);
+ * [Simple layers](simple.md#nn.simplelayers.dok) : like [Linear](simple.md#nn.Linear), [Mean](simple.md#nn.Mean), [Max](simple.md#nn.Max) and [Reshape](simple.md#nn.Reshape);
+ * [Table layers](table.md#nn.TableLayers) : layers for manipulating tables like [SplitTable](table.md#nn.SplitTable), [ConcatTable](table.md#nn.ConcatTable) and [JoinTable](table.md#nn.JoinTable);
+ * [Convolution layers](convolution.md#nn.convlayers.dok) : [Temporal](convolution.md#nn.TemporalModules), [Spatial](convolution.md#nn.SpatialModules) and [Volumetric](convolution.md#nn.VolumetricModules) convolutions ;
+ * Criterions compute a gradient according to a given loss function given an input and a target:
+ * [Criterions](criterion.md#nn.Criterions) : a list of all criterions, including [Criterion](criterion.md#nn.Criterion), the abstract class;
+ * [MSECriterion](criterion.md#nn.MSECriterion) : the Mean Squared Error criterion used for regression;
+ * [ClassNLLCriterion](criterion.md#nn.ClassNLLCriterion) : the Negative Log Likelihood criterion used for classification;
+ * Additional documentation :
+ * [Overview](overview.md#nn.overview.dok) of the package essentials including modules, containers and training;
+ * [Training](training.md#nn.traningneuralnet.dok) : how to train a neural network using [StochasticGradient](training.md#nn.StochasticGradient);
+ * [Testing](testing.md) : how to test your modules.
+ * [Experimental Modules](https://github.com/clementfarabet/lua---nnx/blob/master/README.md) : a package containing experimental modules and criteria.
+
diff --git a/doc/module.md b/doc/module.md
index 50090c4..97e14a0 100755
--- a/doc/module.md
+++ b/doc/module.md
@@ -1,4 +1,4 @@
-<a name="nn.Module"/>
+<a name="nn.Module"></a>
## Module ##
`Module` is an abstract class which defines fundamental methods necessary
@@ -7,7 +7,7 @@ for a training a neural network. Modules are [serializable](https://github.com/t
Modules contain two states variables: [output](#output) and
[gradInput](#gradinput).
-<a name="nn.Module.forward"/>
+<a name="nn.Module.forward"></a>
### [output] forward(input) ###
Takes an `input` object, and computes the corresponding `output` of the
@@ -24,7 +24,7 @@ implement [updateOutput(input)](#nn.Module.updateOutput)
function. The forward module in the abstract parent class
[Module](#nn.Module) will call `updateOutput(input)`.
-<a name="nn.Module.backward"/>
+<a name="nn.Module.backward"></a>
### [gradInput] backward(input, gradOutput) ###
Performs a _backpropagation step_ through the module, with respect to the
@@ -52,14 +52,14 @@ is better to override
[accGradParameters(input, gradOutput,scale)](#nn.Module.accGradParameters)
functions.
-<a name="nn.Module.updateOutput"/>
+<a name="nn.Module.updateOutput"></a>
### updateOutput(input) ###
Computes the output using the current parameter set of the class and
input. This function returns the result which is stored in the
[output](#output) field.
-<a name="nn.Module.updateGradInput"/>
+<a name="nn.Module.updateGradInput"></a>
### updateGradInput(input, gradOutput) ###
Computing the gradient of the module with respect to its own
@@ -67,7 +67,7 @@ input. This is returned in `gradInput`. Also, the
[gradInput](#gradinput) state variable is updated
accordingly.
-<a name="nn.Module.accGradParameters"/>
+<a name="nn.Module.accGradParameters"></a>
### accGradParameters(input, gradOutput, scale) ###
Computing the gradient of the module with respect to its
@@ -83,7 +83,7 @@ Zeroing this accumulation is achieved with
the parameters according to this accumulation is done with
[updateParameters()](#nn.Module.updateParameters).
-<a name="nn.Module.zeroGradParameters"/>
+<a name="nn.Module.zeroGradParameters"></a>
### zeroGradParameters() ###
If the module has parameters, this will zero the accumulation of the
@@ -91,7 +91,7 @@ gradients with respect to these parameters, accumulated through
[accGradParameters(input, gradOutput,scale)](#nn.Module.accGradParameters)
calls. Otherwise, it does nothing.
-<a name="nn.Module.updateParameters"/>
+<a name="nn.Module.updateParameters"></a>
### updateParameters(learningRate) ###
If the module has parameters, this will update these parameters, according
@@ -104,7 +104,7 @@ parameters = parameters - learningRate * gradients_wrt_parameters
```
If the module does not have parameters, it does nothing.
-<a name="nn.Module.accUpdateGradParameters"/>
+<a name="nn.Module.accUpdateGradParameters"></a>
### accUpdateGradParameters(input, gradOutput, learningRate) ###
This is a convenience module that performs two functions at
@@ -136,7 +136,7 @@ As it can be seen, the gradients are accumulated directly into
weights. This assumption may not be true for a module that computes a
nonlinear operation.
-<a name="nn.Module.share"/>
+<a name="nn.Module.share"></a>
### share(mlp,s1,s2,...,sn) ###
This function modifies the parameters of the module named
@@ -174,7 +174,7 @@ print(mlp2:get(1).bias[1])
```
-<a name="nn.Module.clone"/>
+<a name="nn.Module.clone"></a>
### clone(mlp,...) ###
Creates a deep copy of (i.e. not just a pointer to) the module,
@@ -205,29 +205,29 @@ print(mlp2:get(1).bias[1])
```
-<a name="nn.Module.type"/>
+<a name="nn.Module.type"></a>
### type(type) ###
This function converts all the parameters of a module to the given
`type`. The `type` can be one of the types defined for
[torch.Tensor](https://github.com/torch/torch7/blob/master/doc/tensor.md).
-<a name="nn.Module.float"/>
+<a name="nn.Module.float"></a>
### float() ###
Convenience method for calling [module:type('torch.FloatTensor')](#nn.Module.type)
-<a name="nn.Module.double"/>
+<a name="nn.Module.double"></a>
### double() ###
Convenience method for calling [module:type('torch.DoubleTensor')](#nn.Module.type)
-<a name="nn.Module.cuda"/>
+<a name="nn.Module.cuda"></a>
### cuda() ###
Convenience method for calling [module:type('torch.CudaTensor')](#nn.Module.type)
-<a name="nn.statevars.dok"/>
+<a name="nn.statevars.dok"></a>
### State Variables ###
These state variables are useful objects if one wants to check the guts of
@@ -240,13 +240,13 @@ However, some special sub-classes
like [table layers](table.md#nn.TableLayers) contain something else. Please,
refer to each module specification for further information.
-<a name="nn.Module.output"/>
+<a name="nn.Module.output"></a>
#### output ####
This contains the output of the module, computed with the last call of
[forward(input)](#nn.Module.forward).
-<a name="nn.Module.gradInput"/>
+<a name="nn.Module.gradInput"></a>
#### gradInput ####
This contains the gradients with respect to the inputs of the module, computed with the last call of
@@ -258,7 +258,7 @@ Some modules contain parameters (the ones that we actually want to
train!). The name of these parameters, and gradients w.r.t these parameters
are module dependent.
-<a name="nn.Module.parameters"/>
+<a name="nn.Module.parameters"></a>
### [{weights}, {gradWeights}] parameters() ###
This function should returns two tables. One for the learnable
@@ -268,7 +268,7 @@ wrt to the learnable parameters `{gradWeights}`.
Custom modules should override this function if they use learnable
parameters that are stored in tensors.
-<a name="nn.Module.getParameters"/>
+<a name="nn.Module.getParameters"></a>
### [flatParameters, flatGradParameters] getParameters() ###
This function returns two tensors. One for the flattened learnable
@@ -279,15 +279,15 @@ Custom modules should not override this function. They should instead override [
This function will go over all the weights and gradWeights and make them view into a single tensor (one for weights and one for gradWeights). Since the storage of every weight and gradWeight is changed, this function should be called only once on a given network.
-<a name="nn.Module.training"/>
+<a name="nn.Module.training"></a>
### training() ###
This sets the mode of the Module (or sub-modules) to `train=true`. This is useful for modules like [Dropout](simple.md#nn.Dropout) that have a different behaviour during training vs evaluation.
-<a name="nn.Module.evaluate"/>
+<a name="nn.Module.evaluate"></a>
### evaluate() ###
This sets the mode of the Module (or sub-modules) to `train=false`. This is useful for modules like [Dropout](simple.md#nn.Dropout) that have a different behaviour during training vs evaluation.
-<a name="nn.Module.findModules"/>
+<a name="nn.Module.findModules"></a>
### findModules(typename) ###
Find all instances of modules in the network of a certain `typename`. It returns a flattened list of the matching nodes, as well as a flattened list of the container modules for each matching node.
@@ -331,7 +331,7 @@ for i = 1, #threshold_nodes do
end
```
-<a name="nn.Module.listModules"/>
+<a name="nn.Module.listModules"></a>
### listModules() ###
List all Modules instances in a network. Returns a flattened list of modules,
diff --git a/doc/overview.md b/doc/overview.md
index c9eedae..6aec321 100644
--- a/doc/overview.md
+++ b/doc/overview.md
@@ -1,4 +1,4 @@
-<a name="nn.overview.dok"/>
+<a name="nn.overview.dok"></a>
# Overview #
Each module of a network is composed of [Modules](module.md#nn.Modules) and there
@@ -23,31 +23,35 @@ easy with a simple for loop to [train a neural network yourself](training.md#nn.
## Detailed Overview ##
This section provides a detailed overview of the neural network package. First the omnipresent [Module](#nn.overview.module) is examined, followed by some examples for [combining modules](#nn.overview.plugandplay) together. The last part explores facilities for [training a neural network](#nn.overview.training).
-<a name="nn.overview.module"/>
+<a name="nn.overview.module"></a>
### Module ###
A neural network is called a [Module](module.md#nn.Module) (or simply
_module_ in this documentation) in Torch. `Module` is an abstract
class which defines four main methods:
+
* [forward(input)](module.md#nn.Module.forward) which computes the output of the module given the `input` [Tensor](https://github.com/torch/torch7/blob/master/doc/tensor.md).
* [backward(input, gradOutput)](module.md#nn.Module.backward) which computes the gradients of the module with respect to its own parameters, and its own inputs.
* [zeroGradParameters()](module.md#nn.Module.zeroGradParameters) which zeroes the gradient with respect to the parameters of the module.
* [updateParameters(learningRate)](module.md#nn.Module.updateParameters) which updates the parameters after one has computed the gradients with `backward()`
It also declares two members:
+
* [output](module.md#nn.Module.output) which is the output returned by `forward()`.
* [gradInput](module.md#nn.Module.gradInput) which contains the gradients with respect to the input of the module, computed in a `backward()`.
Two other perhaps less used but handy methods are also defined:
+
* [share(mlp,s1,s2,...,sn)](module.md#nn.Module.share) which makes this module share the parameters s1,..sn of the module `mlp`. This is useful if you want to have modules that share the same weights.
* [clone(...)](module.md#nn.Module.clone) which produces a deep copy of (i.e. not just a pointer to) this Module, including the current state of its parameters (if any).
Some important remarks:
+
* `output` contains only valid values after a [forward(input)](module.md#nn.Module.forward).
* `gradInput` contains only valid values after a [backward(input, gradOutput)](module.md#nn.Module.backward).
* [backward(input, gradOutput)](module.md#nn.Module.backward) uses certain computations obtained during [forward(input)](module.md#nn.Module.forward). You _must_ call `forward()` before calling a `backward()`, on the _same_ `input`, or your gradients are going to be incorrect!
-<a name="nn.overview.plugandplay"/>
+<a name="nn.overview.plugandplay"></a>
### Plug and play ###
Building a simple neural network can be achieved by constructing an available layer.
@@ -75,7 +79,7 @@ Of course, `Sequential` and `Concat` can contains other
networks you ever dreamt of! See the [[#nn.Modules|complete list of
available modules]].
-<a name="nn.overview.training"/>
+<a name="nn.overview.training"></a>
### Training a neural network ###
Once you built your neural network, you have to choose a particular
@@ -114,7 +118,7 @@ are implemented. [See an example](containers.md#nn.DoItStochasticGradient).
to cut-and-paste it and create a variant to it adapted to your needs
(if the constraints of `StochasticGradient` do not satisfy you).
-<a name="nn.overview.lowlevel"/>
+<a name="nn.overview.lowlevel"></a>
#### Low Level Training ####
If you want to program the `StochasticGradient` by hand, you
diff --git a/doc/simple.md b/doc/simple.md
index 6ef7ed2..bc4881b 100755
--- a/doc/simple.md
+++ b/doc/simple.md
@@ -1,6 +1,7 @@
-<a name="nn.simplelayers.dok"/>
+<a name="nn.simplelayers.dok"></a>
# Simple layers #
Simple Modules are used for various tasks like adapting Tensor methods and providing affine transformations :
+
* Parameterized Modules :
* [Linear](#nn.Linear) : a linear transformation ;
* [SparseLinear](#nn.SparseLinear) : a linear transformation with sparse inputs ;
@@ -36,7 +37,7 @@ Simple Modules are used for various tasks like adapting Tensor methods and provi
* [Padding](#nn.Padding) : adds padding to a dimension ;
* [L1Penalty](#nn.L1Penalty) : adds an L1 penalty to an input (for sparsity);
-<a name="nn.Linear"/>
+<a name="nn.Linear"></a>
## Linear ##
```lua
@@ -79,7 +80,7 @@ x = torch.Tensor(10) -- 10 inputs
y = module:forward(x)
```
-<a name="nn.SparseLinear"/>
+<a name="nn.SparseLinear"></a>
## SparseLinear ##
```lua
@@ -113,7 +114,7 @@ x = torch.Tensor({ {1, 0.1}, {2, 0.3}, {10, 0.3}, {31, 0.2} })
The first column contains indices, the second column contains values in a a vector where all other elements are zeros. The indices should not exceed the stated dimensions of the input to the layer (10000 in the example).
-<a name="nn.Dropout"/>
+<a name="nn.Dropout"></a>
## Dropout ##
```lua
@@ -183,7 +184,7 @@ We can return to training our model by first calling [Module:training()](module.
When used, `Dropout` should normally be applied to the input of parameterized [Modules](module.md#nn.Module) like [Linear](#nn.Linear) or [SpatialConvolution](convolution.md#nn.SpatialConvolution). A `p` of `0.5` (the default) is usually okay for hidden layers. `Dropout` can sometimes be used successfully on the dataset inputs with a `p` around `0.2`. It sometimes works best following [Transfer](transfer.md) Modules like [ReLU](transfer.md#nn.ReLU). All this depends a great deal on the dataset so its up to the user to try different combinations.
-<a name="nn.SpatialDropout"/>
+<a name="nn.SpatialDropout"></a>
## SpatialDropout ##
`module` = `nn.SpatialDropout(p)`
@@ -194,7 +195,7 @@ As described in the paper "Efficient Object Localization Using Convolutional Net
```nn.SpatialDropout``` accepts 3D or 4D inputs. If the input is 3D than a layout of (features x height x width) is assumed and for 4D (batch x features x height x width) is assumed.
-<a name="nn.Abs"/>
+<a name="nn.Abs"></a>
## Abs ##
```lua
@@ -214,7 +215,7 @@ gnuplot.grid(true)
![](image/abs.png)
-<a name='nn.Add'/>
+<a name='nn.Add'></a>
## Add ##
```lua
@@ -264,7 +265,7 @@ gives the output:
i.e. the network successfully learns the input `x` has been shifted to produce the output `y`.
-<a name="nn.Mul"/>
+<a name="nn.Mul"></a>
## Mul ##
```lua
@@ -309,7 +310,7 @@ gives the output:
i.e. the network successfully learns the input `x` has been scaled by pi.
-<a name='nn.CMul'/>
+<a name='nn.CMul'></a>
## CMul ##
```lua
@@ -362,7 +363,7 @@ gives the output:
i.e. the network successfully learns the input `x` has been scaled by those scaling factors to produce the output `y`.
-<a name="nn.Max"/>
+<a name="nn.Max"></a>
## Max ##
```lua
@@ -373,7 +374,7 @@ Applies a max operation over dimension `dimension`.
Hence, if an `nxpxq` Tensor was given as input, and `dimension` = `2` then an `nxq` matrix would be output.
-<a name="nn.Min"/>
+<a name="nn.Min"></a>
## Min ##
```lua
@@ -384,7 +385,7 @@ Applies a min operation over dimension `dimension`.
Hence, if an `nxpxq` Tensor was given as input, and `dimension` = `2` then an `nxq` matrix would be output.
-<a name="nn.Mean"/>
+<a name="nn.Mean"></a>
## Mean ##
```lua
@@ -394,7 +395,7 @@ module = nn.Mean(dimension)
Applies a mean operation over dimension `dimension`.
Hence, if an `nxpxq` Tensor was given as input, and `dimension` = `2` then an `nxq` matrix would be output.
-<a name="nn.Sum"/>
+<a name="nn.Sum"></a>
## Sum ##
```lua
@@ -405,7 +406,7 @@ Applies a sum operation over dimension `dimension`.
Hence, if an `nxpxq` Tensor was given as input, and `dimension` = `2` then an `nxq` matrix would be output.
-<a name="nn.Euclidean"/>
+<a name="nn.Euclidean"></a>
## Euclidean ##
```lua
@@ -416,7 +417,7 @@ Outputs the Euclidean distance of the input to `outputSize` centers, i.e. this l
The distance `y_j` between center `j` and input `x` is formulated as `y_j = || w_j - x ||`.
-<a name="nn.WeightedEuclidean"/>
+<a name="nn.WeightedEuclidean"></a>
## WeightedEuclidean ##
```lua
@@ -429,7 +430,7 @@ In other words, for each of the `outputSize` centers `w_j`, there is a diagonal
The distance `y_j` between center `j` and input `x` is formulated as `y_j = || c_j * (w_j - x) ||`.
-<a name="nn.Identity"/>
+<a name="nn.Identity"></a>
## Identity ##
```lua
@@ -488,7 +489,7 @@ for i = 1, 100 do -- Do a few training iterations
end
```
-<a name="nn.Copy"/>
+<a name="nn.Copy"></a>
## Copy ##
```lua
@@ -498,7 +499,7 @@ module = nn.Copy(inputType, outputType, [forceCopy, dontCast])
This layer copies the input to output with type casting from input type from `inputType` to `outputType`. Unless `forceCopy` is true, when the first two arguments are the same, the input isn't copied, only transfered as the output. The default `forceCopy` is false.
When `dontCast` is true, a call to `nn.Copy:type(type)` will not cast the module's `output` and `gradInput` Tensors to the new type. The default is false.
-<a name="nn.Narrow"/>
+<a name="nn.Narrow"></a>
## Narrow ##
```lua
@@ -507,7 +508,7 @@ module = nn.Narrow(dimension, offset, length)
Narrow is application of [narrow](https://github.com/torch/torch7/blob/master/doc/tensor.md#tensor-narrowdim-index-size) operation in a module.
-<a name="nn.Replicate"/>
+<a name="nn.Replicate"></a>
## Replicate ##
```lua
@@ -552,7 +553,7 @@ This allows the module to replicate the same non-batch dimension `dim` for both
```
-<a name="nn.Reshape"/>
+<a name="nn.Reshape"></a>
## Reshape ##
```lua
@@ -640,7 +641,7 @@ Example:
```
-<a name="nn.View"/>
+<a name="nn.View"></a>
## View ##
```lua
@@ -723,7 +724,7 @@ Example 2:
[torch.LongStorage of size 2]
```
-<a name="nn.Select"/>
+<a name="nn.Select"></a>
## Select ##
```lua
@@ -798,7 +799,7 @@ for i = 1, 10000 do -- Train for a few iterations
end
```
-<a name="nn.Exp"/>
+<a name="nn.Exp"></a>
## Exp ##
```lua
@@ -820,7 +821,7 @@ gnuplot.grid(true)
![](image/exp.png)
-<a name="nn.Square"/>
+<a name="nn.Square"></a>
## Square ##
```lua
@@ -842,7 +843,7 @@ gnuplot.grid(true)
![](image/square.png)
-<a name="nn.Sqrt"/>
+<a name="nn.Sqrt"></a>
## Sqrt ##
```lua
@@ -864,7 +865,7 @@ gnuplot.grid(true)
![](image/sqrt.png)
-<a name="nn.Power"/>
+<a name="nn.Power"></a>
## Power ##
```lua
@@ -886,7 +887,7 @@ gnuplot.grid(true)
![](image/power.png)
-<a name="nn.MM"/>
+<a name="nn.MM"></a>
## MM ##
```lua
@@ -905,7 +906,7 @@ C = model.forward({A, B}) -- C will be of size `b x m x n`
```
-<a name="nn.BatchNormalization"/>
+<a name="nn.BatchNormalization"></a>
## BatchNormalization ##
```lua
@@ -945,7 +946,7 @@ A = torch.randn(b, m)
C = model.forward(A) -- C will be of size `b x m`
```
-<a name="nn.Padding"/>
+<a name="nn.Padding"></a>
## Padding ##
`module` = `nn.Padding(dim, pad [, nInputDim, value])`
@@ -978,7 +979,7 @@ module:forward(torch.randn(2, 3)) --batch input
```
-<a name="nn.L1Penalty"/>
+<a name="nn.L1Penalty"></a>
## L1Penalty ##
```lua
diff --git a/doc/table.md b/doc/table.md
index 91ea209..221e4c3 100755
--- a/doc/table.md
+++ b/doc/table.md
@@ -1,8 +1,9 @@
-<a name="nn.TableLayers"/>
+<a name="nn.TableLayers"></a>
# Table Layers #
This set of modules allows the manipulation of `table`s through the layers of a neural network.
This allows one to build very rich architectures:
+
* `table` Container Modules encapsulate sub-Modules:
* [`ConcatTable`](#nn.ConcatTable): applies each member module to the same input [`Tensor`](https://github.com/torch/torch7/blob/master/doc/tensor.md#tensor) and outputs a `table`;
* [`ParallelTable`](#nn.ParallelTable): applies the `i`-th member module to the `i`-th input and outputs a `table`;
@@ -35,7 +36,7 @@ pred = mlp:forward(t)
pred = mlp:forward{x, y, z} -- This is equivalent to the line before
```
-<a name="nn.ConcatTable"/>
+<a name="nn.ConcatTable"></a>
## ConcatTable ##
```lua
@@ -115,7 +116,7 @@ which gives the output (using [th](https://github.com/torch/trepl)):
```
-<a name="nn.ParallelTable"/>
+<a name="nn.ParallelTable"></a>
## ParallelTable ##
```lua
@@ -164,7 +165,7 @@ which gives the output:
```
-<a name="nn.SplitTable"/>
+<a name="nn.SplitTable"></a>
## SplitTable ##
```lua
@@ -399,7 +400,7 @@ end
```
-<a name="nn.JoinTable"/>
+<a name="nn.JoinTable"></a>
## JoinTable ##
```lua
@@ -534,7 +535,7 @@ end
```
-<a name='nn.MixtureTable'/>
+<a name='nn.MixtureTable'></a>
## MixtureTable ##
`module` = `MixtureTable([dim])`
@@ -632,7 +633,7 @@ Forwarding a batch of 2 examples gives us something like this:
```
-<a name="nn.SelectTable"/>
+<a name="nn.SelectTable"></a>
## SelectTable ##
`module` = `SelectTable(index)`
@@ -725,7 +726,7 @@ Example 2:
```
-<a name="nn.NarrowTable"/>
+<a name="nn.NarrowTable"></a>
## NarrowTable ##
`module` = `NarrowTable(offset [, length])`
@@ -765,7 +766,7 @@ Example:
```
-<a name="nn.FlattenTable"/>
+<a name="nn.FlattenTable"></a>
## FlattenTable ##
`module` = `FlattenTable()`
@@ -802,7 +803,7 @@ gives the output:
}
```
-<a name="nn.PairwiseDistance"/>
+<a name="nn.PairwiseDistance"></a>
## PairwiseDistance ##
`module` = `PairwiseDistance(p)` creates a module that takes a `table` of two vectors as input and outputs the distance between them using the `p`-norm.
@@ -885,7 +886,7 @@ end
```
-<a name="nn.DotProduct"/>
+<a name="nn.DotProduct"></a>
## DotProduct ##
`module` = `DotProduct()` creates a module that takes a `table` of two vectors as input and outputs the dot product between them.
@@ -978,7 +979,7 @@ end
```
-<a name="nn.CosineDistance"/>
+<a name="nn.CosineDistance"></a>
## CosineDistance ##
`module` = `CosineDistance()` creates a module that takes a `table` of two vectors (or matrices if in batch mode) as input and outputs the cosine distance between them.
@@ -1065,7 +1066,7 @@ end
-<a name="nn.CriterionTable"/>
+<a name="nn.CriterionTable"></a>
## CriterionTable ##
`module` = `CriterionTable(criterion)`
@@ -1115,7 +1116,7 @@ for i = 1, 20 do -- Train for a few iterations
end
```
-<a name="nn.CAddTable"/>
+<a name="nn.CAddTable"></a>
## CAddTable ##
Takes a `table` of `Tensor`s and outputs summation of all `Tensor`s.
@@ -1157,7 +1158,7 @@ m = nn.CAddTable()
```
-<a name="nn.CSubTable"/>
+<a name="nn.CSubTable"></a>
## CSubTable ##
Takes a `table` with two `Tensor` and returns the component-wise
@@ -1174,7 +1175,7 @@ m = nn.CSubTable()
[torch.DoubleTensor of dimension 5]
```
-<a name="nn.CMulTable"/>
+<a name="nn.CMulTable"></a>
## CMulTable ##
Takes a `table` of `Tensor`s and outputs the multiplication of all of them.
@@ -1192,7 +1193,7 @@ m = nn.CMulTable()
```
-<a name="nn.CDivTable"/>
+<a name="nn.CDivTable"></a>
## CDivTable ##
Takes a `table` with two `Tensor` and returns the component-wise
diff --git a/doc/training.md b/doc/training.md
index 016c7c1..1a126d3 100644
--- a/doc/training.md
+++ b/doc/training.md
@@ -1,4 +1,4 @@
-<a name="nn.traningneuralnet.dok"/>
+<a name="nn.traningneuralnet.dok"></a>
# Training a neural network #
Training a neural network is easy with a [simple `for` loop](#nn.DoItYourself).
@@ -7,19 +7,19 @@ want sometimes a quick way of training neural
networks. [StochasticGradient](#nn.StochasticGradient), a simple class
which does the job for you is provided as standard.
-<a name="nn.StochasticGradient.dok"/>
+<a name="nn.StochasticGradient.dok"></a>
## StochasticGradient ##
`StochasticGradient` is a high-level class for training [neural networks](#nn.Module), using a stochastic gradient
algorithm. This class is [serializable](https://github.com/torch/torch7/blob/master/doc/serialization.md#serialization).
-<a name="nn.StochasticGradient"/>
+<a name="nn.StochasticGradient"></a>
### StochasticGradient(module, criterion) ###
Create a `StochasticGradient` class, using the given [Module](module.md#nn.Module) and [Criterion](criterion.md#nn.Criterion).
The class contains [several parameters](#nn.StochasticGradientParameters) you might want to set after initialization.
-<a name="nn.StochasticGradientTrain"/>
+<a name="nn.StochasticGradientTrain"></a>
### train(dataset) ###
Train the module and criterion given in the
@@ -42,7 +42,7 @@ Such a dataset is easily constructed by using Lua tables, but it could any `C` o
for example, as long as required operators/methods are implemented.
[See an example](#nn.DoItStochasticGradient).
-<a name="nn.StochasticGradientParameters"/>
+<a name="nn.StochasticGradientParameters"></a>
### Parameters ###
`StochasticGradient` has several field which have an impact on a call to [train()](#nn.StochasticGradientTrain).
@@ -54,7 +54,7 @@ for example, as long as required operators/methods are implemented.
* `hookExample`: A possible hook function which will be called (if non-nil) during training after each example forwarded and backwarded through the network. The function takes `(self, example)` as parameters. Default is `nil`.
* `hookIteration`: A possible hook function which will be called (if non-nil) during training after a complete pass over the dataset. The function takes `(self, iteration)` as parameters. Default is `nil`.
-<a name="nn.DoItStochasticGradient"/>
+<a name="nn.DoItStochasticGradient"></a>
## Example of training using StochasticGradient ##
We show an example here on a classical XOR problem.
@@ -134,7 +134,7 @@ You should see something like:
[torch.Tensor of dimension 1]
```
-<a name="nn.DoItYourself"/>
+<a name="nn.DoItYourself"></a>
## Example of manual training of a neural network ##
We show an example here on a classical XOR problem.
diff --git a/doc/transfer.md b/doc/transfer.md
index c03017d..6b3be00 100755
--- a/doc/transfer.md
+++ b/doc/transfer.md
@@ -1,8 +1,8 @@
-<a name="nn.transfer.dok"/>
+<a name="nn.transfer.dok"></a>
# Transfer Function Layers #
Transfer functions are normally used to introduce a non-linearity after a parameterized layer like [Linear](simple.md#nn.Linear) and [SpatialConvolution](convolution.md#nn.SpatialConvolution). Non-linearities allows for dividing the problem space into more complex regions than what a simple logistic regressor would permit.
-<a name="nn.HardTanh"/>
+<a name="nn.HardTanh"></a>
## HardTanh ##
Applies the `HardTanh` function element-wise to the input Tensor,
@@ -26,7 +26,7 @@ gnuplot.grid(true)
![](image/htanh.png)
-<a name="nn.HardShrink"/>
+<a name="nn.HardShrink"></a>
## HardShrink ##
`module = nn.HardShrink(lambda)`
@@ -51,7 +51,7 @@ gnuplot.grid(true)
```
![](image/hshrink.png)
-<a name="nn.SoftShrink"/>
+<a name="nn.SoftShrink"></a>
## SoftShrink ##
`module = nn.SoftShrink(lambda)`
@@ -77,7 +77,7 @@ gnuplot.grid(true)
![](image/sshrink.png)
-<a name="nn.SoftMax"/>
+<a name="nn.SoftMax"></a>
## SoftMax ##
Applies the `Softmax` function to an n-dimensional input Tensor,
@@ -99,7 +99,7 @@ gnuplot.grid(true)
Note that this module doesn't work directly with [ClassNLLCriterion](criterion.md#nn.ClassNLLCriterion), which expects the `nn.Log` to be computed between the `SoftMax` and itself. Use [LogSoftMax](#nn.LogSoftMax) instead (it's faster).
-<a name="nn.SoftMin"/>
+<a name="nn.SoftMin"></a>
## SoftMin ##
Applies the `Softmin` function to an n-dimensional input Tensor,
@@ -119,7 +119,7 @@ gnuplot.grid(true)
```
![](image/softmin.png)
-<a name="nn.SoftPlus"/>
+<a name="nn.SoftPlus"></a>
### SoftPlus ###
Applies the `SoftPlus` function to an n-dimensioanl input Tensor.
@@ -138,7 +138,7 @@ gnuplot.grid(true)
```
![](image/softplus.png)
-<a name="nn.SoftSign"/>
+<a name="nn.SoftSign"></a>
## SoftSign ##
Applies the `SoftSign` function to an n-dimensioanl input Tensor.
@@ -156,7 +156,7 @@ gnuplot.grid(true)
```
![](image/softsign.png)
-<a name="nn.LogSigmoid"/>
+<a name="nn.LogSigmoid"></a>
## LogSigmoid ##
Applies the `LogSigmoid` function to an n-dimensional input Tensor.
@@ -176,7 +176,7 @@ gnuplot.grid(true)
![](image/logsigmoid.png)
-<a name="nn.LogSoftMax"/>
+<a name="nn.LogSoftMax"></a>
## LogSoftMax ##
Applies the `LogSoftmax` function to an n-dimensional input Tensor.
@@ -195,7 +195,7 @@ gnuplot.grid(true)
```
![](image/logsoftmax.png)
-<a name="nn.Sigmoid"/>
+<a name="nn.Sigmoid"></a>
## Sigmoid ##
Applies the `Sigmoid` function element-wise to the input Tensor,
@@ -214,7 +214,7 @@ gnuplot.grid(true)
```
![](image/sigmoid.png)
-<a name="nn.Tanh"/>
+<a name="nn.Tanh"></a>
## Tanh ##
Applies the `Tanh` function element-wise to the input Tensor,
@@ -231,7 +231,7 @@ gnuplot.grid(true)
```
![](image/tanh.png)
-<a name="nn.ReLU"/>
+<a name="nn.ReLU"></a>
## ReLU ##
Applies the rectified linear unit (`ReLU`) function element-wise to the input Tensor,
@@ -253,7 +253,7 @@ gnuplot.grid(true)
```
![](image/relu.png)
-<a name="nn.PReLU"/>
+<a name="nn.PReLU"></a>
## PReLU ##
Applies parametric ReLU, which parameter varies the slope of the negative part:
@@ -267,7 +267,7 @@ Note that weight decay should not be used on it. For reference see http://arxiv.
![](image/prelu.png)
-<a name="nn.AddConstant"/>
+<a name="nn.AddConstant"></a>
## AddConstant ##
Adds a (non-learnable) scalar constant. This module is sometimes useful for debuggging purposes: `f(x)` = `x + k`, where `k` is a scalar.
@@ -278,7 +278,7 @@ m=nn.AddConstant(k,true) -- true = in-place, false = keeping separate state.
```
In-place mode restores the original input value after the backward pass, allowing it's use after other in-place modules, like [MulConstant](#nn.MulConstant).
-<a name="nn.MulConstant"/>
+<a name="nn.MulConstant"></a>
## MulConstant ##
Multiplies input tensor by a (non-learnable) scalar constant. This module is sometimes useful for debuggging purposes: `f(x)` = `k * x`, where `k` is a scalar.
diff --git a/mkdocs.yml b/mkdocs.yml
new file mode 100644
index 0000000..f38456d
--- /dev/null
+++ b/mkdocs.yml
@@ -0,0 +1,18 @@
+site_name: nn
+theme : simplex
+repo_url : https://github.com/torch/nn
+use_directory_urls : false
+markdown_extensions: [extra]
+docs_dir : doc
+pages:
+- [index.md, Home]
+- [module.md, Modules, Module Interface]
+- [containers.md, Modules, Containers]
+- [transfer.md, Modules, Transfer Functions]
+- [simple.md, Modules, Simple Layers]
+- [table.md, Modules, Table Layers]
+- [convolution.md, Modules, Convolution Layers]
+- [criterion.md, Criterion, Criterions]
+- [overview.md, Additional Documentation, Overview]
+- [training.md, Additional Documentation, Training]
+- [testing.md, Additional Documentation, Testing]
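The new mkdocs.yml above wires each page under `doc/` into the MkDocs/ReadTheDocs navigation (`docs_dir : doc`, one `[file, section, title]` entry per page). As a quick sanity check, not part of this commit, a short Python sketch like the following could confirm that every page listed in the config actually exists in the configured docs directory; it assumes PyYAML is installed and that it is run from the repository root.

```python
# Hedged sketch: verify the pages referenced in mkdocs.yml exist under docs_dir.
# Assumptions (not from the commit): PyYAML is available, run from the repo root.
import os
import yaml

with open("mkdocs.yml") as f:
    config = yaml.safe_load(f)

docs_dir = config.get("docs_dir", "docs")
for entry in config.get("pages", []):
    page = entry[0]  # e.g. [index.md, Home] -> "index.md"
    path = os.path.join(docs_dir, page)
    status = "ok" if os.path.exists(path) else "MISSING"
    print(f"{status:8s} {path}")
```

Running such a check against this commit would, for example, flag `testing.md`, which is referenced in the navigation but not added by the diffstat above.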