
github.com/torch/optim.git
author    Andreas Köpf <andreas.koepf@xamla.com>  2016-06-09 00:14:18 +0300
committer Andreas Köpf <andreas.koepf@xamla.com>  2016-06-09 00:14:18 +0300
commit    9c08fde975c5998cc25d5ebf265754486dd5c160 (patch)
tree      616b93ed411a9b5e96237f95c18b152180c26e8e
parent    6759dc8a210b1f93184a23bda9c4ca5eb8c2b71a (diff)
Init rmsprop mean square state 'm' with 1 instead of 0
With alpha near 1 (e.g. the default value 0.99), the mean-square accumulator stays well below 1 during the first few iterations, so the gradient was effectively scaled up by dividing by a number < 1. With the original implementation, the learning rate therefore had to be set much smaller for rmsprop than for plain-vanilla sgd in order not to diverge.
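The effect described above can be sketched numerically. This is an illustrative NumPy version of the first RMSprop update, not the library's Lua/Torch code; alpha mirrors optim's 0.99 default, while the epsilon value and the helper name `first_step_scale` are assumptions for this sketch.

```python
import numpy as np

def first_step_scale(m0, grad, alpha=0.99, eps=1e-8):
    """Factor by which the gradient is multiplied after one update of the
    mean-square accumulator m, since the step divides by sqrt(m) + eps.
    alpha follows optim's default 0.99; eps = 1e-8 is assumed here."""
    m = alpha * m0 + (1 - alpha) * grad**2
    return 1.0 / (np.sqrt(m) + eps)

g = 0.1  # a small, constant gradient for illustration
print(first_step_scale(0.0, g))  # m initialized to 0: update blown up by ~100x
print(first_step_scale(1.0, g))  # m initialized to 1: scale stays close to 1
```

With `m = 0`, the first accumulator value is `(1 - 0.99) * 0.1**2 = 1e-4`, so the division by `sqrt(1e-4) = 0.01` magnifies the step roughly 100-fold; starting from `m = 1` keeps the divisor near 1, which is why the patch lets rmsprop use learning rates comparable to sgd.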
-rw-r--r--  rmsprop.lua  2
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/rmsprop.lua b/rmsprop.lua
index 8947b18..038af21 100644
--- a/rmsprop.lua
+++ b/rmsprop.lua
@@ -40,7 +40,7 @@ function optim.rmsprop(opfunc, x, config, state)
    -- (3) initialize mean square values and square gradient storage
    if not state.m then
-      state.m = torch.Tensor():typeAs(x):resizeAs(dfdx):zero()
+      state.m = torch.Tensor():typeAs(x):resizeAs(dfdx):fill(1)
       state.tmp = torch.Tensor():typeAs(x):resizeAs(dfdx)
    end