numpy - Error in backpropagation Python neural net
The darn thing just won't learn; the weights seem to become NaN.
I haven't played with different numbers of hidden layers/inputs/outputs, but the bug appears consistent across different sizes of hidden layer.
from __future__ import division
import numpy
import matplotlib.pyplot
import random


class net:
    def __init__(self, *sizes):
        sizes = list(sizes)
        sizes[0] += 1
        self.sizes = sizes
        self.weights = [numpy.random.uniform(-1, 1, (sizes[i+1], sizes[i]))
                        for i in range(len(sizes)-1)]

    @staticmethod
    def activate(x):
        return 1/(1+numpy.exp(-x))

    def y(self, x_):
        x = numpy.concatenate(([1], numpy.atleast_1d(x_.copy())))
        o = [x]  # o[i] is the (activated) output of hidden layer i, "hidden layer 0" is the inputs
        for weight in self.weights[:-1]:
            x = weight.dot(x)
            x = net.activate(x)
            o.append(x)
        o.append(self.weights[-1].dot(x))
        return o

    def __call__(self, x):
        return self.y(x)[-1]

    def delta(self, x, t):
        o = self.y(x)
        delta = [(o[-1]-t) * o[-1] * (1-o[-1])]
        for i, weight in enumerate(reversed(self.weights)):
            delta.append(weight.T.dot(delta[-1]) * o[-i-2] * (1-o[-i-2]))
        delta.reverse()
        return o, delta

    def train(self, inputs, outputs, epochs=100, rate=.1):
        for epoch in range(epochs):
            pairs = zip(inputs, outputs)
            random.shuffle(pairs)
            for x, t in pairs:  # shuffle? subset?
                o, d = self.delta(x, t)
                for layer in range(len(self.sizes)-1):
                    self.weights[layer] -= rate * numpy.outer(o[layer+1], d[layer])


n = net(1, 4, 1)
x = numpy.linspace(0, 2*3.14, 10)
t = numpy.sin(x)
matplotlib.pyplot.plot(x, t, 'g')
matplotlib.pyplot.plot(x, map(n, x), 'r')
n.train(x, t)
print n.weights
matplotlib.pyplot.plot(x, map(n, x), 'b')
matplotlib.pyplot.show()
I haven't looked for the particular bug in your code, but could you please try the following things to narrow down the problem further? Otherwise it is tedious to find the needle in the haystack.
1) Please try using a real dataset so you have an idea of what to expect, e.g., MNIST, and/or standardize your data, because your weights may become NaN if they become too small. A standardization sketch follows below.
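A minimal sketch of what I mean by standardization (my own addition, reusing the question's toy data; not part of the original code):

import numpy

x = numpy.linspace(0, 2*3.14, 10)
x_std = (x - x.mean()) / x.std()   # rescale inputs to zero mean and unit variance
t = numpy.sin(x)                   # targets are already in [-1, 1], so they can stay as-is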
2) Try different learning rates and plot the cost function vs. epochs to check whether you are converging; the curve should decrease steadily (note that I used minibatch learning and averaged over the minibatch chunks for each epoch). A rough sketch of how you could record the cost per epoch is shown below.
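This is a hypothetical helper of my own (the function name and structure are not from the question's code); it wraps the question's train() to collect a sum-of-squares cost per epoch:

import numpy
import matplotlib.pyplot

def train_with_history(model, inputs, outputs, epochs=100, rate=.1):
    history = []
    for epoch in range(epochs):
        model.train(inputs, outputs, epochs=1, rate=rate)  # one pass of the question's train()
        cost = sum(0.5 * numpy.sum((model(x) - t)**2) for x, t in zip(inputs, outputs))
        history.append(cost)
    return history

# history = train_with_history(n, x, t)
# matplotlib.pyplot.plot(range(len(history)), history)
# matplotlib.pyplot.xlabel('epoch')
# matplotlib.pyplot.ylabel('cost')
# matplotlib.pyplot.show()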
3) I see that you are using a sigmoid activation. Your implementation is correct, but to make it numerically more stable, replace 1.0 / (1.0 + np.exp(-z)) with expit(z) from scipy.special (the same function, but more efficient).
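For example (a small sketch of my own to illustrate the swap):

import numpy
from scipy.special import expit

z = numpy.array([-1000., 0., 1000.])
print(1.0 / (1.0 + numpy.exp(-z)))  # raises an overflow warning for the large negative entry
print(expit(z))                     # numerically stable, same values: [0., 0.5, 1.]

In the question's class, this would amount to returning expit(x) from net.activate.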
4) Implement gradient checking. Here, you compare the analytical gradient to a numerically approximated gradient, e.g., (J(w + ε) − J(w)) / ε.

An even better approach, which yields a more accurate approximation of the gradient, is to compute the symmetric (or centered) difference quotient given by the two-point formula: (J(w + ε) − J(w − ε)) / (2ε). A sketch is given below.
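A hypothetical sketch of gradient checking with the centered difference quotient (function and variable names are mine, not from the question's code):

import numpy

def numerical_gradient(cost, weights, layer, i, j, eps=1e-5):
    # cost: a function mapping the list of weight matrices to a scalar cost J
    # perturb a single weight up and down and apply the two-point formula
    w_plus = [w.copy() for w in weights]
    w_minus = [w.copy() for w in weights]
    w_plus[layer][i, j] += eps
    w_minus[layer][i, j] -= eps
    return (cost(w_plus) - cost(w_minus)) / (2 * eps)

If this numerical estimate agrees with the analytical gradient from your delta() to several decimal places, the backpropagation code is probably correct; if not, that is where the bug lives.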
PS: If you are interested and find it useful, I have a working vanilla NumPy neural net implemented here.