numpy - Error in backpropagation Python neural net


The darn thing won't learn. The weights seem to become NaN.

I have played with different numbers of hidden layers/inputs/outputs; the bug appears consistently across different hidden layer sizes.

from __future__ import division
import numpy
import matplotlib.pyplot
import random

class net:
    def __init__(self, *sizes):
        sizes = list(sizes)
        sizes[0] += 1  # extra input unit for the bias
        self.sizes = sizes
        self.weights = [numpy.random.uniform(-1, 1, (sizes[i+1], sizes[i]))
                        for i in range(len(sizes)-1)]

    @staticmethod
    def activate(x):
        return 1/(1+numpy.exp(-x))

    def y(self, x_):
        # forward pass
        x = numpy.concatenate(([1], numpy.atleast_1d(x_.copy())))
        o = [x]  # o[i] is the (activated) output of hidden layer i, "hidden layer 0" being the inputs
        for weight in self.weights[:-1]:
            x = weight.dot(x)
            x = net.activate(x)
            o.append(x)
        o.append(self.weights[-1].dot(x))  # linear output layer
        return o

    def __call__(self, x):
        return self.y(x)[-1]

    def delta(self, x, t):
        # backpropagate the error deltas
        o = self.y(x)
        delta = [(o[-1]-t) * o[-1] * (1-o[-1])]
        for i, weight in enumerate(reversed(self.weights)):
            delta.append(weight.T.dot(delta[-1]) * o[-i-2] * (1-o[-i-2]))
        delta.reverse()
        return o, delta

    def train(self, inputs, outputs, epochs=100, rate=.1):
        for epoch in range(epochs):
            pairs = zip(inputs, outputs)
            random.shuffle(pairs)
            for x, t in pairs:  # shuffle? subset?
                o, d = self.delta(x, t)
                for layer in range(len(self.sizes)-1):
                    self.weights[layer] -= rate * numpy.outer(o[layer+1], d[layer])


n = net(1, 4, 1)
x = numpy.linspace(0, 2*3.14, 10)
t = numpy.sin(x)
matplotlib.pyplot.plot(x, t, 'g')
matplotlib.pyplot.plot(x, map(n, x), 'r')
n.train(x, t)
print n.weights
matplotlib.pyplot.plot(x, map(n, x), 'b')
matplotlib.pyplot.show()

I haven't looked for the particular bug in your code, but could you please try the following things to narrow down the problem further? Otherwise it is tedious to find the needle in the haystack.

1) Please try to use a real dataset for which you have an idea of what to expect, e.g., MNIST, and/or standardize your data, because the weights may become NaN if they become very small.
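For instance, a minimal standardization sketch for the x array from the question (zero mean, unit variance); whether you also rescale the targets is up to you:

import numpy

x = numpy.linspace(0, 2*3.14, 10)
x_std = (x - x.mean()) / x.std()   # zero mean, unit variance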

2) Try different learning rates and plot the cost function vs. epochs to check whether you are converging. It should look something like the plot below (note that I used minibatch learning and averaged over the minibatch chunks for each epoch).

[plot: cost vs. epochs]
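A minimal sketch of how such a cost curve could be produced with the net class from the question; the sum-of-squared-errors cost and the one-epoch-at-a-time loop are my assumptions, not part of the original code:

import numpy
import matplotlib.pyplot

n = net(1, 4, 1)                     # the net class from the question
x = numpy.linspace(0, 2*3.14, 10)
t = numpy.sin(x)

costs = []
for epoch in range(100):
    n.train(x, t, epochs=1, rate=0.1)                         # one pass over the data
    errors = numpy.array([n(xi) - ti for xi, ti in zip(x, t)])
    costs.append(0.5 * numpy.sum(errors**2))                  # sum-of-squared-errors cost

matplotlib.pyplot.plot(range(len(costs)), costs)
matplotlib.pyplot.xlabel('epoch')
matplotlib.pyplot.ylabel('cost')
matplotlib.pyplot.show()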

3) I see that you are using a sigmoid activation. Your implementation is correct, but to make it numerically more stable, replace 1.0 / (1.0 + np.exp(-z)) with expit(z) from scipy.special (same function, but more efficient).
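For example (a small sketch; expit is SciPy's logistic sigmoid and avoids the overflow warnings that numpy.exp raises for large negative inputs):

import numpy
from scipy.special import expit

z = numpy.array([-1000.0, 0.0, 1000.0])
print(expit(z))                      # [ 0.   0.5  1. ], no overflow warning
print(1.0 / (1.0 + numpy.exp(-z)))   # same values, but numpy.exp overflows for z = -1000.0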

4) Implement gradient checking. Here, you compare the analytical gradient to a numerically approximated gradient:

dJ/dw (analytical, from backpropagation)  vs.  dJ/dw ≈ (J(w + ε) − J(w)) / ε

A better approach, which yields a more accurate approximation of the gradient, is to compute the symmetric (or centered) difference quotient given by the two-point formula:

dJ/dw ≈ (J(w + ε) − J(w − ε)) / (2ε)
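As a concrete illustration, here is a minimal gradient-checking sketch using the centered-difference formula above with the net class and one input/target pair from the question; the helper name numerical_gradient, the epsilon value, and the squared-error cost J are my own assumptions for this example:

import numpy

def numerical_gradient(n, x, t, layer, eps=1e-5):
    # Centered-difference approximation of dJ/dw for each entry of one weight
    # matrix, with J(w) = 0.5 * (output - target)^2.
    grad = numpy.zeros_like(n.weights[layer])
    for idx in numpy.ndindex(*n.weights[layer].shape):
        w_orig = n.weights[layer][idx]
        n.weights[layer][idx] = w_orig + eps
        j_plus = 0.5 * numpy.sum((n(x) - t)**2)
        n.weights[layer][idx] = w_orig - eps
        j_minus = 0.5 * numpy.sum((n(x) - t)**2)
        n.weights[layer][idx] = w_orig               # restore the weight
        grad[idx] = (j_plus - j_minus) / (2 * eps)
    return grad

# Compare against whatever analytical gradient your backprop produces, e.g. the
# update term used in train(); a large discrepancy points at the buggy layer:
# num = numerical_gradient(n, x[0], t[0], layer=0)
# o, d = n.delta(x[0], t[0])
# ana = numpy.outer(o[1], d[0])     # the term train() uses for layer 0
# print(numpy.max(numpy.abs(num - ana)))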

PS: If you are interested and find it useful, I have a working vanilla NumPy neural net implemented here.

