14 August 2011

Neural Network (Part 6): Back Propagation, a worked example

A worked example of a Back-propagation training cycle.


[Image: diagram of the two-layer neural network used in this example]

In this example we will create a two-layer network (as seen above) that accepts 2 readings and produces 2 outputs. The readings are (0,1), and the expected outputs are (1,0).

Step 1: Create the network

NeuralNetwork NN = new NeuralNetwork();      // create an empty network
NN.addLayer(2,2);                            // LAYER1: 2 inputs, 2 neurons
NN.addLayer(2,2);                            // LAYER2: 2 inputs, 2 neurons
float[] readings = {0,1};                    // the sensor readings fed into the network
float[] expectedOutputs = {1,0};             // the outputs we want it to produce
NN.trainNetwork(readings,expectedOutputs);   // one complete training cycle (the steps below)

This neural network will have randomised weights and biases when it is created.
Let us assume that the network generates the following random values:

LAYER1.Neuron1
Layer1.Neuron1.Connection1.weight = cW111 = 0.3
Layer1.Neuron1.Connection2.weight = cW112 = 0.8
Layer1.Neuron1.Bias = bW11 = 0.5

LAYER1.Neuron2
Layer1.Neuron2.Connection1.weight = cW121 =  0.1
Layer1.Neuron2.Connection2.weight = cW122 =  0.1
Layer1.Neuron2.Bias = bW12 = 0.2

LAYER2.Neuron1
Layer2.Neuron1.Connection1.weight = cW211 = 0.6
Layer2.Neuron1.Connection2.weight = cW212 = 0.4
Layer2.Neuron1.Bias = bW21 = 0.4

LAYER2.Neuron2
Layer2.Neuron2.Connection1.weight = cW221 = 0.9
Layer2.Neuron2.Connection2.weight = cW222 = 0.9
Layer2.Neuron2.Bias = bW22 = 0.5




Step 2: Process the Readings through the Neural Network

a) Provide the Readings to the first layer, and calculate the neuron outputs

The readings provided to the neural network are (0,1); they go straight through to the first layer (Layer1).
Starting with Layer 1:
Layer1.INPUT1 = 0
Layer1.INPUT2 = 1

   Calculate Layer1.Neuron1.NeuronOutput
   ConnExit (cEx111) = ConnEntry (cEn111) x Weight (cW111) = 0 x 0.3 = 0
   ConnExit (cEx112) = ConnEntry (cEn112) x Weight (cW112) = 1 x 0.8 = 0.8
   Bias (bEx11) = ConnEntry (1) x Weight (bW11) = 1 x 0.5 = 0.5
   NeuronInputValue11 = 0 + 0.8 + 0.5 = 1.3
   NeuronOutputValue11 = 1/(1+EXP(-1 x 1.3)) = 0.785835
  
   Calculate Layer1.Neuron2.NeuronOutput
   ConnExit (cEx121) = ConnEntry (cEn121) x Weight (cW121) = 0 x 0.1 = 0
   ConnExit (cEx122) = ConnEntry (cEn122) x Weight (cW122) = 1 x 0.1 = 0.1
   Bias (bEx12) = ConnEntry (1) x Weight (bW12) = 1 x 0.2 = 0.2
   NeuronInputValue12 = 0 + 0.1 + 0.2 = 0.3
   NeuronOutputValue12 = 1/(1+EXP(-1 x 0.3)) = 0.574443
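
If you want to check these numbers yourself, the maths is easy to reproduce in Processing. Here is a minimal sketch (the variable names are mine, made up just for this check - it is not the NeuralNetwork class) that verifies both Layer 1 outputs:

float sigmoid(float x) {
  return 1.0 / (1.0 + exp(-x));
}

void setup() {
  float[] readings = {0, 1};
  // Layer1.Neuron1: connection weights cW111, cW112 and bias weight bW11
  float in11 = readings[0]*0.3 + readings[1]*0.8 + 1*0.5;
  println(sigmoid(in11));  // NeuronOutputValue11, approx. 0.785835
  // Layer1.Neuron2: connection weights cW121, cW122 and bias weight bW12
  float in12 = readings[0]*0.1 + readings[1]*0.1 + 1*0.2;
  println(sigmoid(in12));  // NeuronOutputValue12, approx. 0.574443
}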


b) Provide LAYER2 with Layer 1 Outputs.

Now let's move to Layer 2:
Layer2.INPUT1 = NeuronOutputValue11 = 0.785835
Layer2.INPUT2 = NeuronOutputValue12 = 0.574443

   Calculate Layer2.Neuron1.NeuronOutput
   ConnExit (cEx211) = ConnEntry (cEn211) x Weight (cW211) = 0.785835 x 0.6 = 0.471501
   ConnExit (cEx212) = ConnEntry (cEn212) x Weight (cW212) = 0.574443 x 0.4 = 0.229777
   Bias (bEx21) = ConnEntry (1) x Weight (bW21) = 1 x 0.4 = 0.4
   NeuronInputValue21 = 0.471501 + 0.229777 + 0.4 = 1.101278
   NeuronOutputValue21 = 1/(1+EXP(-1 x 1.101278)) = 0.750499
  
   Calculate Layer2.Neuron2.NeuronOutput
   ConnExit (cEx221) = ConnEntry (cEn221) x Weight (cW221) = 0.785835 x 0.9 = 0.707252
   ConnExit (cEx222) = ConnEntry (cEn222) x Weight (cW222) = 0.574443 x 0.9 = 0.516999
   Bias (bEx22) = ConnEntry (1) x Weight (bW22) = 1 x 0.5 = 0.5
   NeuronInputValue22 = 0.707252 + 0.516999 + 0.5 = 1.724251
   NeuronOutputValue22 = 1/(1+EXP(-1 x 1.724251)) = 0.848676
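
Every neuron in this network follows the same recipe: sum the weighted connection entries plus the bias, then squash the result with the sigmoid. A small helper function (my own naming, for illustration only) makes that explicit and reproduces the Layer 2 outputs:

float neuronOutput(float[] inputs, float[] weights, float bias) {
  float sum = bias;  // the bias connection entry is always 1
  for (int i = 0; i < inputs.length; i++) {
    sum += inputs[i] * weights[i];
  }
  return 1.0 / (1.0 + exp(-sum));  // sigmoid activation
}

void setup() {
  float[] layer1Outputs = {0.785835, 0.574443};
  println(neuronOutput(layer1Outputs, new float[]{0.6, 0.4}, 0.4));  // approx. 0.750499
  println(neuronOutput(layer1Outputs, new float[]{0.9, 0.9}, 0.5));  // approx. 0.848676
}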



Step 3: Calculate the delta error for neurons in layer 2
Because layer 2 is the last layer in this neural network, we use the expected output data (1,0) to calculate the delta error.
   
LAYER2.Neuron1:
Let Layer2.ExpectedOutput1 = eO21 = 1
    Layer2.ActualOutput1 = aO21 = NeuronOutputValue21 = 0.750499
    Layer2.Neuron1.deltaError = dE21

dE21 = aO21 x (1 - aO21) x (eO21 - aO21)
     = (0.750499) x (1 - 0.750499) x (1 - 0.750499)
     = (0.750499) x (0.249501) x (0.249501)
     = 0.046719



LAYER2.Neuron2:
Let Layer2.ExpectedOutput2 = eO22 = 0
    Layer2.ActualOutput2 = aO22 = NeuronOutputValue22 = 0.848676
    Layer2.Neuron2.deltaError = dE22

dE22 = aO22 x (1 - aO22) x (eO22 - aO22)
     = (0.848676) x (1 - 0.848676) x (0 - 0.848676)
     = (0.848676) x (0.151324) x (-0.848676)
     = -0.108991
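
In code, the whole of Step 3 boils down to one line per neuron: the sigmoid derivative, actual x (1 - actual), multiplied by the output error. A quick sketch (my naming again, not the class's API) that reproduces both deltas:

float outputDelta(float actual, float expected) {
  // sigmoid derivative x output error
  return actual * (1 - actual) * (expected - actual);
}

void setup() {
  println(outputDelta(0.750499, 1));  // dE21, approx.  0.046719
  println(outputDelta(0.848676, 0));  // dE22, approx. -0.108991
}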




Step 4: Calculate the delta error for neurons in layer 1

LAYER1.Neuron1 delta error calculation

Let Layer1.Neuron1.deltaError = dE11
    Layer1.ActualOutput1 = aO11 = NeuronOutputValue11 = 0.785835
    Layer2.Neuron1.Connection1.weight = cW211 = 0.6
    Layer2.Neuron1.deltaError = dE21 = 0.046719
    Layer2.Neuron2.Connection1.weight = cW221 = 0.9
    Layer2.Neuron2.deltaError = dE22 = -0.108991

dE11 = (aO11) x (1 - aO11) x ( [cW211 x dE21] + [cW221 x dE22] )
     = (0.785835) x (1 - 0.785835) x ( [0.6 x 0.046719] + [0.9 x -0.108991] )
     = -0.011791

LAYER1.Neuron2 delta error calculation

Let Layer1.Neuron2.deltaError = dE12
    Layer1.ActualOutput2 = aO12 = NeuronOutputValue12 = 0.574443
    Layer2.Neuron1.Connection2.weight = cW212 = 0.4
    Layer2.Neuron1.deltaError = dE21 = 0.046719
    Layer2.Neuron2.Connection2.weight = cW222 = 0.9
    Layer2.Neuron2.deltaError = dE22 = -0.108991

dE12 = (aO12) x (1 - aO12) x ( [cW212 x dE21] + [cW222 x dE22] )
     = (0.574443) x (1 - 0.574443) x ( [0.4 x 0.046719] + [0.9 x -0.108991] )
     = -0.019411
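
Step 4 uses the same sigmoid-derivative factor, but the error part is now the weighted sum of the next layer's delta errors instead of a comparison with the expected output. As a sketch (again with my own helper names):

float hiddenDelta(float actual, float[] outWeights, float[] nextDeltas) {
  // weighted sum of the deltas this neuron feeds into
  float sum = 0;
  for (int i = 0; i < outWeights.length; i++) {
    sum += outWeights[i] * nextDeltas[i];
  }
  return actual * (1 - actual) * sum;
}

void setup() {
  float[] layer2Deltas = {0.046719, -0.108991};  // dE21, dE22
  println(hiddenDelta(0.785835, new float[]{0.6, 0.9}, layer2Deltas));  // dE11, approx. -0.011791
  println(hiddenDelta(0.574443, new float[]{0.4, 0.9}, layer2Deltas));  // dE12, approx. -0.019411
}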





Step 5: Update the layer 2 neuron connection weights and biases (with a learning rate LR = 0.1)


Layer 2, Neuron 1 calculations:

Let
Layer2.Neuron1.Connection1.New_weight = New_cW211
Layer2.Neuron1.Connection1.Old_weight = Old_cW211 = 0.6
Layer2.Neuron1.Connection1.connEntry = cEn211 = 0.785835
Layer2.Neuron1.deltaError = dE21 = 0.046719

New_cW211 = Old_cW211 + (LR x cEn211 x dE21)
          = 0.6 + (0.1 x 0.785835 x 0.046719)
          = 0.6 + (0.003671)
          = 0.603671



Layer2.Neuron1.Connection2.New_weight = New_cW212
Layer2.Neuron1.Connection2.Old_weight = Old_cW212 = 0.4
Layer2.Neuron1.Connection2.connEntry = cEn212 = 0.574443
Layer2.Neuron1.deltaError = dE21 = 0.046719

New_cW212 = Old_cW212 + (LR x cEn212 x dE21)
          = 0.4 + (0.1 x 0.574443 x 0.046719)
          = 0.4 + (0.002684)
          = 0.402684



Layer2.Neuron1.New_Bias = New_Bias21
Layer2.Neuron1.Old_Bias = Old_Bias21 = 0.4
Layer2.Neuron1.deltaError = dE21 = 0.046719

New_Bias21 = Old_Bias21 + (LR x 1 x dE21)
           = 0.4 + (0.1 x 1 x 0.046719)
           = 0.4 + (0.004672)
           = 0.404672


--------------------------------------------------------------------

Layer 2, Neuron 2 calculations:

Layer2.Neuron2.Connection1.New_weight = New_cW221
Layer2.Neuron2.Connection1.Old_weight = Old_cW221 = 0.9
Layer2.Neuron2.Connection1.connEntry = cEn221 = 0.785835
Layer2.Neuron2.deltaError = dE22 = -0.108991

New_cW221 = Old_cW221 + (LR x cEn221 x dE22)
          = 0.9 + (0.1 x 0.785835 x -0.108991)
          = 0.9 + (-0.008565)
          = 0.891435


Layer2.Neuron2.Connection2.New_weight = New_cW222
Layer2.Neuron2.Connection2.Old_weight = Old_cW222 = 0.9
Layer2.Neuron2.Connection2.connEntry = cEn222 = 0.574443
Layer2.Neuron2.deltaError = dE22 = -0.108991

New_cW222 = Old_cW222 + (LR x cEn222 x dE22)
          = 0.9 + (0.1 x 0.574443 x -0.108991)
          = 0.9 + (-0.006261)
          = 0.893739


Layer2.Neuron2.New_Bias = New_Bias22
Layer2.Neuron2.Old_Bias = Old_Bias22 = 0.5
Layer2.Neuron2.deltaError = dE22 = -0.108991

New_Bias22 = Old_Bias22 + (LR x 1 x dE22)
           = 0.5 + (0.1 x 1 x -0.108991)
           = 0.5 + (-0.010899)
           = 0.489101
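
Notice that every single update in Steps 5 and 6 is the same one-line rule: new = old + (LR x connEntry x delta), where a bias behaves like a weight whose connection entry is always 1. A sketch (my helper, not part of the NeuralNetwork class) spot-checking three of the results:

float updateWeight(float oldWeight, float lr, float connEntry, float delta) {
  return oldWeight + (lr * connEntry * delta);
}

void setup() {
  float lr = 0.1;
  println(updateWeight(0.6, lr, 0.785835, 0.046719));   // New_cW211, approx. 0.603671
  println(updateWeight(0.9, lr, 0.785835, -0.108991));  // New_cW221, approx. 0.891435
  println(updateWeight(0.5, lr, 1, -0.108991));         // New_Bias22, approx. 0.489101
}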



--------------------------------------------------------------------------


Step 6: Update the layer 1 neuron connection weights and biases (using the same learning rate, LR = 0.1).

Layer 1, Neuron 1 calculations:

Let
Layer1.Neuron1.Connection1.New_weight = New_cW111
Layer1.Neuron1.Connection1.Old_weight = Old_cW111 = 0.3
Layer1.Neuron1.Connection1.connEntry = cEn111 = 0
Layer1.Neuron1.deltaError = dE11 = -0.011791

New_cW111 = Old_cW111 + (LR x cEn111 x dE11)
          = 0.3 + (0.1 x 0 x -0.011791)
          = 0.3 + (0)
          = 0.3


Layer1.Neuron1.Connection2.New_weight = New_cW112
Layer1.Neuron1.Connection2.Old_weight = Old_cW112 = 0.8
Layer1.Neuron1.Connection2.connEntry = cEn112 = 1
Layer1.Neuron1.deltaError = dE11 = -0.011791

New_cW112 = Old_cW112 + (LR x cEn112 x dE11)
          = 0.8 + (0.1 x 1 x -0.011791)
          = 0.8 + (-0.001179)
          = 0.798821


Layer1.Neuron1.New_Bias = New_Bias11
Layer1.Neuron1.Old_Bias = Old_Bias11 = 0.5
Layer1.Neuron1.deltaError = dE11 = -0.011791

New_Bias11 = Old_Bias11 + (LR x 1 x dE11)
           = 0.5 + (0.1 x 1 x -0.011791)
           = 0.5 + (-0.001179)
           = 0.498821

--------------------------------------------------------------------

Layer 1, Neuron 2 calculations:

Layer1.Neuron2.Connection1.New_weight = New_cW121
Layer1.Neuron2.Connection1.Old_weight = Old_cW121 = 0.1
Layer1.Neuron2.Connection1.connEntry = cEn121 = 0
Layer1.Neuron2.deltaError = dE12 = -0.019411

New_cW121 = Old_cW121 + (LR x cEn121 x dE12)
          = 0.1 + (0.1 x 0 x -0.019411)
          = 0.1 + (0)
          = 0.1




Layer1.Neuron2.Connection2.New_weight = New_cW122
Layer1.Neuron2.Connection2.Old_weight = Old_cW122 = 0.1
Layer1.Neuron2.Connection2.connEntry = cEn122 = 1
Layer1.Neuron2.deltaError = dE12 = -0.019411

New_cW122 = Old_cW122 + (LR x cEn122 x dE12)
          = 0.1 + (0.1 x 1 x -0.019411)
          = 0.1 + (-0.001941)
          = 0.098059



Layer1.Neuron2.New_Bias = New_Bias12
Layer1.Neuron2.Old_Bias = Old_Bias12 = 0.2
Layer1.Neuron2.deltaError = dE12 = -0.019411

New_Bias12 = Old_Bias12 + (LR x 1 x dE12)
           = 0.2 + (0.1 x 1 x -0.019411)
           = 0.2 + (-0.001941)
           = 0.198059


----------------------------------------------------------------------

All done. That was just one training cycle. Thank goodness we have computers!
A computer can run through these calculations very quickly. Depending on how complicated your neural network is (i.e. the number of layers, and the number of neurons per layer), you may find that the training procedure takes some time. But believe me, if you have designed it right, it is well worth the wait.
Because once you have the desired weights and bias values set up, you are good to go: as data comes in, the computer can do a single forward pass in a fraction of a second, and you will get your desired output - hopefully :)
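
If you want to see the whole cycle in one place, here is a compact, self-contained sketch (written just for this post - it is not the NeuralNetwork class used in the rest of this series) that reproduces every number in the worked example above:

float sigmoid(float x) {
  return 1.0 / (1.0 + exp(-x));
}

void setup() {
  float lr = 0.1;              // learning rate
  float[] readings = {0, 1};
  float[] expected = {1, 0};

  // w[layer][neuron][connection] and b[layer][neuron], as set up in Step 1
  float[][][] w = { {{0.3, 0.8}, {0.1, 0.1}}, {{0.6, 0.4}, {0.9, 0.9}} };
  float[][] b = { {0.5, 0.2}, {0.4, 0.5} };

  // Step 2: forward pass
  float[] out1 = new float[2];
  float[] out2 = new float[2];
  for (int n = 0; n < 2; n++) {
    out1[n] = sigmoid(readings[0]*w[0][n][0] + readings[1]*w[0][n][1] + b[0][n]);
  }
  for (int n = 0; n < 2; n++) {
    out2[n] = sigmoid(out1[0]*w[1][n][0] + out1[1]*w[1][n][1] + b[1][n]);
  }
  println(out2[0] + "  " + out2[1]);  // approx. 0.750499  0.848676

  // Step 3: output-layer delta errors
  float[] d2 = new float[2];
  for (int n = 0; n < 2; n++) {
    d2[n] = out2[n] * (1 - out2[n]) * (expected[n] - out2[n]);
  }

  // Step 4: hidden-layer delta errors (note: uses the OLD layer 2 weights)
  float[] d1 = new float[2];
  for (int n = 0; n < 2; n++) {
    d1[n] = out1[n] * (1 - out1[n]) * (w[1][0][n]*d2[0] + w[1][1][n]*d2[1]);
  }

  // Steps 5 and 6: apply new = old + (LR x connEntry x delta) everywhere
  for (int n = 0; n < 2; n++) {
    w[1][n][0] += lr * out1[0] * d2[n];
    w[1][n][1] += lr * out1[1] * d2[n];
    b[1][n]    += lr * d2[n];
    w[0][n][0] += lr * readings[0] * d1[n];
    w[0][n][1] += lr * readings[1] * d1[n];
    b[0][n]    += lr * d1[n];
  }
  println(w[1][0][0] + "  " + b[1][1] + "  " + w[0][0][1]);  // approx. 0.603671  0.489101  0.798821
}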

Here is a complete Processing.org script that demonstrates the use of my neural network.
Neural Network (Part 7): Cut and Paste Code (click here).

If you liked my tutorial, please let me know in the comments. It is sometimes hard to know if anyone is actually reading this stuff. If you use my code in your own project, I am also happy for you to leave a link to a YouTube video etc. in the comments.

To go back to the table of contents click here

19 comments:

  1. It's been nearly a month since this was posted, but this is exactly what I've been searching for. I'm a Cognitive Science student, and I'm trying to study neural networks a bit before I actually attend a formal lecture on them.

    I made a duplicate of your code in Python 3.1.2 just to see what I could do with it. This explanation is clear and concise: a step-by-step guide to building the foundation of a basic neural net.

    Thank you! :)

  2. It took me a month to get my head around this stuff, and as you can see, my neural net structure deviates slightly from traditional feed-forward neural nets; however, the underlying equations are still there, and the flow of signals is pretty much the same.
    Thank you for the feedback!

  3. Thanks for posting this Scott! I'm doing Stanford's online machine learning course at the moment (ml-class.org) and was getting to the point where I was struggling to see what was happening and wanted a worked example. This was brilliant, and exactly what I needed. Thanks!

  4. Hi Nathan,

    I am glad you could make sense of my coding system. This stuff is not that easy to understand, and even harder to put into a "readable" format. Good luck with your course !

  5. Great. Finally I got it. Thanks to the publisher. - Siva Ganesh, GVP PG College, Vizag

  6. I've just come across this and it's great! I studied ANNs a few years ago as part of my degree, and the lecturer took a damn long and confusing route to explain exactly what you have.

    Thanks very much.

  7. Very good, thanks for this guide. Sorry for my English.

    I haven't read all of this guide yet (I'm reading, but I'm too curious) and I have a question: how many kilobytes would an ANN take? For example, if I have 10 inputs, 1 output, and only 1 hidden layer with up to 10 neurons, how can I calculate the memory size in kBytes?

    Maybe it is a stupid question, but I have to buy a microcontroller and I don't know how much memory I will need.

    Replies
    1. Hi Giovanni,

      In my examples, the ANN is not run on the Arduino; it is run on the computer using the Processing language. I have not tried running this off a microcontroller directly.
      The Arduino sends the sensor data to the computer (the layer input), and the computer does all the hard work.
      I'm sorry, I don't know how much memory it uses either.

    2. Great project and very useful, thank you.
      I have a question. Here the readings are (0,1), so as written above:
      Starting with Layer 1:
      Layer1.INPUT1 = 0
      Layer1.INPUT2 =1

      but can Layer1.INPUT1 and Layer1.INPUT2 come from the A/D converter of an Arduino microcontroller? For example:
      Layer1.INPUT1 = 510
      Layer1.INPUT2 = 1000

      thank you

    3. Hi Anonymous,

      The input can be as big or as small as you want it to be - the weights will adjust. And depending on what activation function you use, these values are generally transformed into numbers between 0 and 1. One really good way of seeing this is to substitute your own values and see what happens. Time to get the calculator out :)
      Also - have a look at Part 2 of this series - it may help explain it a bit more.
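
      For example, a quick throwaway sketch like this (not part of the tutorial code) shows what the sigmoid does to a raw analogRead() value, and why scaling it first keeps the neuron in a useful range:

      void setup() {
        float raw = 510;                      // e.g. a 10-bit Arduino analogRead() value
        println(1.0 / (1.0 + exp(-raw)));     // approx. 1.0 - the sigmoid is saturated
        float scaled = raw / 1023.0;          // map 0..1023 into 0..1 first
        println(1.0 / (1.0 + exp(-scaled)));  // approx. 0.62 - nicely inside the range
      }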

  8. Great project and very useful, thank you.
    I have a question. Here the readings are (0,1), but could they come from the ADC of an Arduino microcontroller? For example, suppose the two readings are (100,512). As you said, the Arduino sends the sensor data to the computer (layer input) - is the sensor data a float or an integer?

    thank you

    Replies
    1. Hi Anonymous,
      You can send a float or an integer (it depends on how you program it), but it is going to output a float - see Part 2 of this series, and take a look at how the activation function works.

      In fact, I would recommend starting from Part 1 and working your way through to the end... it might make more sense if you do it that way.

  9. Hello, thank you very much for your tutorial. I read all your posts about neural networks and they helped me a lot. I made a program in VB6, just before implementing it on a 32-bit microcontroller, but I have a question: I DON'T UNDERSTAND the layer configuration very well. Do you have 2 inputs and 2 neurons to process them, or are there 2 inputs and 4 neurons (2 neurons in layer 1 and 2 neurons in layer 2)?
    With respect to the image at the top, you have 2 inputs for layer 1. Can I have 6 inputs and only two neurons in each layer (layer 1 and layer 2)? Sorry, I really don't understand the layers' function, because it seems that if I increase the inputs I must increase the hidden layers. Please can you answer these questions. Thanks in advance.

    Replies
    1. Hi Anonymous,
      You have asked a very good question. In the image at the top of this post, you will notice that there are 2 layers (LAYER1 and LAYER2).
      In this specific example, there are 2 layer inputs. These inputs receive data from the previous layer. However, there is no previous layer for LAYER1, so its inputs become, by default, the inputs of the whole neural network.

      Each neural network will have layers, each layer will have neurons, and each neuron will have connections. The layer is mainly a container to group neurons, and also to hold values that will be transmitted to each of the neurons within. It also holds values that will be used to transmit information to the next layer in the neural network.

      In this example, there are 2 layers, each layer has 2 neurons. So there are a total of 4 neurons in this Neural Network. But this was only done to simplify the example.

      You can have 6 inputs into the neural network, and only have 2 neurons in Layer1. Please take notice of the colours used in the image. If Layer1 had 6 inputs, then each neuron in layer 1 would need 6 connections in order to receive the signal from the layer inputs. Because Layer1 would only have 2 neurons, it will only have 2 outputs. There is one layer output per neuron.
      Layer 2 would need to have 2 Inputs in order to receive the signal from Layer 1, but Layer2 could have 1000 neurons. Each neuron in Layer2 would have 2 connections to receive the signal from the layer2 inputs, and if Layer2 had 1000 neurons, then Layer2 would have 1000 outputs.
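
      In code, that configuration might look something like this (assuming the first argument of addLayer is the number of layer inputs and the second is the number of neurons, as in the example at the top of this post):

      NeuralNetwork NN = new NeuralNetwork();
      NN.addLayer(6, 2);     // LAYER1: 6 inputs, 2 neurons -> 2 layer outputs
      NN.addLayer(2, 1000);  // LAYER2: 2 inputs, 1000 neurons -> 1000 layer outputs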

      Did this help explain?
      Try to follow the equations on paper... it is a whole lot easier.


    2. http://arduinobasics.blogspot.com.au/2011/08/neural-network-part-3-layer.html

  10. You have an inconsistency in your work. In the lines
    Calculate Layer2.Neuron2.NeuronOutput
    ConnExit (cEx221) = (cEn221) x Weight (cW221) = 0.768525 x 0.1 = 0.076853;
    ConnExit (cEx222) = (cEn222) x Weight (cW222) = 0.574443 x 0.1 = 0.057444;
    You have cW221 and cW222 as 0.1, but you initialize them earlier to 0.9.

    Replies
    1. Well picked up... you get 10 points :)
      It has taken over 3 years for anyone to notice that (including myself).
      Despite the inconsistency, you should still be able to follow along.
      I hope that was the only mistake :)


Feel free to leave a comment about this tutorial below.
Any questions about your particular project should be asked in the ArduinoBasics forum.

Comments are moderated due to the large amount of spam.