Arduino Basics: Neural Network (Part 6): Back Propagation, a worked example

## 14 August 2011


A worked example of a Back-propagation training cycle.

In this example we will create a 2-layer network (as seen above) that accepts 2 readings and produces 2 outputs. The readings are (0,1), and the expectedOutputs in this example are (1,0).

Step 1: Create the network

NeuralNetwork NN = new NeuralNetwork();
float[] expectedOutputs = {1,0};

This neural network will have randomised weights and biases when created.
Let us assume that the network generates the following random variables:

LAYER1.Neuron1
Layer1.Neuron1.Connection1.weight = cW111 = 0.3
Layer1.Neuron1.Connection2.weight = cW112 = 0.8
Layer1.Neuron1.Bias = bW11 = 0.5

LAYER1.Neuron2
Layer1.Neuron2.Connection1.weight = cW121 =  0.1
Layer1.Neuron2.Connection2.weight = cW122 =  0.1
Layer1.Neuron2.Bias = bW12 = 0.2

LAYER2.Neuron1
Layer2.Neuron1.Connection1.weight = cW211 = 0.6
Layer2.Neuron1.Connection2.weight = cW212 = 0.4
Layer2.Neuron1.Bias = bW21 = 0.4

LAYER2.Neuron2
Layer2.Neuron2.Connection1.weight = cW221 = 0.9
Layer2.Neuron2.Connection2.weight = cW222 = 0.9
Layer2.Neuron2.Bias = bW22 = 0.5

Step 2: Process the Readings through the Neural Network

a) Provide the Readings to the first layer, and calculate the neuron outputs

The readings provided to the neural network are (0,1); they go straight through to the first layer (Layer1).
Starting with Layer 1:
Layer1.INPUT1 = 0
Layer1.INPUT2 = 1

Calculate Layer1.Neuron1.NeuronOutput
ConnExit (cEx111) = ConnEntry (cEn111)  x Weight (cW111) = 0 x 0.3 = 0;
ConnExit (cEx112) = ConnEntry (cEn112)  x Weight (cW112) = 1 x 0.8 = 0.8;
Bias (bEx11) = ConnEntry (1) x Weight (bW11) = 1 x 0.4 = 0.4
NeuronInputValue11 = 0 + 0.8 + 0.4 = 1.2
NeuronOutputValue11 = 1/(1+EXP(-1 x 1.2)) = 0.768525

Calculate Layer1.Neuron2.NeuronOutput
ConnExit (cEx121) = ConnEntry (cEn121)  x Weight (cW121) = 0 x 0.1 = 0;
ConnExit (cEx122) = ConnEntry (cEn122)  x Weight (cW122) = 1 x 0.1 = 0.1;
Bias (bEx12) = ConnEntry (1) x Weight (bW12) = 1 x 0.2 = 0.2
NeuronInputValue12 = 0 + 0.1 + 0.2 = 0.3
NeuronOutputValue12 = 1/(1+EXP(-1 x 0.3)) = 0.574443
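The Layer 1 arithmetic above can be sketched as a small piece of Java (the same maths you would write in a Processing sketch, which is Java-based). Note that this uses the bias value 0.4 that appears in the calculation above; the class and variable names are my own, not the blog's actual code.

```java
// A minimal sketch of the Layer 1 forward pass:
// ConnExit = ConnEntry x Weight, summed with the bias (whose ConnEntry is
// always 1), then squashed with the sigmoid activation function.
public class Layer1Forward {

    // Sigmoid activation: 1 / (1 + e^-x)
    public static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // NeuronInputValue = in1*w1 + in2*w2 + 1*bias
    public static double neuronInput(double in1, double w1,
                                     double in2, double w2, double bias) {
        return in1 * w1 + in2 * w2 + 1.0 * bias;
    }

    public static void main(String[] args) {
        double input1 = 0, input2 = 1;  // the readings (0,1)

        // Layer1.Neuron1 (weights 0.3 and 0.8, bias 0.4 as in the calculation)
        double out11 = sigmoid(neuronInput(input1, 0.3, input2, 0.8, 0.4));

        // Layer1.Neuron2 (weights 0.1 and 0.1, bias 0.2)
        double out12 = sigmoid(neuronInput(input1, 0.1, input2, 0.1, 0.2));

        System.out.println(out11 + " " + out12);  // ~0.768525 and ~0.574443
    }
}
```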

b) Provide LAYER2 with Layer 1 Outputs.

Now let's move to Layer 2:
Layer2.INPUT1 = NeuronOutputValue11 = 0.768525
Layer2.INPUT2 = NeuronOutputValue12 = 0.574443

Calculate Layer2.Neuron1.NeuronOutput
ConnExit (cEx211) = (cEn211)  x Weight (cW211) = 0.768525 x 0.6 = 0.461115;
ConnExit (cEx212) = (cEn212)  x Weight (cW212) = 0.574443 x 0.4 = 0.229777;
Bias (bEx21) = ConnEntry (1) x Weight (bW21) = 1 x 0.4 = 0.4
NeuronInputValue21 = 0.461115 + 0.229777 + 0.4 = 1.090892
NeuronOutputValue21 = 1/(1+EXP(-1 x 1.090892)) = 0.74855

Calculate Layer2.Neuron2.NeuronOutput
ConnExit (cEx221) = (cEn221)  x Weight (cW221) = 0.768525  x 0.1 = 0.076853;
ConnExit (cEx222) = (cEn222)  x Weight (cW222) = 0.574443  x 0.1 = 0.057444;
Bias(bEx22) = ConnEntry (1) x Weight (bW22) = 1 x 0.5 = 0.5
NeuronInputValue22 = 0.076853 + 0.057444 + 0.5 = 0.634297
NeuronOutputValue22 = 1/(1+EXP(-1 x 0.634297)) = 0.653463
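Layer 2 repeats exactly the same arithmetic, with Layer 1's outputs as its inputs. A hedged sketch, using the weight values as they appear in this step (names again my own):

```java
// Layer 2 forward pass, fed with the Layer 1 outputs computed above.
public class Layer2Forward {

    public static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Weighted sum plus bias, pushed through the activation function
    public static double neuronOutput(double in1, double w1,
                                      double in2, double w2, double bias) {
        return sigmoid(in1 * w1 + in2 * w2 + 1.0 * bias);
    }

    public static void main(String[] args) {
        double in1 = 0.768525;  // NeuronOutputValue11
        double in2 = 0.574443;  // NeuronOutputValue12

        // Layer2.Neuron1 (weights 0.6 and 0.4, bias 0.4)
        double out21 = neuronOutput(in1, 0.6, in2, 0.4, 0.4);  // ~0.74855

        // Layer2.Neuron2 (weights 0.1 and 0.1, bias 0.5, as used in this step)
        double out22 = neuronOutput(in1, 0.1, in2, 0.1, 0.5);  // ~0.653463

        System.out.println(out21 + " " + out22);
    }
}
```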

Step 3) Calculate the delta error for neurons in layer 2
Because layer 2 is the last layer in this neural network,
we will use the expected output data (1,0) to calculate the delta error.

LAYER2.Neuron1:
Let Layer2.ExpectedOutput1 = eO21 = 1
Layer2.ActualOutput1= aO21 = NeuronOutputValue21= 0.74855
Layer2.Neuron1.deltaError1 = dE21

dE21 =     aO21       x      (1 - aO21)     x  (eO21 - aO21)
=  (0.74855)  x  (1 - 0.74855)  x  (1 - 0.74855)
= (0.74855)  x     (0.25145)     x    (0.25145)
= 0.047329

LAYER2.Neuron2:
Let Layer2.ExpectedOutput2 = eO22 = 0
Layer2.ActualOutput2     = aO22 = NeuronOutputValue22 = 0.653463
Layer2.Neuron2.deltaError = dE22

dE22  =      aO22       x      (1 - aO22)       x  (eO22 - aO22)
= (0.653463)  x  (1 - 0.653463)  x  (0 - 0.653463)
= (0.653463)  x     (0.346537)     x    (-0.653463)
= -0.14797
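The output-layer delta error rule used above - actualOutput x (1 - actualOutput) x (expectedOutput - actualOutput) - is small enough to capture in one Java method (a sketch, not the blog's actual code):

```java
// Delta error for an output-layer neuron. The actual*(1-actual) factor is
// the derivative of the sigmoid; (expected-actual) is the raw error.
public class OutputDelta {

    public static double deltaError(double actual, double expected) {
        return actual * (1.0 - actual) * (expected - actual);
    }

    public static void main(String[] args) {
        double dE21 = deltaError(0.74855, 1.0);   // ~0.047329
        double dE22 = deltaError(0.653463, 0.0);  // ~-0.14797
        System.out.println(dE21 + " " + dE22);
    }
}
```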

Step 4) Calculate the delta error for neurons in layer 1

LAYER1.Neuron1 delta Error calculation

Let              Layer1.Neuron1.deltaError  = dE11
Layer1.actualOutput1  = aO11 = NeuronOutputValue11 =  0.768525
Layer2.Neuron1.Connection1.weight = cW211   =  0.6
Layer2.Neuron1.deltaError = dE21 =  0.047329
Layer2.Neuron2.Connection1.weight = cW221   =  0.9
Layer2.Neuron2.deltaError = dE22 = -0.14797

dE11 = (aO11)          x  (1 -   aO11)         x ( [cW211   x   dE21]      +   [cW221  x    dE22] )
= (0.768525) x   (1 - 0.768525)     x   ([0.6        x  0.047329]  +   [  0.9      x  -0.14797]  )
= -0.01864

LAYER1.Neuron2 delta Error calculation

Let              Layer1.Neuron2.deltaError  = dE12
Layer1.actualOutput2  = aO12    = NeuronOutputValue12 =  0.574443
Layer2.Neuron1.Connection2.weight = cW212   =  0.4
Layer2.Neuron1.deltaError = dE21 =  0.047329
Layer2.Neuron2.Connection2.weight = cW222   =  0.9
Layer2.Neuron2.deltaError = dE22 = -0.14797

dE12  = (aO12)          x  (1 -   aO12)         x ( [cW212  x     dE21]      +   [cW222  x    dE22] )
= (0.574443) x   (1 - 0.574443)  x     ([0.4      x  0.047329]  +      [  0.9      x  -0.14797]  )
= -0.02793
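A hidden-layer neuron has no expected output of its own, so the (expected - actual) term is replaced by the weighted sum of the next layer's delta errors. A sketch of Step 4, using the 0.9 values for cW221/cW222 exactly as the step above does:

```java
// Delta error for a hidden-layer neuron: weight and sum the delta errors
// of the next layer, then multiply by the sigmoid derivative term.
public class HiddenDelta {

    public static double deltaError(double actual,
                                    double[] nextWeights, double[] nextDeltas) {
        double sum = 0.0;
        for (int i = 0; i < nextWeights.length; i++) {
            sum += nextWeights[i] * nextDeltas[i];
        }
        return actual * (1.0 - actual) * sum;
    }

    public static void main(String[] args) {
        double[] deltas = {0.047329, -0.14797};  // dE21, dE22

        // Layer1.Neuron1 feeds Layer 2 through cW211 = 0.6 and cW221 = 0.9
        double dE11 = deltaError(0.768525, new double[]{0.6, 0.9}, deltas);

        // Layer1.Neuron2 feeds Layer 2 through cW212 = 0.4 and cW222 = 0.9
        double dE12 = deltaError(0.574443, new double[]{0.4, 0.9}, deltas);

        System.out.println(dE11 + " " + dE12);  // ~-0.01864 and ~-0.02793
    }
}
```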

Step 5) Update Layer_2 neuron connection weights and bias (with a learning rate (LR) = 0.1)

Layer 2, Neuron 1 calculations:

Let
Layer2.Neuron1.Connection1.New_weight = New_cW211
Layer2.Neuron1.Connection1.Old_weight   =   Old_cW211   = 0.6
Layer2.Neuron1.Connection1.connEntry =                 cEn211 = 0.768525
Layer2.Neuron1.deltaError =                                       dE21 = 0.047329

New_cW211 = Old_cW211 + (LR x cEn211 x dE21)
=    0.6            + (0.1 x 0.768525 x 0.047329)
=    0.6            + ( 0.003627)
=    0.603627

Layer2.Neuron1.Connection2.New_weight = New_cW212
Layer2.Neuron1.Connection2.Old_weight   =   Old_cW212 = 0.4
Layer2.Neuron1.Connection2.connEntry =                cEn212 = 0.574443
Layer2.Neuron1.deltaError =                                      dE21 = 0.047329

New_cW212 = Old_cW212 + (LR x cEn212 x dE21)
=    0.4            + (0.1 x 0.574443 x 0.047329)
=    0.4            + (0.002719)
=    0.402719

Layer2.Neuron1.New_Bias = New_Bias21
Layer2.Neuron1.Old_Bias =    Old_Bias21 = 0.4
Layer2.Neuron1.deltaError =             dE21 = 0.047329

New_Bias21 = Old_Bias21 + (LR x  1  x  dE21)
=  0.4              + (0.1 x 1  x 0.047329)
=  0.4              + (0.0047329)
=  0.4047329

--------------------------------------------------------------------

Layer 2, Neuron 2 calculations:

Layer2.Neuron2.Connection1.New_weight = New_cW221
Layer2.Neuron2.Connection1.Old_weight =    Old_cW221 = 0.9
Layer2.Neuron2.Connection1.connEntry =               cEn221 = 0.768525
Layer2.Neuron2.deltaError =                                     dE22 = -0.14797

New_cW221 = Old_cW221 + (LR x cEn221 x dE22)
=    0.9            + (0.1 x 0.768525 x -0.14797)
=    0.9            + ( -0.01137)
=    0.88863

Layer2.Neuron2.Connection2.New_weight = New_cW222
Layer2.Neuron2.Connection2.Old_weight =    Old_cW222 = 0.9
Layer2.Neuron2.Connection2.connEntry =              cEn222 = 0.574443
Layer2.Neuron2.deltaError =                                    dE22 = -0.14797

New_cW222 = Old_cW222 + (LR x cEn222 x dE22)
=    0.9            + (0.1 x 0.574443 x -0.14797)
=    0.9            + (-0.0085)
=    0.8915

Layer2.Neuron2.New_Bias = New_Bias22
Layer2.Neuron2.Old_Bias =    Old_Bias22 =  0.5
Layer2.Neuron2.deltaError =             dE22 = -0.14797

New_Bias22 = Old_Bias22 + (LR x  1  x  dE22)
=  0.5              + (0.1 x  1  x  -0.14797)
=  0.5            +   (-0.014797)
=  0.485203
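Every update in Step 5 applies the same rule - New_weight = Old_weight + (LR x connEntry x deltaError) - with the bias behaving like a connection whose entry is always 1. A minimal sketch (names are mine; small differences in the last digits come from the rounding in the worked numbers):

```java
// The weight/bias update rule with learning rate LR = 0.1.
public class WeightUpdate {

    public static final double LR = 0.1;

    public static double newWeight(double oldWeight, double connEntry,
                                   double deltaError) {
        return oldWeight + LR * connEntry * deltaError;
    }

    public static void main(String[] args) {
        // Layer2.Neuron1.Connection1: ~0.6036
        System.out.println(newWeight(0.6, 0.768525, 0.047329));

        // Layer2.Neuron1 bias (connEntry is always 1): 0.4047329
        System.out.println(newWeight(0.4, 1.0, 0.047329));

        // Layer2.Neuron2.Connection1: ~0.88863 (negative delta shrinks it)
        System.out.println(newWeight(0.9, 0.768525, -0.14797));
    }
}
```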

--------------------------------------------------------------------------

Step 6) Update Layer_1 neuron connection weights and bias.

Layer 1, Neuron 1 calculations:

Let
Layer1.Neuron1.Connection1.New_weight = New_cW111
Layer1.Neuron1.Connection1.Old_weight   =   Old_cW111   =  0.3
Layer1.Neuron1.Connection1.connEntry =                 cEn111 = 0
Layer1.Neuron1.deltaError =                                       dE11 = -0.01864

New_cW111 = Old_cW111 + (LR   x  cEn111   x   dE11)
=  0.3              +   (0.1   x     0      x    -0.01864)
=  0.3              +   ( 0 )
=  0.3

Layer1.Neuron1.Connection2.New_weight = New_cW112
Layer1.Neuron1.Connection2.Old_weight   =   Old_cW112 = 0.8
Layer1.Neuron1.Connection2.connEntry =               cEn112 = 1
Layer1.Neuron1.deltaError =                                      dE11 = -0.01864

New_cW112 = Old_cW112 + (LR   x  cEn112   x   dE11)
=  0.8    +            (0.1     x    1     x     -0.01864)
=  0.8    +            (-0.001864)
=  0.798136

Layer1.Neuron1.New_Bias = New_Bias11
Layer1.Neuron1.Old_Bias =    Old_Bias11 = 0.5
Layer1.Neuron1.deltaError =             dE11 = -0.01864

New_Bias11 = Old_Bias11 + (LR   x  1   x  dE11)
=  0.5              + (0.1   x 1   x -0.01864 )
=  0.5              + (-0.001864)
=  0.498136

--------------------------------------------------------------------

Layer 1, Neuron 2 calculations:

Layer1.Neuron2.Connection1.New_weight = New_cW121
Layer1.Neuron2.Connection1.Old_weight =    Old_cW121 = 0.1
Layer1.Neuron2.Connection1.connEntry =               cEn121 = 0
Layer1.Neuron2.deltaError =                                     dE12 =   -0.02793

New_cW121 = Old_cW121 + (LR  x  cEn121 x dE12)
=  0.1               + (0.1  x     0     x  -0.02793 )
=  0.1   +   (0)
=  0.1

Layer1.Neuron2.Connection2.New_weight = New_cW122
Layer1.Neuron2.Connection2.Old_weight =    Old_cW122 = 0.1
Layer1.Neuron2.Connection2.connEntry =              cEn122 = 1
Layer1.Neuron2.deltaError =                                    dE12 =  -0.02793

New_cW122 = Old_cW122 + (LR  x  cEn122  x   dE12)
=  0.1                + (0.1   x    1      x  -0.02793)
=  0.1    +  (-0.002793)
=  0.097207

Layer1.Neuron2.New_Bias = New_Bias12
Layer1.Neuron2.Old_Bias =    Old_Bias12 =  0.2
Layer1.Neuron2.deltaError =             dE12 =  -0.02793

New_Bias12 = Old_Bias12 + (LR    x  1  x  dE12)
=  0.2             +   (0.1  x  1  x  -0.02793)
=  0.2             +  (-0.002793)
=  0.197207
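Step 6 applies the identical rule to Layer 1; because the first reading is 0, its connection weights never move. A loop version that updates a whole neuron at once (the array layout is my own choice, not the blog's):

```java
// Update all connection weights of one neuron, plus its bias, in one call.
// Returns {w1', w2', ..., bias'}.
public class Layer1Update {

    public static double[] updateNeuron(double[] weights, double bias,
                                        double[] connEntries,
                                        double deltaError, double lr) {
        double[] updated = new double[weights.length + 1];
        for (int i = 0; i < weights.length; i++) {
            updated[i] = weights[i] + lr * connEntries[i] * deltaError;
        }
        updated[weights.length] = bias + lr * 1.0 * deltaError;  // bias entry = 1
        return updated;
    }

    public static void main(String[] args) {
        double[] entries = {0.0, 1.0};  // the original readings (0,1)

        // Layer1.Neuron1: {0.3, 0.798136, 0.498136}
        double[] n1 = updateNeuron(new double[]{0.3, 0.8}, 0.5,
                                   entries, -0.01864, 0.1);

        // Layer1.Neuron2: {0.1, 0.097207, 0.197207}
        double[] n2 = updateNeuron(new double[]{0.1, 0.1}, 0.2,
                                   entries, -0.02793, 0.1);

        System.out.println(java.util.Arrays.toString(n1));
        System.out.println(java.util.Arrays.toString(n2));
    }
}
```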

----------------------------------------------------------------------

All done. That was just one training cycle. Thank goodness we have computers!
A computer can process these calculations really quickly, and depending on how complicated your neural network is (i.e. the number of layers, and the number of neurons per layer), the training procedure may take some time. But believe me, if you have designed it right, it is well worth the wait.
Because once you have the desired weights and bias values set up, you are good to go. As you receive data, the computer can do a single forward pass in a fraction of a second, and you will get your desired output - hopefully :)

Here is a complete Processing.org script that demonstrates the use of my neural network.

If you liked my tutorial - please let me know in the comments. It is sometimes hard to know if anyone is actually reading this stuff. If you use my code in your own project, I am also happy for you to leave a link to a YouTube video etc. in the comments.

1. It's been nearly a month since this was posted, but this is exactly what I've been searching for. I'm a Cognitive Science student, and I'm trying to study neural networks a bit before I actually attend a formal lecture on them.

I made a duplicate of your code for python 3.1.2 just to see what I could do with this. This explanation is clear and concise, a step by step guide to building the foundation for a basic neural net.

Thank you! :)

2. It took me a month to get my head around this stuff, and as you can see, my neural net structure deviates slightly from the traditional feed-forward neural nets, however, the underlying equations are still there, and the flow of signals is pretty much the same.
Thank you for the feedback!

1. Can I modify this to control a mobile robot? Please answer me soon. By the way, I've been trying to learn NNs for a year, and the 2 hours I spent here were the only time I actually understood it. Thanks a lot, sir.

2. Hi thugonomics, thanks for that comment - appreciate the feedback.
I cannot really say for sure, because I have only ever used this NN for one project on this blog, which was used to detect colours. I guess with the right training, the NN may be able to control a robot, but really, I don't know.

3. Thanks for posting this Scott! I'm doing Stanford's online machine learning course at the moment (ml-class.org) and was getting to the point where I was struggling to see what was happening and wanted a worked example. This was brilliant, and exactly what I needed. Thanks!

4. Hi Nathan,

I am glad you could make sense of my coding system. This stuff is not that easy to understand, and even harder to put into a "readable" format. Good luck with your course !

5. Great. Finally I got it. Thanks to the publisher - Siva Ganesh, GVP PG College, Vizag.

6. I've just come across this and it's great! I studied ANNs a few years ago as part of my degree, and the lecturer took a damn long and confusing route to explain exactly what you have.

Thanks very much.

1. Thanks Schteeve

2. thanks a lot

7. Very good, thanks for this guide. Sorry for my English.

I haven't read all of this guide yet (I'm still reading, but I'm too curious), but I have a question: how many kBytes could an ANN be? For example, if I have 10 inputs, 1 output, and only 1 hidden layer with up to 10 neurons, how can I calculate the memory weight in kBytes?

Maybe it is a stupid question, but I have to buy a microcontroller and I don't know how much memory I will need.

1. Hi Giovanni,

In my examples, the ANN is not run on the Arduino, it is run on the computer using the processing language. I have not tried running this off of a microcontroller directly.
The Arduino sends the sensor data to the Computer (layer input), and the computer does all the hard work.
I'm sorry, I don't know how much memory it uses either.

2. Great project and very useful, thank you.
I have a question. Here the readings are (0,1), so as written above:
Starting with Layer 1:
Layer1.INPUT1 = 0
Layer1.INPUT2 = 1

but can Layer1.INPUT1 and Layer1.INPUT2 come from the A/D converter of the Arduino microcontroller? For example:
Layer1.INPUT1 = 510
Layer1.INPUT2 = 1000

thank you

3. Hi Anonymous,

The input can be as big or as small as you want it to be. The weights will adjust. And depending on what activation function you use, these values are generally transformed into numbers between 0 and 1. But one really good way of seeing this is to substitute your values and see what happens. Time to get the calculator out :)
Also - have a look at Part 2 of this series - it may help explain it a bit more.
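To save some calculator work, here is a hedged sketch of what happens if raw 10-bit ADC readings (say 510 and 1000) are pushed through the Layer 1 weights from this post: the sigmoid saturates, so very large inputs all produce an output of essentially 1 until the weights adapt or the inputs are scaled. This is my own illustration, not code from the tutorial.

```java
// Feeding raw ADC-sized readings through the example's Layer1.Neuron1
// weights: the weighted sum is huge, and the sigmoid pins the output at ~1.
public class AdcSaturation {

    public static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    public static void main(String[] args) {
        double in1 = 510, in2 = 1000;              // raw A/D converter readings
        double sum = in1 * 0.3 + in2 * 0.8 + 0.4;  // 953.4 - far outside ~[-5, 5]
        System.out.println(sigmoid(sum));          // 1.0 (fully saturated)

        // Scaling the readings into 0..1 first keeps the neuron responsive:
        double scaled = (in1 / 1023.0) * 0.3 + (in2 / 1023.0) * 0.8 + 0.4;
        System.out.println(sigmoid(scaled));       // a useful value between 0 and 1
    }
}
```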

8. Great project and very useful, thank you.
I have a question. Here the readings are (0,1), but can they come from the ADC of the Arduino microcontroller? For example, suppose the two readings are (100,512). As you said, the Arduino sends the sensor data to the Computer (layer input); is the sensor data a float or an integer?

thank you

1. Hi Anonymous,
You can send a float or an integer (it depends on how you program it), but it is going to output a float - see Part 2 of this series, and take a look at how the activation function works.

In fact - I would recommend starting from Part 1 and working your way through to the end.. it might make more sense if you do it that way.

9. Hello, thank you very much for your tutorial. I read all your posts about neural networks and they helped me a lot. I made a program in VB6, just before implementing it on a 32-bit microcontroller, but I have a question: I don't understand the layer configuration very well. Do you have 2 inputs and 2 neurons to process them, or are there 2 inputs and 4 neurons (2 neurons in layer 1 and 2 neurons in layer 2)?
With respect to the image at the top, you have 2 inputs for layer 1; can I have 6 inputs and only two neurons in each layer (layer 1 and layer 2)? Sorry, I really don't understand how the layers work, because it seems that if I increase the inputs I must increase the hidden layers. So please, can you answer these questions? Thanks in advance.

1. Hi Anonymous,
You have asked a very good question. In the image at the top of this post, you will notice that there are 2 layers (LAYER1 and LAYER2).
In this specific example, there are 2 Layer inputs. These inputs receive data from the previous layer. However, there is no previous layer for LAYER1, so it by default becomes the neural network input.

Each Neural Network, will have layers, each layer will have neurons, and each neuron will have connections. The layer is mainly a container to group neurons, and also to hold values that will be transmitted to each of the neurons within. It also holds values that will be used to transmit information to the next layer in the neural network.

In this example, there are 2 layers, each layer has 2 neurons. So there are a total of 4 neurons in this Neural Network. But this was only done to simplify the example.

You can have 6 inputs into the neural network, and only have 2 neurons in Layer1. Please take notice of the colours used in the image. If Layer1 had 6 inputs, then each neuron in layer 1 would need 6 connections in order to receive the signal from the layer inputs. Because Layer1 would only have 2 neurons, it will only have 2 outputs. There is one layer output per neuron.
Layer 2 would need to have 2 Inputs in order to receive the signal from Layer 1, but Layer2 could have 1000 neurons. Each neuron in Layer2 would have 2 connections to receive the signal from the layer2 inputs, and if Layer2 had 1000 neurons, then Layer2 would have 1000 outputs.

Did this help explain?
Try to follow the equations on paper... it is a whole lot easier.
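The counting rules in this answer can be checked with a little arithmetic. This sketch uses the hypothetical 6-input / 1000-neuron configuration described above; it is my own illustration of the sizing rules, not code from the blog:

```java
// In this design: every neuron in a layer has one connection per layer
// input, plus a bias, and each layer produces one output per neuron.
public class LayerSizing {

    public static int connectionsPerNeuron(int layerInputs) {
        return layerInputs;                   // one connection per layer input
    }

    public static int layerOutputs(int neurons) {
        return neurons;                       // one layer output per neuron
    }

    public static int weightsInLayer(int layerInputs, int neurons) {
        return neurons * (layerInputs + 1);   // connection weights + 1 bias each
    }

    public static void main(String[] args) {
        // Layer 1: 6 network inputs, 2 neurons
        System.out.println(connectionsPerNeuron(6));  // 6 connections per neuron
        System.out.println(layerOutputs(2));          // 2 layer outputs

        // Layer 2: receives those 2 outputs, but could hold 1000 neurons
        System.out.println(connectionsPerNeuron(2));  // 2 connections per neuron
        System.out.println(layerOutputs(1000));       // 1000 layer outputs
        System.out.println(weightsInLayer(2, 1000));  // 3000 weights to train
    }
}
```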

2. http://arduinobasics.blogspot.com.au/2011/08/neural-network-part-3-layer.html

10. You have an inconsistency in your work. In the lines
Calculate Layer2.Neuron2.NeuronOutput
ConnExit (cEx221) = (cEn221) x Weight (cW221) = 0.768525 x 0.1 = 0.076853;
ConnExit (cEx222) = (cEn222) x Weight (cW222) = 0.574443 x 0.1 = 0.057444;
You have cW221 and cW222 as 0.1, but you initialize them earlier to 0.9.

1. Well picked up... you get 10 points :)
It has taken over 3 years for anyone to notice that (including myself).
Despite the inconsistency, you should still be able to follow along.
I hope that was the only mistake :)

11. There is a problem with the weights, and it seems you have iterated by hand rather than by computer...

Search for bW11: it is 0.4 in the calculation part but 0.5 earlier. And although the bias weight is not used again until the Layer 1 update in Step 6, the 0.5 comes back there. It should be 0.4...

Also, search for cW221 and cW222, which are both 0.9 earlier but change to 0.1 in the forward calculation under Step 2. But when it comes to the delta error in Step 4, 0.9 is used instead of 0.1. I cannot adjust for that in my calculation, as the weights do not change while doing back propagation. Step 5 also uses 0.9.

I cannot verify the bias calculation in this system.

1. Is that the same error as the one mentioned in the comment above?

12. Hey man, great material! I teach robotics for kids here in Brazil and I'm gonna use it to create a class!
Thanks for sharing!

1. Hi Marcelo,

Great to see it put to good use.
Just double check that it still works. It has been a while since I posted this tutorial.

Regards
Scott

13. Thanks Scott, this helped me a lot - very informative...

1. I am glad that it helped.