Learn Python
Learn Data Structure & Algorithm
Learn Numpy
Learn Pandas
Learn Matplotlib
Learn Seaborn
Learn Statistics
Learn Math
Learn MATLAB
Learn Machine learning
Learn GitHub
Learn OpenCV
Introduction
Setup
ANN
Working process of ANN
Propagation
Bias parameter
Activation function
Loss function
Overfitting and Underfitting
Optimization function
Chain rule
Minima
Gradient problem
Weight initialization
Dropout
ANN Regression Exercise
ANN Classification Exercise
Hyperparameter tuning
CNN
CNN basics
Convolution
Padding
Pooling
Data augmentation
Flattening
Create Custom Dataset
Binary Classification Exercise
Multiclass Classification Exercise
Transfer learning
Transfer model basic template
RNN
How RNN works
LSTM
Bidirectional RNN
Sequence to sequence
Attention model
Transformer model
Bag of words
Tokenization & Stop words
Stemming & Lemmatization
TF-IDF
N-Gram
Word embedding
Normalization
POS tagging
Parser
Semantic analysis
Regular expression
Learn MySQL
Learn MongoDB
Learn Web scraping
Learn Excel
Learn Power BI
Learn Tableau
Learn Docker
Learn Hadoop
In ANN and CNN we first have an input layer, then hidden layers, and then an output layer. The input goes from the input layer to the hidden layer, some calculations happen there, and the output of the hidden layer then goes to the output layer. This is the flow of ANN and CNN.
There is no loop here: only the current input is considered, and there is no memory. Because of this, ANN and CNN can't handle sequential data. They consider only the current input, but to work with sequential data we have to consider both the previous input and the current input. RNN was created as a solution.
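The contrast can be sketched in a few lines of Python. This is a minimal, hypothetical illustration using scalar weights (real layers use weight matrices), not any library's actual API:

```python
import numpy as np

def feedforward_step(x, w, b):
    # ANN/CNN style: only the current input x is used
    return np.tanh(x * w + b)

def rnn_step(x, prev_out, w, w_prev, b):
    # RNN style: the current input x AND the previous step's
    # output (the "memory") are combined
    return np.tanh(x * w + prev_out * w_prev + b)
```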
How does RNN solve the problem?
Here we have memory, and we consider both the previous and the current input. What the memory does is store the previous inputs and supply them to the hidden layer along with weights.
Now let's walk through an example of how RNN works.
Suppose we have four words x1, x2, x3, x4, and together these words form a sentence. Now we have to identify whether it is a positive or a negative sentence.
In the image,
x1, x2, x3, x4 are the inputs or words.
w, w1, w2, w3, w4 are the weights.
o1, o2, o3, o4, o5 are the outputs.
Y is our main output; the square boxes are the hidden layers, and the circles inside the boxes are the neurons. The empty square box is the output layer.
t1, t2, t3, t4 are the time steps: t1 means the first time step (the first iteration) and t4 means the fourth time step (the fourth iteration).
First hidden layer:
The first input x1 and the weight w go to the first hidden layer, where some calculation happens.
The calculation is: take the weighted sum of the inputs, add the bias, and then apply an activation function. After doing this we get an output o1. All of this happens at time one (t1, the first step or first iteration).
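As a rough sketch of this first step (all numbers are made-up examples, and o0 = 0 stands in for the previous output that doesn't exist yet, as explained further below):

```python
import numpy as np

x1, w = 0.5, 0.8      # first word and its weight (example values)
o0, w0 = 0.0, 0.4     # default previous output (always 0) and its weight
b = 0.1               # bias

# weighted sum of the inputs, plus bias, through an activation function
o1 = np.tanh(x1 * w + o0 * w0 + b)
print(o1)             # the first hidden layer's output
```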
Second hidden layer:
Then at t2 (the second step or iteration), we take the second input x2 with the weight w, and these go to the second hidden layer. In the second hidden layer another input also arrives with a weight: the output o1 that we got from the first hidden layer, together with the weight w1. So the previous input (output o1, weight w1) and the current input (input x2, weight w) go to the second hidden layer, and the same calculations happen there again. After the calculation we get another output o2.
Third hidden layer:
At t3 (the third step or iteration), the new input x3 and the weight w go to the third hidden layer. Along with them goes the previous input: the output o2 that we got from the second hidden layer, together with the weight w2. So x3, w and o2, w2 go to the third hidden layer, and again all those calculations happen.
These steps repeat at every time step until we reach the output layer, where we do the final calculation and get the output. The output layer gets one input.
Fourth hidden layer:
For the fourth word or input, we take the third hidden layer's output o3 and the weight w3 as the previous input, and the current input is x4 with the weight w. The same calculation happens here as well. The output of the fourth hidden layer then goes to the output layer.
Output layer:
The output o4 and the weight w4 go into the output layer. The same calculation happens there, and we get the final output.
Remember, we wanted to identify one thing: is our sentence positive or negative?
To identify this we have to work with all the words and maintain their sequence, because if we don't maintain the sequence we won't get the correct sentence to classify. If we want to work with the second word x2, we must take the first word x1; and if we want to work with the third word, we must take the first and second words x1, x2, because without knowing the previous words, how would we know what the current word is there for?
In the diagram we can see that the first word x1 (the current input) goes to the first hidden layer. In a real scenario there is no previous output at that point, so we take a default output o0, whose value is always 0, along with a default weight w0.
If we have more hidden layers, the process is the same. This is how forward propagation happens in RNN.
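Putting the four steps together, a hedged sketch of the whole forward pass might look like this. Scalar weights and made-up values are used for readability; note that standard RNN implementations share one recurrent weight across all steps, whereas the walkthrough above labels them w0...w3 per step:

```python
import numpy as np

x = [0.5, -0.2, 0.9, 0.3]        # x1..x4, the four word inputs (examples)
w, b = 0.8, 0.1                  # current-input weight and bias
w_prev = [0.0, 0.3, 0.5, 0.4]    # w0..w3, weights on the previous outputs

o = 0.0                          # o0: the default previous output
for xt, wp in zip(x, w_prev):    # t1, t2, t3, t4
    o = np.tanh(xt * w + o * wp + b)   # the same calculation at every step

# output layer: the last hidden output o4 goes in with the weight w4
w4 = 0.6
y = 1 / (1 + np.exp(-(o * w4)))  # sigmoid squashes to a 0..1 score
print(y)                         # e.g. > 0.5 -> positive, else negative
```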
We then have to find the loss using the loss function. In backpropagation our target is to reduce this loss by updating the weights. To update the weights we use optimizers.
Optimization formula: Wt = Wt-1 - η*(dL/dWt-1)
Here,
Wt = new weight
Wt-1 = previous weight
dL = change in the loss
dWt-1 = change in the previous weight
η = learning rate
dL/dWt-1 = slope (the derivative of the loss with respect to the previous weight)
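In code, one such update is a single line (the numbers here are hypothetical):

```python
lr = 0.01                    # η, the learning rate
w_old = 0.8                  # Wt-1, the previous weight
slope = 0.25                 # dL/dWt-1, found with the chain rule below
w_new = w_old - lr * slope   # Wt = Wt-1 - η*(dL/dWt-1)
```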
Let's see how to find the slope with the help of the chain rule.
These updates happen in the backward direction.
Output layer:
Moving backward, the first weight is w4. The weight w4 depends on the output o5, so to update this weight we have to take the value of o5.
The derivative of the loss with respect to w4, i.e. the slope, is:
dl/dw4 = (dl/do5) * (do5/dw4)
Now we put this value into the formula, and after the calculation the weight is updated.
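For instance, with made-up derivative values, the update for w4 would look like:

```python
lr = 0.01
dl_do5 = 0.30                 # how the loss changes with the output o5
do5_dw4 = 0.50                # how o5 changes with the weight w4
dl_dw4 = dl_do5 * do5_dw4     # chain rule: dl/dw4 = (dl/do5)*(do5/dw4)
w4 = 0.6 - lr * dl_dw4        # plug the slope into the update formula
```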
Fourth hidden layer:
Two weights, w3 and w, come into the fourth neuron. Which one should be updated first? In this situation we always update the weight coming from the current input first. For the fourth hidden layer the current-input weight is w, so we update it first. Here the weight w depends on the output o4, and o4 depends on the output o5.
So according to the chain rule:
dl/dw = (dl/do5) * (do5/do4) * (do4/dw)
Now we put this value into the formula, and after the calculation the weight is updated.
The weight w3 likewise depends on the outputs o4 and o5.
So according to the chain rule:
dl/dw3 = (dl/do5) * (do5/do4) * (do4/dw3)
Now we put this value into the formula, and after the calculation the weight is updated.
Third hidden layer:
In the third hidden layer, the current-input weight w depends on the outputs o3, o4 and o5.
So according to the chain rule:
dl/dw = (dl/do5) * (do5/do4) * (do4/do3) * (do3/dw)
Now we put this value into the formula, and after the calculation the weight is updated.
The third hidden layer's previous-input weight w2 also depends on the outputs o3, o4 and o5.
So according to the chain rule:
dl/dw2 = (dl/do5) * (do5/do4) * (do4/do3) * (do3/dw2)
Now we put this value into the formula, and after the calculation the weight is updated.
For the weights of the other neurons the same process is followed. This is how backpropagation happens in RNN; the sketch below ties the forward and backward passes together.
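As a closing illustration, here is a hedged, end-to-end sketch of backprop through time for the scalar RNN used in this walkthrough. All values are made-up examples; a squared-error loss and a sigmoid output layer are assumed, bias terms are omitted for brevity, and real implementations use weight matrices and let an autograd library do this bookkeeping:

```python
import numpy as np

def tanh_grad(a):
    # derivative of tanh, expressed through its own output a = tanh(z)
    return 1.0 - a * a

x = [0.5, -0.2, 0.9, 0.3]        # x1..x4
w, w4 = 0.8, 0.6                 # current-input weight, output-layer weight
w_prev = [0.0, 0.3, 0.5, 0.4]    # w0..w3, weights on the previous outputs
target = 1.0                     # 1 = positive sentence, 0 = negative

# --- forward pass, keeping every output for the backward pass ---
outs = [0.0]                     # o0 = 0 by default
for xt, wp in zip(x, w_prev):
    outs.append(np.tanh(xt * w + outs[-1] * wp))
o5 = 1 / (1 + np.exp(-(outs[-1] * w4)))   # final output
loss = (o5 - target) ** 2

# --- backward pass: the chain rule, moving right to left ---
dl_do5 = 2 * (o5 - target)
do5_dz = o5 * (1 - o5)                    # sigmoid derivative
dl_dw4 = dl_do5 * do5_dz * outs[-1]       # dl/dw4 = (dl/do5)*(do5/dw4)

dl_do = dl_do5 * do5_dz * w4              # dl/do4 = (dl/do5)*(do5/do4)
dl_dw, dl_dw_prev = 0.0, [0.0] * 4
for t in range(3, -1, -1):                # steps t4 down to t1
    g = dl_do * tanh_grad(outs[t + 1])    # gradient at this step's sum
    dl_dw += g * x[t]                     # current-input weight w is shared
    dl_dw_prev[t] = g * outs[t]           # previous-output weight for step t
    dl_do = g * w_prev[t]                 # pass the gradient one step back

# --- update every weight: Wt = Wt-1 - η*(dL/dWt-1) ---
lr = 0.01
w4 -= lr * dl_dw4
w -= lr * dl_dw
w_prev = [wp - lr * g for wp, g in zip(w_prev, dl_dw_prev)]
print(loss, w, w4, w_prev)
```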