Learn Python
Learn Data Structure & Algorithm
Learn Numpy
Learn Pandas
Learn Matplotlib
Learn Seaborn
Learn Statistics
Learn Math
Learn MATLAB
Learn Machine learning
Learn GitHub
Learn OpenCV
Introduction
Setup
ANN
Working process of ANN
Propagation
Bias parameter
Activation function
Loss function
Overfitting and Underfitting
Optimization function
Chain rule
Minima
Gradient problem
Weight initialization
Dropout
ANN Regression Exercise
ANN Classification Exercise
Hyperparameter tuning
CNN
CNN basics
Convolution
Padding
Pooling
Data augmentation
Flattening
Create Custom Dataset
Binary Classification Exercise
Multiclass Classification Exercise
Transfer learning
Transfer model basic template
RNN
How RNN works
LSTM
Bidirectional RNN
Sequence to sequence
Attention model
Transformer model
Bag of words
Tokenization & Stop words
Stemming & Lemmatization
TF-IDF
N-Gram
Word embedding
Normalization
POS tagging
Parser
Semantic analysis
Regular expression
Learn MySQL
Learn MongoDB
Learn Web scraping
Learn Excel
Learn Power BI
Learn Tableau
Learn Docker
Learn Hadoop
In ANN and CNN we first have an input layer, then hidden layers, and then an output layer. The input goes from the input layer to the hidden layer, some calculations happen there, and the output of the hidden layer then goes to the output layer. This is the flow of ANN and CNN.
There is no loop here: only the current input is considered, and there is no memory. Because of this, ANN and CNN can't handle sequential data. They consider only the current input, but to work with sequential data we have to consider both the previous input and the current input. RNN was created as a solution.
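The contrast can be sketched in a few lines of Python. This is a minimal, hypothetical illustration using scalar weights (real layers use weight matrices), not any library's actual API:

```python
import numpy as np

def feedforward_step(x, w, b):
    # ANN/CNN style: only the current input x is used
    return np.tanh(x * w + b)

def rnn_step(x, prev_out, w, w_prev, b):
    # RNN style: the current input x AND the previous step's
    # output (the "memory") are combined
    return np.tanh(x * w + prev_out * w_prev + b)
```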
How does RNN solve the problem?
Here we have memory, and we consider both the previous and the current input. What the memory does is store the previous inputs and supply them to the hidden layer along with weights.
Now let's walk through an example of how RNN works.
Suppose we have four words x1, x2, x3, x4, and together these words form a sentence. Now we have to identify whether it is a positive or a negative sentence.
In the image,
x1, x2, x3, x4 are the inputs or words.
w, w1, w2, w3, w4 are the weights.
o1, o2, o3, o4, o5 are the outputs.
Y is our main output; the square boxes are the hidden layers, and the circles inside the boxes are the neurons. The empty square box is the output layer.
t1, t2, t3, t4 are the time steps: t1 means the first time step (the first iteration) and t4 means the fourth time step (the fourth iteration).
First hidden layer:
The first input x1 and the weight w go to the first hidden layer, where some calculation happens.
The calculation is: take the weighted sum of the inputs, add the bias, and then apply an activation function. After doing this we get an output o1. All of this happens at time one (t1, the first step or first iteration).
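As a rough sketch of this first step (all numbers are made-up examples, and o0 = 0 stands in for the previous output that doesn't exist yet, as explained further below):

```python
import numpy as np

x1, w = 0.5, 0.8      # first word and its weight (example values)
o0, w0 = 0.0, 0.4     # default previous output (always 0) and its weight
b = 0.1               # bias

# weighted sum of the inputs, plus bias, through an activation function
o1 = np.tanh(x1 * w + o0 * w0 + b)
print(o1)             # the first hidden layer's output
```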
Second hidden layer:
Then at t2 (the second step or iteration), we take the second input x2 with the weight w, and these go to the second hidden layer. In the second hidden layer another input also arrives with a weight: the output o1 that we got from the first hidden layer, together with the weight w1. So the previous input (output o1, weight w1) and the current input (input x2, weight w) go to the second hidden layer, and the same calculations happen there again. After the calculation we get another output o2.
Third hidden layer:
At t3 (the third step or iteration), the new input x3 and the weight w go to the third hidden layer. Along with them goes the previous input: the output o2 that we got from the second hidden layer, together with the weight w2. So x3, w and o2, w2 go to the third hidden layer, and again all those calculations happen.
These steps repeat at every time step until we reach the output layer, where we do the final calculation and get the output. The output layer gets one input.
Fourth hidden layer:
For the fourth word or input, we take the third hidden layer's output o3 and the weight w3 as the previous input, and the current input is x4 with the weight w. The same calculation happens here as well. The output of the fourth hidden layer then goes to the output layer.
Output layer:
The output o4 and the weight w4 go into the output layer. The same calculation happens there, and we get the final output.
Remember, we wanted to identify one thing: is our sentence positive or negative?
To identify this we have to work with all the words and maintain their sequence, because if we don't maintain the sequence we won't get the correct sentence to classify. If we want to work with the second word x2, we must take the first word x1; and if we want to work with the third word, we must take the first and second words x1, x2, because without knowing the previous words, how would we know what the current word is there for?
In the diagram we can see that the first word x1 (the current input) goes to the first hidden layer. In a real scenario there is no previous output at that point, so we take a default output o0, whose value is always 0, along with a default weight w0.
If we have more hidden layers, the process is the same. This is how forward propagation happens in RNN.
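Putting the four steps together, a hedged sketch of the whole forward pass might look like this. Scalar weights and made-up values are used for readability; note that standard RNN implementations share one recurrent weight across all steps, whereas the walkthrough above labels them w0...w3 per step:

```python
import numpy as np

x = [0.5, -0.2, 0.9, 0.3]        # x1..x4, the four word inputs (examples)
w, b = 0.8, 0.1                  # current-input weight and bias
w_prev = [0.0, 0.3, 0.5, 0.4]    # w0..w3, weights on the previous outputs

o = 0.0                          # o0: the default previous output
for xt, wp in zip(x, w_prev):    # t1, t2, t3, t4
    o = np.tanh(xt * w + o * wp + b)   # the same calculation at every step

# output layer: the last hidden output o4 goes in with the weight w4
w4 = 0.6
y = 1 / (1 + np.exp(-(o * w4)))  # sigmoid squashes to a 0..1 score
print(y)                         # e.g. > 0.5 -> positive, else negative
```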
We then have to find the loss using the loss function. In backpropagation our target is to reduce this loss by updating the weights. To update the weights we use optimizers.
Optimization formula: Wt = Wt-1 - η*(dL/dWt-1)
Here,
Wt = new weight
Wt-1 = previous weight
dL = change in the loss
dWt-1 = change in the previous weight
η = learning rate
dL/dWt-1 = slope (the derivative of the loss with respect to the previous weight)
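In code, one such update is a single line (the numbers here are hypothetical):

```python
lr = 0.01                    # η, the learning rate
w_old = 0.8                  # Wt-1, the previous weight
slope = 0.25                 # dL/dWt-1, found with the chain rule below
w_new = w_old - lr * slope   # Wt = Wt-1 - η*(dL/dWt-1)
```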
Let's see how to find the slope with the help of the chain rule.
These updates happen in the backward direction.
Output layer:
Moving backward, the first weight is w4. The weight w4 depends on the output o5, so to update this weight we have to take the value of o5.
The derivative of the loss with respect to w4, i.e. the slope, is:
dl/dw4 = (dl/do5) * (do5/dw4)
Now we put this value into the formula, and after the calculation the weight is updated.
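For instance, with made-up derivative values, the update for w4 would look like:

```python
lr = 0.01
dl_do5 = 0.30                 # how the loss changes with the output o5
do5_dw4 = 0.50                # how o5 changes with the weight w4
dl_dw4 = dl_do5 * do5_dw4     # chain rule: dl/dw4 = (dl/do5)*(do5/dw4)
w4 = 0.6 - lr * dl_dw4        # plug the slope into the update formula
```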
Fourth hidden layer:
Two weights, w3 and w, come into the fourth neuron. Which one should be updated first? In this situation we always update the weight coming from the current input first. For the fourth hidden layer the current-input weight is w, so we update it first. Here the weight w depends on the output o4, and o4 depends on the output o5.
So according to the chain rule:
dl/dw = (dl/do5) * (do5/do4) * (do4/dw)
Now we put this value into the formula, and after the calculation the weight is updated.
The weight w3 likewise depends on the outputs o4 and o5.
So according to the chain rule:
dl/dw3 = (dl/do5) * (do5/do4) * (do4/dw3)
Now we put this value into the formula, and after the calculation the weight is updated.
Third hidden layer:
In the third hidden layer, the current-input weight w depends on the outputs o3, o4 and o5.
So according to the chain rule:
dl/dw = (dl/do5) * (do5/do4) * (do4/do3) * (do3/dw)
Now we put this value into the formula, and after the calculation the weight is updated.
The third hidden layer's previous-input weight w2 also depends on the outputs o3, o4 and o5.
So according to the chain rule:
dl/dw2 = (dl/do5) * (do5/do4) * (do4/do3) * (do3/dw2)
Now we put this value into the formula, and after the calculation the weight is updated.
For the weights of the other neurons the same process is followed. This is how backpropagation happens in RNN; the sketch below ties the forward and backward passes together.
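As a closing illustration, here is a hedged, end-to-end sketch of backprop through time for the scalar RNN used in this walkthrough. All values are made-up examples; a squared-error loss and a sigmoid output layer are assumed, bias terms are omitted for brevity, and real implementations use weight matrices and let an autograd library do this bookkeeping:

```python
import numpy as np

def tanh_grad(a):
    # derivative of tanh, expressed through its own output a = tanh(z)
    return 1.0 - a * a

x = [0.5, -0.2, 0.9, 0.3]        # x1..x4
w, w4 = 0.8, 0.6                 # current-input weight, output-layer weight
w_prev = [0.0, 0.3, 0.5, 0.4]    # w0..w3, weights on the previous outputs
target = 1.0                     # 1 = positive sentence, 0 = negative

# --- forward pass, keeping every output for the backward pass ---
outs = [0.0]                     # o0 = 0 by default
for xt, wp in zip(x, w_prev):
    outs.append(np.tanh(xt * w + outs[-1] * wp))
o5 = 1 / (1 + np.exp(-(outs[-1] * w4)))   # final output
loss = (o5 - target) ** 2

# --- backward pass: the chain rule, moving right to left ---
dl_do5 = 2 * (o5 - target)
do5_dz = o5 * (1 - o5)                    # sigmoid derivative
dl_dw4 = dl_do5 * do5_dz * outs[-1]       # dl/dw4 = (dl/do5)*(do5/dw4)

dl_do = dl_do5 * do5_dz * w4              # dl/do4 = (dl/do5)*(do5/do4)
dl_dw, dl_dw_prev = 0.0, [0.0] * 4
for t in range(3, -1, -1):                # steps t4 down to t1
    g = dl_do * tanh_grad(outs[t + 1])    # gradient at this step's sum
    dl_dw += g * x[t]                     # current-input weight w is shared
    dl_dw_prev[t] = g * outs[t]           # previous-output weight for step t
    dl_do = g * w_prev[t]                 # pass the gradient one step back

# --- update every weight: Wt = Wt-1 - η*(dL/dWt-1) ---
lr = 0.01
w4 -= lr * dl_dw4
w -= lr * dl_dw
w_prev = [wp - lr * g for wp, g in zip(w_prev, dl_dw_prev)]
print(loss, w, w4, w_prev)
```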