Learn Python
Learn Data Structure & Algorithm
Learn Numpy
Learn Pandas
Learn Matplotlib
Learn Seaborn
Learn Statistics
Learn Math
Learn MATLAB
Learn Machine learning
Learn Github
Learn OpenCV
Introduction
Setup
ANN
Working process of ANN
Propagation
Bias parameter
Activation function
Loss function
Overfitting and Underfitting
Optimization function
Chain rule
Minima
Gradient problem
Weight initialization
Dropout
ANN Regression Exercise
ANN Classification Exercise
Hyperparameter tuning
CNN
CNN basics
Convolution
Padding
Pooling
Data augmentation
Flattening
Create Custom Dataset
Binary Classification Exercise
Multiclass Classification Exercise
Transfer learning
Transfer model Basic template
RNN
How RNN works
LSTM
Bidirectional RNN
Sequence to sequence
Attention model
Transformer model
Bag of words
Tokenization & Stop words
Stemming & Lemmatization
TF-IDF
N-Gram
Word embedding
Normalization
POS tagging
Parser
Semantic analysis
Regular expression
Learn MySQL
Learn MongoDB
Learn Web scraping
Learn Excel
Learn Power BI
Learn Tableau
Learn Docker
Learn Hadoop
Before starting, we need to know two things:
If we multiply numbers that are between 0 and 1, the result will be smaller than those numbers.
For example:
Suppose we have three numbers 0.1, 0.4, and 0.7. If we multiply these three numbers, the result comes out smaller than all of them,
like: 0.1*0.4*0.7 = 0.028
Here we can see that the result is smaller than the numbers we used for the multiplication. On the other hand, if we multiply numbers that are greater than 1, the result will be greater than those numbers.
For example:
We have three numbers 2, 5, and 8. If we multiply these three numbers, the result comes out greater than all of them, like: 2*5*8 = 80
Here we can see that the result is greater than the numbers we used for the multiplication.
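The same check can be written as a few lines of Python (a minimal sketch; the lists and variable names are just for illustration):

# Repeated multiplication of factors between 0 and 1 shrinks the product,
# while factors greater than 1 make it grow.
small = [0.1, 0.4, 0.7]
large = [2, 5, 8]

product_small = 1.0
for x in small:
    product_small *= x
print(product_small)   # 0.028 -> smaller than every factor

product_large = 1
for x in large:
    product_large *= x
print(product_large)   # 80 -> greater than every factor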
There are two types of gradient problems:
1. Exploding gradient problem
2. Vanishing gradient problem
We know that a neural network has an input layer, hidden layers, and an output layer. The hidden layers and the output layer contain neurons, and each neuron is connected to the neurons of the next layer. If we have multiple hidden layers, the outputs of the first hidden layer's neurons go to the second hidden layer's neurons, the outputs of the second hidden layer go to the third hidden layer's neurons, and this chain continues until we reach the output layer.
In the neurons of the first hidden layer, we multiply the inputs by the weights, sum up those results, and then add a bias. After this, we apply an activation function. This output then goes to the second-layer neurons, where new weights are applied again, and after the calculation another activation function is run. The same thing happens in every neuron of each hidden layer, and finally in the output layer.
So at each neuron of each hidden layer we are multiplying by new weights every time. If we keep the weight values between 0 and 1, the values keep getting smaller after every multiplication. If we keep the weight values greater than 1, the values keep getting larger after every multiplication.
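A minimal NumPy sketch of this chain (the layer sizes and weight scales are assumptions, and the bias and activation function are left out so the effect of repeated multiplication stays visible):

import numpy as np

def forward(x, weight_scale, n_layers=10, size=8):
    # Pass x through n_layers of dense layers; each layer's output feeds the next.
    rng = np.random.default_rng(0)
    a = x
    for layer in range(n_layers):
        W = weight_scale * rng.standard_normal((size, size))
        a = W @ a                              # multiply by this layer's weights
        print(layer, float(np.abs(a).mean()))  # typical size of the layer output
    return a

x = np.ones(8)
forward(x, weight_scale=0.05)   # values shrink toward 0, layer after layer
forward(x, weight_scale=2.0)    # values grow larger and larger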
Formula of optimization (the weight update rule):
Wt = Wt-1 - η (dL/dWt-1)
Here,
Wt = new weight
Wt-1 = previous weight
dL/dWt-1 = gradient of the loss L with respect to the previous weight
η = learning rate
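As a minimal sketch of this rule for a single weight (the quadratic loss below is an assumed toy example, not from the text; only the update line matters):

def loss(w):
    return (w - 3.0) ** 2          # toy loss with its minimum at w = 3

def grad(w):
    return 2.0 * (w - 3.0)         # dL/dw for the toy loss

w = 0.0                            # previous weight Wt-1
eta = 0.1                          # learning rate η
for step in range(25):
    w = w - eta * grad(w)          # new weight Wt = Wt-1 - η (dL/dWt-1)
print(w)                           # close to 3, the minimum of the toy loss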
When we optimize our model, if we take weight values greater than 1, then the value of dL/dWt-1 becomes very high. Because of this, the new weight never reaches the minimum: the step size is so big that the update keeps jumping past the minimum point and never gets a chance to stop there.
When the value of dL/dWt-1 is very high like this, it is called the exploding gradient problem.
If we take weight values between 0 and 1, then the value of dL/dWt-1 becomes very low. When dL/dWt-1 is too low, there is no difference, or only a negligible difference, between the new weight and the previous weight, so the weights effectively stop updating. This problem is called the vanishing gradient problem.
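This connects back to the multiplication idea at the top: through the chain rule, the gradient that reaches an early layer is roughly a product of one factor per layer. A minimal sketch (the per-layer factors and the layer count are assumed numbers for illustration):

# Factors between 0 and 1 make the gradient vanish;
# factors greater than 1 make it explode.
n_layers = 30

grad_vanish = 1.0
for _ in range(n_layers):
    grad_vanish *= 0.5             # per-layer factor between 0 and 1
print(grad_vanish)                 # about 9e-10 -> vanishing gradient

grad_explode = 1.0
for _ in range(n_layers):
    grad_explode *= 1.5            # per-layer factor greater than 1
print(grad_explode)                # about 1.9e5 -> exploding gradient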