Learn Python
Learn Data Structures & Algorithms
Learn Numpy
Learn Pandas
Learn Matplotlib
Learn Seaborn
Learn Statistics
Learn Math
Learn MATLAB
Learn Machine learning
Learn GitHub
Learn OpenCV
Introduction
Setup
ANN
Working process of ANN
Propagation
Bias parameter
Activation function
Loss function
Overfitting and Underfitting
Optimization function
Chain rule
Minima
Gradient problem
Weight initialization
Dropout
ANN Regression Exercise
ANN Classification Exercise
Hyperparameter tuning
CNN
CNN basics
Convolution
Padding
Pooling
Data augmentation
Flattening
Create Custom Dataset
Binary Classification Exercise
Multiclass Classification Exercise
Transfer learning
Transfer model basic template
RNN
How RNN works
LSTM
Bidirectional RNN
Sequence to sequence
Attention model
Transformer model
Bag of words
Tokenization & Stop words
Stemming & Lemmatization
TF-IDF
N-Gram
Word embedding
Normalization
POS tagging
Parser
Semantic analysis
Regular expression
Learn MySQL
Learn MongoDB
Learn Web scraping
Learn Excel
Learn Power BI
Learn Tableau
Learn Docker
Learn Hadoop
Suppose we are using mini-batch gradient descent and the MSE loss function in backpropagation.
Now let's draw a convergence graph:
On the x-axis we have our weights. Here w1 is the initial weight (purple marked).
If we draw a tangent line at the point of the curve above w1 (black marked), its slope gives us the derivative dL/dw at w1.
Based on this derivative value we increase or decrease the weight so that we move toward the center location (green marked point).
This center point, the green marked point, is called the global minimum.
The global minimum is the best weight point for predicting the actual output. Remember, in backpropagation we update the weights so that we end up with the best weight for predicting the actual output.
In the image, w1 is our current weight, and we keep updating it to reach that best weight.
Now, where is the best weight?
The best weight lies at the global minimum. So in backpropagation the weight updates happen in order to reach the best weight, and this best weight is located at the global minimum.
Here we have to remember one thing: at the global minimum, dL/dw (the derivative of the loss with respect to the weight) is 0, or in other words, the slope is 0.
Now if we put dL/dw = 0 into the weight update formula, then w_new and w_old become equal, because the dL/dw (slope) term is 0. If w_old and w_new are equal, there is no need to update the weight any more, and it is treated as the perfect weight.
Weight update formula:
W_t = W_(t-1) - η * (dL/dW_(t-1))
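To make the update rule concrete, here is a minimal sketch of mini-batch gradient descent with MSE on a one-weight toy model y = w * x. The data, batch size, learning rate, and step count are all illustrative assumptions, not values from this text:

```python
import numpy as np

# Mini-batch gradient descent with MSE on a toy model y = w * x.
# Implements the update W_t = W_(t-1) - eta * dL/dW_(t-1).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 100)
y = 2.0 * x                      # true relation: y = 2 * x

w = 5.0                          # initial weight (the "w1" in the graph)
eta = 0.1                        # learning rate
batch_size = 16

for step in range(500):
    idx = rng.choice(len(x), size=batch_size, replace=False)
    xb, yb = x[idx], y[idx]
    grad = np.mean(2 * (w * xb - yb) * xb)   # dL/dw for MSE on this mini-batch
    w -= eta * grad                          # W_t = W_(t-1) - eta * dL/dW

print(w)  # close to 2.0; near the global minimum grad ~ 0, so w barely changes
```

Note how the update stalls by itself: once dL/dw is 0, w_new equals w_old, exactly as described above.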
If we use mini-batch gradient descent and MSE, we get the kind of curve we saw in the first image. But if we use a different loss function, we will get a different type of graph.
Let's draw another graph for a different loss function:
In this graph we can see that there are a lot of valleys.
Now the question is: in this case, which point will be the global minimum?
To find it, we have to see which valley comes closest to the x-axis, i.e. which one has the lowest loss. The valley that is nearest to the x-axis holds the global minimum (green marked point). Here the valley with the green marked point is nearest to the x-axis, so that point is the global minimum.
Here we have many valleys.
If one of them holds the global minimum, then what are the others?
The answer: the other low points are called local minima (blue marked points). They are called local minima because, within a specific area of the graph, those points hold the best weight value.
The brown marked points are local maxima.
So we can say that local minima are points that hold the best weight value for a particular region of the graph, while the global minimum is the point that holds the best weight value over the whole graph.
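Here is a minimal sketch of this idea: we scan a toy non-convex "loss" curve (an arbitrary function chosen only for illustration) over a grid of weights and separate the global minimum from the local minima:

```python
import numpy as np

# Toy non-convex "loss" curve with several valleys (illustrative choice).
def loss(w):
    return np.sin(3 * w) + 0.3 * w ** 2

w_grid = np.linspace(-4, 4, 2000)
l_grid = loss(w_grid)

# Global minimum: the lowest loss over the whole grid.
w_global = w_grid[np.argmin(l_grid)]

# Local minima: grid points lower than both of their neighbors
# (the global minimum is also one of these).
interior = (l_grid[1:-1] < l_grid[:-2]) & (l_grid[1:-1] < l_grid[2:])
w_locals = w_grid[1:-1][interior]

print("global minimum near w =", round(float(w_global), 3))
print("local minima near w   =", np.round(w_locals, 3))
```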
Convex Function:
In the image of the convex function we can see two regions, one marked in orange and the other marked in red. Now if we take any two points and connect them, we will see that those two points and all the points that lie between them belong to the same region.
Convex loss functions occur mostly in classical machine learning, e.g. in techniques like linear and logistic regression.
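This can be checked numerically from the definition of convexity, f(t*a + (1-t)*b) <= t*f(a) + (1-t)*f(b) for all t in [0, 1]. A minimal sketch, with an illustrative parabola standing in for a convex loss:

```python
import numpy as np

# Numerically check the convexity inequality
# f(t*a + (1-t)*b) <= t*f(a) + (1-t)*f(b) for t in [0, 1].
def is_convex_between(f, a, b, num=101):
    t = np.linspace(0.0, 1.0, num)
    chord = t * f(a) + (1 - t) * f(b)    # straight line between the two points
    curve = f(t * a + (1 - t) * b)       # the function along that segment
    return bool(np.all(curve <= chord + 1e-12))

convex_f = lambda w: (w - 2) ** 2        # parabola: convex, a single valley
print(is_convex_between(convex_f, -3.0, 5.0))   # True
```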
Non-convex Function:
In the image of the non-convex function we can see two regions, one marked in green and the other marked in yellow. Now if we take two points and connect them, we will see that those two points and the points that lie between them are not all from the same region.
Non-convex loss functions occur mostly in deep learning.
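Reusing the is_convex_between helper and the wavy toy loss from the sketches above, the inequality fails, which confirms that curve is non-convex:

```python
# Reusing is_convex_between and the wavy toy loss from the sketches above.
nonconvex_f = lambda w: np.sin(3 * w) + 0.3 * w ** 2
print(is_convex_between(nonconvex_f, 0.0, 2.0))   # False: the curve rises above the chord
```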