Learn Python
Learn Data Structure & Algorithm
Learn Numpy
Learn Pandas
Learn Matplotlib
Learn Seaborn
Learn Statistics
Learn Math
Learn MATLAB
Introduction
Setup
Read data
Data preprocessing
Data cleaning
Handle date-time column
Handling outliers
Encoding
Feature_Engineering
Feature selection filter methods
Feature selection wrapper methods
Multicollinearity
Data split
Feature scaling
Supervised Learning
Regression
Classification
Bias and Variance
Overfitting and Underfitting
Regularization
Ensemble learning
Unsupervised Learning
Clustering
Association Rule
Common
Model evaluation
Cross Validation
Parameter tuning
Code Exercise
Car Price Prediction
Flight Fare Prediction
Diabetes Prediction
Spam Mail Prediction
Fake News Prediction
Boston House Price Prediction
Learn Github
Learn OpenCV
Learn Deep Learning
Learn MySQL
Learn MongoDB
Learn Web scraping
Learn Excel
Learn Power BI
Learn Tableau
Learn Docker
Learn Hadoop
Regularization is used to generalize a machine learning model, i.e., to reduce its error on unseen data so that it achieves better accuracy in practice.
The advantage of lasso regression is that it helps avoid overfitting. Overfitting occurs when the trained model performs well on the training data but poorly on the testing data. Lasso regression works by applying a penalty parameter that controls the error. Lasso shrinks the coefficients to zero or near zero, which means that after applying lasso some independent variables will have coefficients near zero and some will have coefficients of exactly zero; those variables effectively vanish from the model.
Now a question can arise: if some of the features vanish, isn't that a problem? The answer is no, because only the features that are weakly correlated with the target variable or feature are deleted. So lasso reduces the error, meaning it generalizes the model, and it also performs feature selection.
Formula: Σ(y-Y)^2 + α(Σ|w|)
Here,
y = actual value
Y = predicted value
Σ(y-Y)^2 = loss (the sum of the squared differences between actual and predicted values)
α = penalty
w = the coefficient (weight) of each independent variable. The penalty adds the absolute value of each coefficient individually: if there are 3 independent variables, it adds |w1|, then |w2|, then |w3|.
What is the penalty?
The penalty α is a value such as 0.1, 1, or 2 that is multiplied by the coefficient term and added to the loss. It controls how strongly the coefficients are pushed toward zero.
How does lasso work?
Suppose you have two data points and you draw a best-fitted line that passes exactly through them. After drawing that line, the error on those two training points is zero. Now, when you test the model on a test dataset, new data points arrive and the error becomes larger than 0, because the line was fitted only to the two points in the training dataset. The target now is to reduce that error. According to the formula, lasso takes the loss and adds to it the penalty multiplied by the sum of the absolute values of the coefficients. Minimizing this combined objective shrinks the coefficients to zero or near zero and so reduces the error on new data.
Example:
You have an equation: y = m1*X1 + m2*X2 + C = 20*X1 + 35*X2 + 25.
You can see that the value of m2 is large. When this large coefficient is multiplied by X2, that term dominates the prediction, so any noise in X2 produces a big error. If you can somehow reduce the value of m2, you can reduce that error. This is what the penalty term, α times the sum of the absolute values of the coefficients, does: after the calculation the coefficients are shrunk and the model's error is generalized, i.e., reduced.
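To make this arithmetic concrete, here is a minimal Python sketch that computes only the penalty term for the original and the shrunken coefficients. The penalty value alpha = 1 and the shrunken value 5 are made-up illustration values, not part of the notes above:

# Minimal sketch: the lasso penalty term alpha * (|m1| + |m2|).
# The coefficients 20 and 35 come from the example equation; alpha = 1
# and the shrunken value 5 are assumptions for illustration.
alpha = 1.0
original_coefs = [20, 35]   # m1, m2 from y = 20*X1 + 35*X2 + 25
shrunken_coefs = [20, 5]    # the same model after lasso shrinks m2
penalty_original = alpha * sum(abs(c) for c in original_coefs)
penalty_shrunken = alpha * sum(abs(c) for c in shrunken_coefs)
print(penalty_original)     # 55.0 -> large coefficients cost more
print(penalty_shrunken)     # 25.0 -> shrinking m2 lowers the penalty
# Lasso minimizes (data loss) + (this penalty), so it only keeps a large
# coefficient when that coefficient really reduces the data loss.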
We use lasso when we have many features, because it automatically performs feature selection. However, lasso has some limitations:
1. Suppose there are multiple highly collinear variables. In this case, lasso randomly selects one of them and drops the others, and selecting this way is not good.
2. Suppose the number of predictors is greater than the number of observations (n). In this case lasso will pick at most n predictors as non-zero, even if all the predictors are relevant (or could be useful on the test set).
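Below is a minimal scikit-learn sketch of this automatic feature selection. The synthetic dataset, the alpha value, and the variable names are assumptions for illustration, not part of the notes above:

# Minimal sketch: lasso zeroes out weakly useful features (synthetic data).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# 100 rows, 10 features, but only 3 features actually drive the target.
X, y = make_regression(n_samples=100, n_features=10, n_informative=3, noise=10.0, random_state=42)

lasso = Lasso(alpha=5.0)    # alpha is the penalty strength
lasso.fit(X, y)

print(lasso.coef_)                                   # most uninformative coefficients become exactly 0.0
print("kept features:", np.flatnonzero(lasso.coef_))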
The advantage of ridge regression is that it also helps avoid overfitting. Overfitting occurs when the model performs well on the training data but poorly on the testing data. Ridge regression works by applying a penalty parameter that controls the error. Ridge shrinks the coefficients close to zero, but never exactly to zero.
Formula:
Σ(y-Y)^2 + α(Σw^2)
Here,
y = actual value
Y = predicted value
Σ(y-Y)^2 = loss (the sum of the squared differences between actual and predicted values)
α = penalty
w = the coefficient (weight) of each independent variable. The penalty adds the square of each coefficient individually: if there are 3 independent variables, it adds w1^2, then w2^2, then w3^2.
What is the penalty?
The penalty α is a value such as 0.1, 1, or 2 that ridge multiplies by the coefficient term and adds to the loss. Suppose you have two data points and you draw a best-fitted line that passes exactly through them; the error on those two training points is zero. When you test the model on a test dataset, new data points arrive and the error becomes larger than 0, because the line was fitted only to the two points in the training dataset. The target now is to reduce that error. Ridge shrinks the coefficients close to zero but not exactly to zero. According to the formula, ridge takes the loss and adds to it the penalty multiplied by the sum of the squared coefficients. Minimizing this combined objective keeps the coefficients small and so reduces the error on new data.
Example:
You have an equation: y = m1*X1 + m2*X2 + C = 20*X1 + 35*X2 + 25.
Here again the value of m2 is large, so the m2*X2 term dominates the prediction and can cause a big error. If you can somehow reduce the value of m2, the loss will be reduced. To do that, ridge uses the penalty value times the sum of the squared coefficients. After the calculation the coefficients are shrunk and the model's error is generalized, i.e., reduced.
The difference between ridge and lasso is that ridge penalizes the square of the coefficients (α(Σw^2)) while lasso penalizes their absolute values (α(Σ|w|)). As a result, ridge shrinks the coefficients close to zero but never exactly to zero, while lasso shrinks them to zero or near zero, so lasso can drop features entirely.
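A minimal side-by-side sketch (again on assumed synthetic data with assumed alpha values) makes this difference visible, since lasso produces exact zeros while ridge only shrinks:

# Minimal sketch: ridge vs. lasso coefficients on the same synthetic data.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

X, y = make_regression(n_samples=100, n_features=10, n_informative=3, noise=10.0, random_state=0)

ridge = Ridge(alpha=5.0).fit(X, y)
lasso = Lasso(alpha=5.0).fit(X, y)

print("ridge zero coefficients:", np.sum(ridge.coef_ == 0))   # usually 0: ridge only shrinks
print("lasso zero coefficients:", np.sum(lasso.coef_ == 0))   # several exact zeros
print("ridge coefs:", np.round(ridge.coef_, 2))
print("lasso coefs:", np.round(lasso.coef_, 2))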