Feature selection wrapper methods
In the wrapper technique, we group features into subsets and test which combination of features works best
for our model.
For example, suppose we have 5 features and want to find the best combination among them. First, we send
features 1, 2, and 4 to a model. Next, we send features 3 and 4, and then features 2, 3, and 5. In this way we
build combinations of features, pass each combination to a model, and look for the combination that gives
the best accuracy.
Suppose features 1, 4, and 5 turn out to be the best combination. To train the final model, we keep features
1, 4, and 5 and drop the others.
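The search described above can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: `evaluate` is a hypothetical callable standing in for "train a model on these features and return its accuracy", and `toy_accuracy` is made-up scoring used only so the example runs without a dataset.

```python
from itertools import combinations

def best_subset(features, evaluate):
    """Try every non-empty combination of features and keep the best-scoring one."""
    best_combo, best_score = None, float("-inf")
    for size in range(1, len(features) + 1):
        for combo in combinations(features, size):
            score = evaluate(combo)  # stand-in for training and scoring a model
            if score > best_score:
                best_combo, best_score = combo, score
    return best_combo, best_score

def toy_accuracy(combo):
    # Hypothetical scores: features 1, 4 and 5 help, the others hurt slightly.
    useful = {1: 0.30, 4: 0.25, 5: 0.20}
    return 0.2 + sum(useful.get(f, -0.02) for f in combo)

best, score = best_subset([1, 2, 3, 4, 5], toy_accuracy)
print(best)  # (1, 4, 5)
```

Trying every combination means training 2^n - 1 models for n features, which is why the step-by-step methods below are usually preferred.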
There are three different methods to find the best combination of features.
To understand the methods below, let's consider a total of 4 features.
1. Forward method:
Step 1: First, choose a significance level.
Then start with a null model (a model with no features) and add features step by step to find the best
combination.
Step 2: Take one feature at a time (say feature 2 first, then 3, then 1, and then 4) and pass each one
individually to the model for testing. The method selects the individual feature with the minimum p-value.
Let's say it selects feature 3, which has the minimum p-value.
Step 3: In the second iteration, one feature (feature 3) is already selected. The method now combines this
earlier selected feature with each of the remaining features (1, 2, and 4).
For example, since feature 3 is already taken, it pairs feature 3 with feature 1, then with feature 2, and
then with feature 4, and passes each pair to the model. It again selects the combination with the minimum
p-value. Let's assume it selects the combination of features 3 and 1.
Step 4: This step happens only if the p-value of the two-feature combination selected earlier (features 3
and 1 from Step 3) is less than the significance level. With the combination (3, 1) already selected, the
method now tries three-feature combinations. Because there are only 4 features, this is the last
combination-building step. If more features are left and the p-value is still less than the significance
level, the method continues in the same way.
If we have 5 features, then:
First, single-feature combinations.
Then two-feature combinations.
If the p-value < significance level in the earlier combination, then three-feature combinations.
If again the p-value < significance level in the earlier combination, then four-feature combinations.
Repeat this process until we have a set of selected features in which each individual feature's p-value is
less than the significance level.
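The forward steps above can be sketched as follows. This is a minimal sketch under stated assumptions: `p_values` is a hypothetical callable that takes a list of features and returns a dict mapping each of them to its p-value in a model fitted on that list. In practice those values might come from a fitted regression model; here `toy_p_values` returns made-up numbers so the example runs on its own.

```python
def forward_selection(features, p_values, alpha=0.05):
    """Add one feature per round, stopping when no candidate is significant."""
    selected = []
    remaining = list(features)
    while remaining:
        # p-value of each candidate when added to the features selected so far
        candidate_p = {f: p_values(selected + [f])[f] for f in remaining}
        best = min(candidate_p, key=candidate_p.get)
        if candidate_p[best] >= alpha:
            break  # the best remaining candidate is not significant
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical p-values for the 4 features used in the example (assumed fixed).
TOY_P = {1: 0.03, 2: 0.40, 3: 0.01, 4: 0.20}

def toy_p_values(subset):
    return {f: TOY_P[f] for f in subset}

forward_result = forward_selection([1, 2, 3, 4], toy_p_values)
print(forward_result)  # [3, 1]
```

With these toy numbers, feature 3 is picked first and feature 1 second, as in the walkthrough. scikit-learn's `SequentialFeatureSelector` provides forward and backward selection as well, though it ranks candidates by cross-validated score rather than by p-value.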
2. Backward method:
Step 1: First, choose a significance level.
Then start with a full model (including all the independent variables).
For example:
F1+F2+F3+F4
F1+F2+F4+F5
F1+F3+F4+F5
F2+F3+F4+F5
Among these combinations, those whose summed feature p-values are greater than the significance level are
removed, and the one whose sum is less is chosen.
Step 2:
Suppose the combination F1+F2+F3+F4 is selected.
Now make combinations of these four features again.
For example:
F1+F2+F3
F1+F2+F4
F1+F3+F4
F2+F3+F4
Again, those combinations whose summed feature p-values are greater than the significance level are removed,
and the one whose sum is less is chosen.
Step 3:
Suppose the combination F1+F2+F3 is selected.
Now make combinations of these three features again.
For example:
F1+F2
F1+F3
F2+F3
Again, those combinations whose summed feature p-values are greater than the significance level are removed,
and the one whose sum is less is chosen.
This process continues until we get the final set of significant features.
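A short sketch of backward elimination follows. Note that it uses the common formulation, which drops the single feature with the largest p-value in each round, rather than comparing summed p-values of whole combinations as described above. As an assumption for illustration, `p_values` is a hypothetical callable returning each feature's p-value for a model fitted on the given list, and `toy_p_values` supplies made-up numbers.

```python
def backward_elimination(features, p_values, alpha=0.05):
    """Start from all features and drop the least significant one each round."""
    selected = list(features)
    while selected:
        current_p = p_values(selected)  # p-value of every feature in the model
        worst = max(current_p, key=current_p.get)
        if current_p[worst] <= alpha:
            break  # every remaining feature is significant
        selected.remove(worst)
    return selected

# Hypothetical p-values for five features F1..F5 (assumed fixed).
TOY_P = {"F1": 0.01, "F2": 0.03, "F3": 0.04, "F4": 0.30, "F5": 0.60}

def toy_p_values(subset):
    return {f: TOY_P[f] for f in subset}

backward_result = backward_elimination(["F1", "F2", "F3", "F4", "F5"], toy_p_values)
print(backward_result)  # ['F1', 'F2', 'F3']
```

Here F5 is removed first, then F4, and the loop stops because the largest remaining p-value (0.04) is below the significance level.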
3. Bi-directional method:
It is a combination of forward selection and backward elimination.
To add a feature to the combination we use the forward method, and to eliminate a feature we use the
backward method.
Choose a significance level for adding or eliminating a feature. While adding or eliminating a feature, the
method always checks the significance level. If the p-value is less than the significance level, it adds the
feature, and if it is greater, it eliminates the feature. This process continues until we get a final
optimal set of features.
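The bi-directional idea can be sketched as one forward step followed by one backward check per round. This is only an illustration under assumptions: `p_values` is a hypothetical callable returning each feature's p-value for a model fitted on the given list, the single threshold `alpha` is used for both adding and removing, and `max_rounds` is an added guard against endless add/remove cycles. The toy p-values are made up so that feature "A" becomes redundant once "B" and "C" are both in the model.

```python
def stepwise_selection(features, p_values, alpha=0.05, max_rounds=100):
    """Alternate a forward (add) step and a backward (remove) step."""
    selected = []
    for _ in range(max_rounds):
        changed = False
        # Forward step: add the most significant remaining feature, if any.
        remaining = [f for f in features if f not in selected]
        if remaining:
            cand = {f: p_values(selected + [f])[f] for f in remaining}
            best = min(cand, key=cand.get)
            if cand[best] < alpha:
                selected.append(best)
                changed = True
        # Backward step: drop the least significant selected feature, if needed.
        if selected:
            current = p_values(selected)
            worst = max(current, key=current.get)
            if current[worst] > alpha:
                selected.remove(worst)
                changed = True
        if not changed:
            break
    return selected

def toy_p_values(subset):
    # Hypothetical values: "A" loses significance when "B" and "C" are present.
    base = {"A": 0.02, "B": 0.03, "C": 0.04, "D": 0.50}
    out = {f: base[f] for f in subset}
    if {"A", "B", "C"} <= set(subset):
        out["A"] = 0.40
    return out

stepwise_result = stepwise_selection(["A", "B", "C", "D"], toy_p_values)
print(stepwise_result)  # ['B', 'C']
```

In this run "A" is added first, then "B" and "C", after which "A" is eliminated by the backward check, something pure forward selection would not do.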