Learn Python
Learn Data Structure & Algorithm
Learn Numpy
Learn Pandas
Learn Matplotlib
Learn Seaborn
Learn Statistics
Learn Math
Learn MATLAB
Introduction
Setup
Read data
Data preprocessing
Data cleaning
Handle date-time column
Handling outliers
Encoding
Feature_Engineering
Feature selection filter methods
Feature selection wrapper methods
Multicollinearity
Data split
Feature scaling
Supervised Learning
Regression
Classification
Bias and Variance
Overfitting and Underfitting
Regularization
Ensemble learning
Unsupervised Learning
Clustering
Association Rule
Common
Model evaluation
Cross Validation
Parameter tuning
Code Exercise
Car Price Prediction
Flight Fare Prediction
Diabetes Prediction
Spam Mail Prediction
Fake News Prediction
Boston House Price Prediction
Learn Github
Learn OpenCV
Learn Deep Learning
Learn MySQL
Learn MongoDB
Learn Web scraping
Learn Excel
Learn Power BI
Learn Tableau
Learn Docker
Learn Hadoop
Feature scaling is a method to bring multiple numeric features onto the same scale or range (like -1 to 1 or 0 to 1). In other words, if your features are on different scales, you have to bring them all into one common range before doing machine learning work. It is also called data normalization.
The scale of raw features differs according to their units. Some features may be in meters, some in kilograms or feet, but machine learning algorithms cannot understand units; they only understand numbers. The units of the features do not matter to the model, only the magnitudes do. So if the features are not in a particular range or scale, this has a bad impact on the machine learning model and, as a result, the model will not be able to give good predictions. That's why feature scaling is needed.
For example, suppose two people's heights are given: the first person's height is 150 cm and the other's is 8.2 feet. If I ask who is taller, you can easily say the person who is 8.2 feet, because you understand the units. A machine learning model, however, will say 150 cm, because it doesn't understand units, only numbers, and 150 is greater than 8.2. To solve this problem you have to bring both values onto one scale or range.
Important Note: Some algorithms require feature scaling and some don't. Algorithms like Linear Regression, Logistic Regression, K-Nearest Neighbors (KNN), Support Vector Machines (SVM), and Neural Networks that use gradient descent for optimization require feature scaling. In general, algorithms that rely on distances or on optimization techniques require feature scaling. Decision Trees, Random Forests, Gradient Boosting algorithms, Naive Bayes, and rule-based algorithms like FP-Growth and Association Rules don't require feature scaling.
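As a rough illustration of why distance-based algorithms care about scale, here is a minimal sketch (assuming scikit-learn is installed) that trains KNN on the built-in wine dataset with and without a StandardScaler. The dataset choice, the 70/30 split, and k=5 are arbitrary choices made only for this example.

```python
# Sketch: distance-based KNN with and without feature scaling.
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)  # columns have very different ranges (e.g. proline vs hue)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# KNN on raw features: distances are dominated by the largest-scale columns
knn_raw = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print("Accuracy without scaling:", knn_raw.score(X_test, y_test))

# Same KNN, but with StandardScaler applied first inside a pipeline
knn_scaled = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn_scaled.fit(X_train, y_train)
print("Accuracy with scaling:   ", knn_scaled.score(X_test, y_test))
```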
Types of feature scaling:
1. Min-max scaler
2. Standard scaler
3. Robust scaler
Standardization rescales a feature so that its mean is 0 and its standard deviation is 1. That is, after scaling, the mean of the data will be 0 and the standard deviation will be 1.
Formula:
z = (x - μ) / σ
Here,
x = the value you want to scale, taken from a particular column
μ = the mean of the column from which you took x
σ = the standard deviation of the column from which you took x
First find the mean of the data, then find the standard deviation. Then subtract the mean from x and divide the result by the standard deviation.
Standardization is typically used when the data follow a normal/Gaussian/bell-curve distribution. There is no rule that it can only be used with normally distributed data, but that is when people use it most of the time. Note that standardization does not change the shape of the distribution: if the original distribution is normal or skewed, the standardized distribution will also be normal or skewed.
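Here is a minimal sketch of standardization, applying the formula above by hand with pandas and then with scikit-learn's StandardScaler. The toy values and the column name "height_cm" are placeholders chosen only for illustration.

```python
# Standardization sketch: z = (x - mean) / std, by hand and with StandardScaler.
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({"height_cm": [150.0, 160.0, 170.0, 180.0, 190.0]})  # toy data

# Manual standardization using z = (x - mu) / sigma
mu = df["height_cm"].mean()
sigma = df["height_cm"].std(ddof=0)  # population std, which is what StandardScaler uses
df["height_manual"] = (df["height_cm"] - mu) / sigma

# Same thing with scikit-learn
df["height_sklearn"] = StandardScaler().fit_transform(df[["height_cm"]]).ravel()

print(df)
print("mean after scaling:", df["height_sklearn"].mean())        # ~0
print("std after scaling: ", df["height_sklearn"].std(ddof=0))   # ~1
```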
Normalization rescales a feature into a fixed range, usually 0 to 1 or -1 to 1. No matter what unit a feature is in, it will be converted into this fixed range. Normalization is also called Min-Max scaling. It is typically used when the data do not follow a normal/bell-curve/Gaussian distribution. There is no rule that it can only be used in that situation; it can be used any time, but that is when people use it most of the time.
Formula:
X_scaled = (X - Xmin) / (Xmax - Xmin)
Here,
X = the value you want to scale, taken from a particular column
Xmin = the minimum value of that column
Xmax = the maximum value of that column
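Here is a minimal sketch of Min-Max scaling, first computing the formula directly and then using scikit-learn's MinMaxScaler. The toy values and the column name "salary" are placeholders chosen only for illustration.

```python
# Min-Max scaling sketch: X_scaled = (X - Xmin) / (Xmax - Xmin).
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame({"salary": [20000.0, 35000.0, 50000.0, 80000.0, 120000.0]})  # toy data

# Manual normalization with the formula
x_min, x_max = df["salary"].min(), df["salary"].max()
df["salary_manual"] = (df["salary"] - x_min) / (x_max - x_min)

# Same thing with scikit-learn; the default feature_range is (0, 1)
df["salary_sklearn"] = MinMaxScaler().fit_transform(df[["salary"]]).ravel()

print(df)  # both scaled columns lie between 0 and 1
```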
| Standardization | Normalization |
|---|---|
| The mean will always be 0 and the standard deviation will be 1 | The data are always scaled between 0 and 1 or -1 and 1 |
| There is no thumb rule for when to use Standardization | There is no thumb rule for when to use Normalization |
| Mostly used for clustering analysis and distance-based algorithms, when the data follow a normal/Gaussian/bell-curve distribution | Mostly used for image preprocessing (pixel intensities range from 0 to 255), neural networks that expect inputs in the 0 to 1 range, K-Nearest Neighbors, and when the data don't follow a normal/Gaussian/bell-curve distribution |
The robust scaler is one of the best scaling techniques to use when we have outliers in our dataset. It scales the data according to the interquartile range (IQR = 75th percentile - 25th percentile). Here the IQR is the range between the 1st quartile (25th percentile) and the 3rd quartile (75th percentile). In this method, we subtract the median from every data point and then divide the result by the IQR (Inter Quartile Range).
Use the robust scaler when outliers are present in the dataset.
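Here is a minimal sketch of robust scaling on a toy column that contains an obvious outlier, comparing the manual (x - median) / IQR computation with scikit-learn's RobustScaler. The values and the column name "income" are placeholders chosen only for illustration.

```python
# Robust scaling sketch: (x - median) / IQR, which is less affected by outliers.
import pandas as pd
from sklearn.preprocessing import RobustScaler

df = pd.DataFrame({"income": [30000.0, 32000.0, 35000.0, 38000.0, 40000.0, 500000.0]})  # last value is an outlier

# Manual robust scaling
median = df["income"].median()
q1, q3 = df["income"].quantile(0.25), df["income"].quantile(0.75)
df["income_manual"] = (df["income"] - median) / (q3 - q1)

# Same thing with scikit-learn (default quantile_range is 25th to 75th percentile)
df["income_sklearn"] = RobustScaler().fit_transform(df[["income"]]).ravel()

print(df)  # the outlier no longer stretches the scale of the ordinary values
```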