Measure of central tendency in statistics

How to calculate class limits, boundaries, tally, frequency, and cumulative frequency?

Class limit:
Formula: CL = (largest number - smallest number) / gap
The result is the number of classes needed to cover the data.

Suppose the largest number is 200, the smallest number is 50, and you want a gap of 5.
So CL = (200 - 50) / 5 = 30
Boundary:
Subtract 0.5 from the lower class limit and add 0.5 to the upper class limit.
Tally:
One mark for each value that falls inside the boundary.
Frequency:
The number of values that fall inside the boundary, i.e. the count of tally marks.
Cumulative frequency:
The running total of the frequency column. In the example, the first cell is 4, which is the first frequency. The second cell is 11, which is the sum of the previous cumulative frequency (4) and the second frequency (7). This continues to the last row.

Example:
Records of ages of 54 people: 54, 34, 45, 56, 67, 81, 45, 23, 34, 56,
76, 54, 32, 82, 84, 34, 56, 45, 76, 34, 21, 23, 85, 27, 76, 53, 44, 37, 82, 56, 78, 57, 41,
83, 48, 46, 58, 59, 63, 64, 76, 58, 42, 41, 35, 65, 45, 78, 70, 60, 50, 40, 35, 50

Class Limit   Boundaries   Tally          Frequency (f)   Cumulative frequency (cf)
23-29         22.5-29.5    ||||           4               4
30-36         29.5-36.5    ||||| ||       7               11
37-43         36.5-43.5    ||||           4               15
44-50         43.5-50.5    ||||| |||||    10              25
51-57         50.5-57.5    ||||| |||      8               33
58-64         57.5-64.5    ||||| |        6               39
65-71         64.5-71.5    |||            3               42
72-78         71.5-78.5    ||||| |        6               48
79-85         78.5-85.5    ||||| |        6               54
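The boundary and cumulative frequency columns follow mechanically from the class limits and the frequencies. Here is a minimal Python sketch, assuming the class limits and frequencies are typed in from the table above (the variable names are illustrative and not part of the original tutorial):

```python
from itertools import accumulate

# Class limits and frequencies copied from the table above.
class_limits = [(23, 29), (30, 36), (37, 43), (44, 50), (51, 57),
                (58, 64), (65, 71), (72, 78), (79, 85)]
frequencies = [4, 7, 4, 10, 8, 6, 3, 6, 6]

# Boundaries: 0.5 below the lower limit and 0.5 above the upper limit.
boundaries = [(low - 0.5, high + 0.5) for low, high in class_limits]

# Cumulative frequency: running total of the frequency column.
cumulative = list(accumulate(frequencies))

for (low, high), (b_low, b_high), f, cf in zip(
        class_limits, boundaries, frequencies, cumulative):
    print(f"{low}-{high}  {b_low}-{b_high}  {f:>2}  {cf:>2}")
```

The last cumulative value equals the total number of records (54), which is a quick way to check the frequency column.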

What is a measure of central tendency?

It is a single value that represents the center point of a dataset, also called the central location of the dataset. It yields information about a particular location in a group of numbers. There are three common measures of central tendency: mean, median, and mode. Each of them locates the center of a dataset in a different way.

How to calculate the mean?

The mean is represented by the symbol μ. To find the mean, add up all the given values and divide the sum by the total number of values.
Formula: μ = (X1 + X2 + ... + Xn) / n
Example: Suppose we have 5 numbers: 1, 2, 3, 4, 5. What is the mean?
Ans: (1 + 2 + 3 + 4 + 5) / 5 = 3
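The same calculation as a small Python sketch. The function name `mean` is chosen here for illustration; Python's standard library also offers `statistics.mean`.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)


print(mean([1, 2, 3, 4, 5]))  # 3.0
```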

How to calculate the mean for grouped data?

To get the boundaries, subtract 0.5 from the lower class limit and add 0.5 to the upper class limit. The frequency (f) is the number of values inside the boundaries. The midpoint (m) is the sum of the two boundary values divided by 2.

Boundaries   Frequency (f)   Midpoint (m)   Frequency*Midpoint (fm)
3.5-8.5      5               6              30
8.5-13.5     6               11             66
13.5-18.5    4               16             64
Total        n=15                           Σfm=160

Here n is the sum of the frequency column and Σfm is the sum of the fm column.

So, mean = Σfm/n = 160/15 = 10.67
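The table above can be reproduced in a few lines of Python. This is a sketch that assumes the boundaries and frequencies are entered by hand from the table; the variable names are illustrative.

```python
# Boundaries and frequencies copied from the table above.
boundaries = [(3.5, 8.5), (8.5, 13.5), (13.5, 18.5)]
frequencies = [5, 6, 4]

# Midpoint of each class: (lower boundary + upper boundary) / 2.
midpoints = [(low + high) / 2 for low, high in boundaries]

# Frequency times midpoint for each class.
fm = [f * m for f, m in zip(frequencies, midpoints)]

n = sum(frequencies)          # total number of values
grouped_mean = sum(fm) / n    # sum of fm divided by n

print(midpoints)               # [6.0, 11.0, 16.0]
print(fm)                      # [30.0, 66.0, 64.0]
print(round(grouped_mean, 2))  # 10.67
```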

How to calculate the weighted mean?

Sometimes you want the mean of a set of numbers, but some numbers matter more than others. In that case you assign a weight to each number, with larger weights for the more important ones. The mean you get this way is called the weighted mean.

weighted mean = (Σ wi xi) / (Σ wi)
Here,
xi = data point
wi = weight

Let's see an example:

Category     Weight   Score
Quiz         15       88
Class Test   5        70
Mid exam     25       87
Final exam   30       99

mean = {(15*88) + (5*70) + (25*87) + (30*99)} / (15+5+25+30)
     = (1320 + 350 + 2175 + 2970) / 75
     = 6815 / 75
     = 90.87 (rounded to two decimal places)
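A short Python sketch of the weighted mean formula, applied to the scores and weights from the table above. The function name `weighted_mean` is an illustrative choice.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    total = sum(w * x for x, w in zip(values, weights))
    return total / sum(weights)


scores = [88, 70, 87, 99]    # quiz, class test, mid exam, final exam
weights = [15, 5, 25, 30]

print(round(weighted_mean(scores, weights), 2))  # 90.87
```

Because the result is divided by the sum of the weights, the weights do not need to add up to 100.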

How to calculate the weighted average?

Weighted average is another name for the weighted mean: you assign a weight to each number, with larger weights for the numbers that matter more, and the average you get is called the weighted average.

Formula: (x1w1 + x2w2 + x3w3 + ... + xNwN) / W
Here,
x = a particular value
w = weight assigned to that value
W = sum of all weights

Example:
Suppose Jenny scores 83 on a midterm exam and 95 on the final exam. Calculate the weighted average of Jenny's total score, using 40% weight for the midterm exam and 60% weight for the final exam.

The final exam counts more toward the grade, which is why it gets 60% weight and the midterm exam gets 40%.

Now let's see the score,
Weighted average = {(83*40%) + (95*60%)} / (40%+60%) = (33.2 + 57) / 1 = 90.2

How to calculate the median?

The median is the center value of a list of numbers. Before finding the median, arrange the data in ascending order. If the count of numbers is odd, the center value is the median. If the count is even, add the two center values and divide the sum by two; the result is the median.
Ex: 1, 2, 3, 4, 5
Ans: 3

Ex: 1, 2, 3, 4, 5, 6
Ans: (3+4)/2 = 3.5
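The odd and even cases can be written as a small Python function. This is a sketch with an illustrative name; the standard library's `statistics.median` does the same job.

```python
def median(values):
    """Center value of the sorted data (mean of the two center values if even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single center value.
        return ordered[mid]
    # Even count: average the two center values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([1, 2, 3, 4, 5]))     # 3
print(median([1, 2, 3, 4, 5, 6]))  # 3.5
```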

Median for group data:
Formula: Median=L+[{(N/2)-cfp}/fmed]*W
Here,
L=Lower limit of the median class
cfp=Cumulative frequency of class preceding the median class
fmed=Frequency of the median class
W=Width of the median class
N=Total of frequencies

Example:
Records of ages of 54 people: 54, 34, 45, 56, 67, 81, 45, 23, 34, 56,
76, 54, 32, 82, 84, 34, 56, 45, 76, 34, 21, 23, 85, 27, 76, 53, 44, 37, 82, 56, 78, 57, 41,
83, 48, 46, 58, 59, 63, 64, 76, 58, 42, 41, 35, 65, 45, 78, 70, 60, 50, 40, 35, 50

Class Limit   Boundaries   Frequency (f)   Cumulative frequency (cf)
23-29         22.5-29.5    4               4
30-36         29.5-36.5    7               11
37-43         36.5-43.5    4               15
44-50         43.5-50.5    10              25
51-57         50.5-57.5    8               33
58-64         57.5-64.5    6               39
65-71         64.5-71.5    3               42
72-78         71.5-78.5    6               48
79-85         78.5-85.5    6               54

Here,
N=54
Half of N is 54/2 = 27. The median class is the first row whose cumulative frequency reaches 27. The 4th row has a cumulative frequency of 25, which is below 27, and the 5th row has 33, so the 27th value falls in the 5th row (class 51-57).

Here L=51 because the lower class limit of the 5th row is 51.

Here fmed=8 because the frequency of the 5th row is 8.

Here cfp=25 because the cumulative frequency of the row before the 5th row is 25.

Here W=6 because the gap between the class limits is 6 (57 - 51).

Median = 51 + [{(54/2) - 25} / 8] * 6 = 52.5
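The steps above (find N/2, locate the median class, plug into the formula) can be sketched in Python. The function name and parameters are illustrative assumptions; the lower limits and frequencies are copied from the table.

```python
from itertools import accumulate


def grouped_median(lower_limits, frequencies, width):
    """Median = L + ((N/2 - cfp) / fmed) * W for grouped data."""
    n = sum(frequencies)
    cumulative = list(accumulate(frequencies))
    # Median class: the first class whose cumulative frequency reaches N/2.
    k = next(i for i, cf in enumerate(cumulative) if cf >= n / 2)
    # Cumulative frequency of the class before the median class.
    cfp = cumulative[k - 1] if k > 0 else 0
    return lower_limits[k] + ((n / 2 - cfp) / frequencies[k]) * width


lower_limits = [23, 30, 37, 44, 51, 58, 65, 72, 79]
frequencies = [4, 7, 4, 10, 8, 6, 3, 6, 6]

print(grouped_median(lower_limits, frequencies, width=6))  # 52.5
```

The call passes the lower class limits and the limit gap of 6, matching the worked example. Many textbooks instead use the lower class boundary (50.5) and the boundary width (7); passing those values to the same function gives 52.25.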

How to calculate the mode?

The mode is the value that occurs most often in the dataset. Arranging the data in ascending order first is not mandatory, but it is good practice because it makes repeated values easy to count.
Ex: A, B, A, C, D, B, A
Here A appears three times, more than any other value, so A is the mode.
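Counting occurrences is easy to do with `collections.Counter` from the Python standard library. The sketch below returns a list, on the assumption that a dataset may have more than one value tied for the highest count; the function name `modes` is illustrative.

```python
from collections import Counter


def modes(values):
    """All values that share the highest count."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(modes(["A", "B", "A", "C", "D", "B", "A"]))  # ['A']
```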

How to calculate the mode for grouped data?

Formula: Mode = Lmo + {d1/(d1+d2)} * W

The modal class is the row with the largest value in the frequency column.
Lmo = lower class limit of the modal class
d1 = frequency of the modal class - frequency of the previous class
d2 = frequency of the modal class - frequency of the next class
W = gap between the class limits of the modal class

Example:
Records of ages of 54 people: 54, 34, 45, 56, 67, 81, 45, 23, 34, 56,
76, 54, 32, 82, 84, 34, 56, 45, 76, 34, 21, 23, 85, 27, 76, 53, 44, 37, 82, 56, 78, 57, 41,
83, 48, 46, 58, 59, 63, 64, 76, 58, 42, 41, 35, 65, 45, 78, 70, 60, 50, 40, 35, 50

Class Limit   Boundaries   Frequency (f)
23-29         22.5-29.5    4
30-36         29.5-36.5    7
37-43         36.5-43.5    4
44-50         43.5-50.5    10
51-57         50.5-57.5    8
58-64         57.5-64.5    6
65-71         64.5-71.5    3
72-78         71.5-78.5    6
79-85         78.5-85.5    6

Here the largest value in the frequency column is 10, so the modal class is 44-50.
So,
Lmo=44
d1=10-4=6
d2=10-8=2
W=6
Mode = 44 + {6/(6+2)} * 6 = 48.5
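A Python sketch of the grouped mode formula, using the lower limits and frequencies from the table. The function name is illustrative, and treating a missing neighbour class as frequency 0 is an assumption made here for the case where the modal class is the first or last row.

```python
def grouped_mode(lower_limits, frequencies, width):
    """Mode = Lmo + (d1 / (d1 + d2)) * W for grouped data."""
    # Modal class: the row with the largest frequency.
    k = max(range(len(frequencies)), key=lambda i: frequencies[i])
    # Assumption: a missing neighbour class counts as frequency 0.
    previous_f = frequencies[k - 1] if k > 0 else 0
    next_f = frequencies[k + 1] if k < len(frequencies) - 1 else 0
    d1 = frequencies[k] - previous_f
    d2 = frequencies[k] - next_f
    return lower_limits[k] + (d1 / (d1 + d2)) * width


lower_limits = [23, 30, 37, 44, 51, 58, 65, 72, 79]
frequencies = [4, 7, 4, 10, 8, 6, 3, 6, 6]

print(grouped_mode(lower_limits, frequencies, width=6))  # 48.5
```

The call uses the lower class limit (44) and the limit gap (6), as in the worked example. Many textbooks use the lower class boundary (43.5) and the boundary width (7) instead, which gives 48.75 with the same function.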

Statistics Percentiles

A percentile tells you the position of a particular value relative to the rest of the data. Percentiles divide a group of data into 100 parts.
For example, the 80th percentile indicates that at least 80% of the data lies at or below it and at most 20% of the data lies above it. Likewise, the 95th percentile indicates that at least 95% of the data lies at or below it and at most 5% of the data lies above it.

Percentage and percentile are not the same. For example, if a student scores 80 out of 100, the student has 80% marks, but that says nothing about the student's position in the class. Suppose the same student is at the 90th percentile. That means the student performed better than 90% of the class, while 80% is simply the student's own mark.

Formula: Percentile = {(number of values below X + 0.5) / total number of values} * 100

Example:1
The scores of 5 students are given below.
90, 61, 80, 77, 85
Find the percentile of 85.
Arrange in ascending order = 61, 77, 80, 85, 90
X=85
Number of values below X=3
Total number of values=5
So,
Percentile = {(3+0.5)/5} * 100 = 70th percentile
So you can say that the student who scored 85 did better than 70% of the students.
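The percentile formula above translates directly into Python. This is a minimal sketch; the function name `percentile_rank` is an illustrative choice.

```python
def percentile_rank(values, x):
    """Percentile of x: (count of values below x + 0.5) / total * 100."""
    below = sum(1 for v in values if v < x)
    return (below + 0.5) * 100 / len(values)


scores = [90, 61, 80, 77, 85]
print(percentile_rank(scores, 85))  # 70.0
```

The data does not need to be sorted first, because the function only counts how many values are smaller than X.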


How to find the value at a given percentile?

Step 1:
Arrange the data in ascending order.

Step 2:
Put the values in the formula: c = (n*p)/100
Here,
n = total number of values
p = percentile

Step 3:
If c is not a whole number, round it up to the next whole number. The value at that position in the sorted data is the percentile value.

or,

If c is a whole number, the percentile value is halfway between the cth and (c+1)th values, counting up from the lowest value.

Example:1
The marks of 5 students are given below.
90, 61, 80, 77, 85
Calculate the value corresponding to the 70th percentile.

Arrange in ascending order = 61, 77, 80, 85, 90
n=5
p=70
c = (5*70)/100 = 3.5, which rounds up to 4
Now take the 4th value of the sorted data (because c=4). The 4th value is 85, so the 70th percentile value is 85.


Example:2
The marks of 5 students are given below.
90, 60, 80, 70, 50
Calculate the value corresponding to the 60th percentile.

Arrange in ascending order = 50, 60, 70, 80, 90
n=5
p=60
c = (5*60)/100 = 3

Here c is a whole number.
In the sorted data, the 3rd value is 70 and the 4th value (position c+1) is 80.
So,
(70+80)/2 = 75
So the 60th percentile corresponds to 75 marks.
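The three steps can be combined into one Python function. This sketch assumes the percentile p is strictly between 0 and 100, since the steps do not say what to do at the extremes; the function name is illustrative.

```python
import math


def percentile_value(values, p):
    """Value at the p-th percentile, following steps 1 to 3 above."""
    if not 0 < p < 100:
        # Assumption: only percentiles strictly between 0 and 100 are handled.
        raise ValueError("p must be between 0 and 100 (exclusive)")
    ordered = sorted(values)            # Step 1: ascending order
    c = len(ordered) * p / 100          # Step 2: c = (n * p) / 100
    if not c.is_integer():
        # Step 3a: round up and take the value at that position.
        return ordered[math.ceil(c) - 1]
    # Step 3b: halfway between the c-th and (c+1)-th values.
    c = int(c)
    return (ordered[c - 1] + ordered[c]) / 2


print(percentile_value([90, 61, 80, 77, 85], 70))  # 85
print(percentile_value([90, 60, 80, 70, 50], 60))  # 75.0
```

Positions in the steps are counted from 1, while Python lists are indexed from 0, which is why the code subtracts 1 when looking up a position.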

Decile

Deciles divide the data into 10 groups and are denoted by D1, D2, D3, D4 and so on.

Formula: D = (k/10)*(n+1)
Here,
D = position of the decile in the sorted data
k = number of the decile you want to find
n = number of observations

Example:
Find deciles D2 and D4 of the following student scores.
33, 44, 82, 50, 70, 90, 45, 72

Arrange the data in ascending order: 33, 44, 45, 50, 70, 72, 82, 90
For D2:
D = (2/10)*(8+1) = 1.8, which rounds to 2
D2 is the 2nd element, which is 44.

For D4:
D = (4/10)*(8+1) = 3.6, which rounds to 4

D4 is the 4th element, which is 50.
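A Python sketch of the decile lookup. The examples above round 1.8 to 2 and 3.6 to 4, which does not show whether the rule is "round up" or "round to nearest", so rounding to the nearest position is an assumption here. Keeping the position inside the data range is also an added safeguard, and the function name is illustrative.

```python
import math


def decile(values, k):
    """k-th decile: the value at position (k/10) * (n + 1) in the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    position = k * (n + 1) / 10
    # Assumption: round the position to the nearest whole number.
    index = math.floor(position + 0.5)
    # Safeguard: keep the position between 1 and n.
    index = min(max(index, 1), n)
    return ordered[index - 1]


scores = [33, 44, 82, 50, 70, 90, 45, 72]
print(decile(scores, 2))  # 44
print(decile(scores, 4))  # 50
```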

© Copyright All rights reserved www.CodersAim.com. Developed by CodersAim.