Introduction
A measure of central tendency is a single value that describes a dataset by identifying its central position. Measures of central tendency are also known as measures of central location, and they are one kind of summary statistic.
There are three main measures of central tendency: the mean, the median, and the mode. We can calculate all of them using methods from the pandas Python library.
Mean
There are three common ways to calculate the mean of a dataset: the arithmetic, geometric, and harmonic means. The arithmetic and harmonic means are described below.
Arithmetic Mean
It is the simplest of all: the sum of the observations in the dataset divided by the number of observations.
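As a minimal sketch of this definition, the snippet below computes the arithmetic mean both by hand and with the pandas `Series.mean()` method. The sample values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample observations, chosen only for illustration.
observations = pd.Series([4, 8, 15, 16, 23, 42])

# Arithmetic mean by definition: sum of observations / number of observations.
manual_mean = observations.sum() / observations.count()

# pandas performs the same calculation with Series.mean().
pandas_mean = observations.mean()

print(manual_mean, pandas_mean)
```

Both values should agree, since `Series.mean()` implements the same sum-divided-by-count calculation.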
Harmonic Mean
It is the reciprocal of the arithmetic mean of the reciprocals of all observations in a data series. It is generally preferred when lower-magnitude observations should be given higher weightage.
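The harmonic mean can be sketched as the reciprocal of the arithmetic mean of the reciprocals of the observations. pandas has no dedicated harmonic-mean method as far as this sketch assumes, so the example computes it from `Series.mean()` and cross-checks it against `statistics.harmonic_mean` from the Python standard library. The sample values are hypothetical.

```python
import statistics

import pandas as pd

# Hypothetical positive observations, chosen only for illustration.
values = pd.Series([1, 2, 4])

# Harmonic mean: reciprocal of the arithmetic mean of the reciprocals.
harmonic = 1 / (1 / values).mean()

# Cross-check with the standard library implementation.
stdlib_harmonic = statistics.harmonic_mean(values.tolist())

# The arithmetic mean of the same data, for comparison.
arithmetic = values.mean()

print(harmonic, stdlib_harmonic, arithmetic)
```

For these values the harmonic mean comes out below the arithmetic mean, which reflects the extra weight it gives to smaller observations.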
Median
It represents the middle value of a data series sorted in ascending or descending order. If the number of observations n is odd, the ((n+1)/2)th observation is the median. If n is even, the median is the average of the (n/2)th and ((n/2)+1)th observations.
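The positional rule for odd and even counts can be sketched as follows. The helper `median_by_position` and the sample series are hypothetical names and data introduced for illustration; the result is compared with the pandas `Series.median()` method.

```python
import pandas as pd

def median_by_position(series):
    """Return the median using the (n+1)/2 and n/2, (n/2)+1 position rules."""
    ordered = series.sort_values().reset_index(drop=True)
    n = len(ordered)
    if n % 2 == 1:
        # Odd count: the ((n+1)/2)th observation (1-based) is the median.
        return float(ordered.iloc[(n + 1) // 2 - 1])
    # Even count: average of the (n/2)th and ((n/2)+1)th observations.
    return (ordered.iloc[n // 2 - 1] + ordered.iloc[n // 2]) / 2

# Hypothetical sample data with an odd and an even number of observations.
odd_series = pd.Series([7, 1, 5, 3, 9])
even_series = pd.Series([7, 1, 5, 3, 9, 11])

print(median_by_position(odd_series), odd_series.median())
print(median_by_position(even_series), even_series.median())
```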
If the data is given as a frequency distribution of discrete values, we first calculate the cumulative frequency, as below:
| Random variable (x) | Frequency of x | Cumulative frequency |
| --- | --- | --- |
| 1 | 10 | 10 |
| 2 | 12 | 22 |
| 3 | 15 | 37 |
| 4 | 16 | 53 |
| 5 | 18 | 71 |
| 6 | 23 | 94 |
| 7 | 28 | 122 |
| Total | 122 | |
Here the total number of observations is N = 122, which is even, so the median is the average of the N/2 = 61st and (N/2)+1 = 62nd observations. We can locate them using the cumulative frequency. There are 53 observations that are ≤ 4, and the following 18 observations (the 54th through the 71st) are valued at 5. Since 53 + 8 = 61, the 61st observation is the 8th of those fives, and the 62nd is the 9th. Both are 5, so the median is 5.
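The lookup described above can be sketched with pandas, using the frequency table from this section. The column names and the helper `value_at` are hypothetical choices made for this example; `cumsum()` builds the cumulative frequency column.

```python
import pandas as pd

# Frequency table from the text above.
table = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6, 7],
    "frequency": [10, 12, 15, 16, 18, 23, 28],
})
table["cumulative_frequency"] = table["frequency"].cumsum()

total = int(table["frequency"].sum())

def value_at(position):
    """Return the value of the observation at a 1-based position."""
    # The first row whose cumulative frequency reaches the position holds it.
    row = table[table["cumulative_frequency"] >= position].iloc[0]
    return int(row["x"])

if total % 2 == 0:
    # Even count: average the (N/2)th and ((N/2)+1)th observations.
    median = (value_at(total // 2) + value_at(total // 2 + 1)) / 2
else:
    # Odd count: take the ((N+1)/2)th observation.
    median = float(value_at((total + 1) // 2))

print(table)
print("Total:", total, "Median:", median)
```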
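If the observations are grouped into classes, first find the class that contains the median (the median class), just as we did for discrete data. The median can then be found with the formula Median = L + ((N/2 - cf) / f) × h, where L is the lower boundary of the median class, N is the total number of observations, cf is the cumulative frequency of the class before the median class, f is the frequency of the median class, and h is the class width.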
Mode
One of the most straightforward measures, the mode is the observation with the maximum frequency.
For example, in the series 2, 3, 4, 4, 4, 4, 5, 7, the mode is 4 because it has the maximum frequency: it occurs four times. Suppose instead we have a continuous distribution like the one below:
| Age Range | Count |
| --- | --- |
| 25-30 | 29 |
| 30-35 | 43 |
| 35-40 | 22 |
| 40-45 | 41 |
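The age group 30-35 has the maximum count of 43, so it is our modal class. To calculate the mode, use the formula Mode = L + ((f1 - f0) / (2 × f1 - f0 - f2)) × h, where L is the lower boundary of the modal class, f1 is the frequency of the modal class, f0 and f2 are the frequencies of the classes before and after it, and h is the class width. For the table above, this gives 30 + ((43 - 29) / (2 × 43 - 29 - 22)) × 5 = 32.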
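Both cases can be sketched in Python. The first part uses the pandas `Series.mode()` method on the series 2, 3, 4, 4, 4, 4, 5, 7. The second part applies the commonly used grouped-data interpolation formula to the age table; that formula, the column names, and the variable names are assumptions of this sketch rather than something taken from the article.

```python
import pandas as pd

# Discrete case: the most frequent value in the series.
series = pd.Series([2, 3, 4, 4, 4, 4, 5, 7])
discrete_mode = int(series.mode().iloc[0])

# Grouped case: the age table from the text above.
ages = pd.DataFrame({
    "lower": [25, 30, 35, 40],
    "upper": [30, 35, 40, 45],
    "count": [29, 43, 22, 41],
})

# Position of the modal class (the class with the highest count).
modal_pos = int(ages["count"].to_numpy().argmax())

lower = int(ages["lower"].iloc[modal_pos])
width = int(ages["upper"].iloc[modal_pos]) - lower
f1 = int(ages["count"].iloc[modal_pos])
# Neighbouring class counts; treated as 0 when the modal class is at an edge.
f0 = int(ages["count"].iloc[modal_pos - 1]) if modal_pos > 0 else 0
f2 = int(ages["count"].iloc[modal_pos + 1]) if modal_pos < len(ages) - 1 else 0

# Assumed interpolation formula: L + (f1 - f0) / (2*f1 - f0 - f2) * h
grouped_mode = lower + (f1 - f0) / (2 * f1 - f0 - f2) * width

print("Discrete mode:", discrete_mode)
print("Modal class:", lower, "-", lower + width, "Grouped mode:", grouped_mode)
```

The interpolation pulls the estimate toward the neighbouring class with the larger count, which is why the result sits in the lower half of the 30-35 class here.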
Frequently Asked Questions
What is the best measure of central tendency?
There is no single best measure of central tendency. It depends on the data we are working with, and each measure has its own strengths and weaknesses for a given dataset.
When is the mean considered the best measure of central tendency?
The mean is usually the best measure when the data distribution is continuous and symmetrical, as in a normal distribution. However, it all depends on what you are trying to show with your data.
Which is greatest in a normally distributed data set: mode, median, or mean?
None of them. The mean, median, and mode are equal if the data set is normally distributed without any skewness.
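The answers above can be illustrated with a small, hypothetical example: for a symmetric dataset the three measures coincide, while a single extreme value shifts the mean but leaves the median and mode unchanged. Both datasets are made up for illustration and are not drawn from a true normal distribution.

```python
import pandas as pd

# Hypothetical symmetric data: mean, median, and mode coincide.
symmetric = pd.Series([1, 2, 2, 3, 3, 3, 4, 4, 5])

# Same data with the largest value replaced by an extreme outlier.
with_outlier = pd.Series([1, 2, 2, 3, 3, 3, 4, 4, 50])

for name, data in [("symmetric", symmetric), ("with outlier", with_outlier)]:
    print(
        name,
        "mean:", data.mean(),
        "median:", data.median(),
        "mode:", data.mode().iloc[0],
    )
```

This is one reason the median is often preferred over the mean for skewed data.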
Key Takeaways
Let us briefly recap the article.
First, we looked at the different measures of central tendency that apply to numerical data and how they are calculated. Then, we looked at which measures work best for different kinds of data.
Central tendencies are the foundation for all advanced statistical work and help interpret the data for better results.
I hope you liked the article. Stay tuned for more exciting articles.
Happy Learning Ninjas!