What Is the Mean?
Besides the Median (the middle value) and the Mode (the most frequent value), there is another important way to describe the "center" of a data set: the Mean, or average.
The Mean is the value we get if we distribute the total sum of all data evenly among all data members. Imagine you and your friends each have a different number of candies. If you collect all the candies and then divide them equally among everyone, the amount each person receives is the Mean.
Calculating the Mean
Calculating the Mean is very simple:
Mean Formula:
$$\bar{x} = \frac{\sum x}{n}$$
Where:
- $\bar{x}$ (read "x bar") is the symbol for the Mean.
- $\sum x$ (read "sigma x") means the total sum of all data values ($x$).
- $n$ is the number of data points.
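The calculation described above (total sum divided by the number of data points) can be sketched in a few lines of Python. This is only an illustration: the function name `mean` and the candy counts are made-up assumptions, not values from this lesson.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("mean requires at least one data point")
    return sum(values) / len(values)


# Hypothetical candy counts for four friends (illustrative only).
candies = [2, 4, 6, 8]
x_bar = mean(candies)
print(x_bar)  # 5.0
```

Here the total is 20 candies shared among 4 friends, so each friend would get 5.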
Case Study
Let's look at an example to make it clearer.
OSIS Social Action
Initial Situation:
The Student Council (OSIS) of School A () collected used clothes that were still in wearable condition. The number of items collected by each member is:
Finding the Initial Mean, Median, and Mode:
- Sort the data (for Median & Mode):
(there are , )
- Calculate the Mean:
So, the initial mean is .
- Find the Median:
The number of data points is even (). The middle data are at positions and .
The value at position is , and the value at position is .
So, the initial median is .
- Find the Mode:
The most frequent values are () and ().
So, the initial modes are and (bimodal).
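The same three steps (sort, then find the mean, median, and mode) can be sketched in Python. The donation counts below are hypothetical stand-ins chosen to have an even count and two modes; they are not the OSIS figures. `statistics.multimode` is part of the standard library from Python 3.8 onward.

```python
import statistics

# Hypothetical donation counts (illustrative only, not the OSIS data).
donations = [18, 12, 30, 8, 16, 12, 14, 18]

ordered = sorted(donations)          # [8, 12, 12, 14, 16, 18, 18, 30]
n = len(ordered)                     # 8, an even count

mean_value = sum(ordered) / n        # 128 / 8

# Even n: the median is the average of the values at positions n/2 and
# n/2 + 1 (1-based), which are indices n//2 - 1 and n//2 (0-based).
median_value = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

# multimode returns every value tied for the highest frequency.
modes = statistics.multimode(ordered)

print(mean_value, median_value, modes)  # 16.0 15.0 [12, 18]
```

Because two values are tied for the highest frequency, `multimode` returns both of them, which is what "bimodal" means.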
New Situation:
The next day, joined and donated and items of clothing.
The new data becomes:
Finding the New Mean, Median, and Mode:
- Sort the data:
(there are , )
- Calculate the New Mean:
So, the new mean is .
- Find the New Median:
The number of data points is even (). The middle data are at positions and .
The value at position is , the value at position is .
So, the new median .
- Find the New Mode:
The most frequent values are still () and ().
So, the new mode .
Impact of Adding Data
Let's observe the changes in the measures of central tendency from the initial to the new situation:
- Mean: Changed from to (increased by )
- Median: Changed from to (increased by )
- Mode: Remained and (unchanged)
What can we conclude? Adding two new data points ( and ), whose values are quite far from the initial data, had the most significant impact on the Mean. The Mean value was "pulled" by the larger new data values.
The Median also changed, but its change was not as large as the Mean's. The Mode didn't change at all.
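To see this kind of shift in practice, here is a small Python sketch that compares the three measures before and after two larger values are added. The numbers and the helper name `summarize` are assumptions made for illustration; they are not the OSIS data.

```python
import statistics


def summarize(values):
    """Return (mean, median, modes) for a non-empty list of numbers."""
    mean_value = sum(values) / len(values)
    median_value = statistics.median(values)
    modes = sorted(statistics.multimode(values))
    return mean_value, median_value, modes


# Hypothetical values (illustrative only, not the OSIS data).
initial = [8, 12, 12, 14, 16, 18, 18, 30]
new_items = [28, 32]          # two newcomers with comparatively large donations
updated = initial + new_items

mean_before, median_before, modes_before = summarize(initial)
mean_after, median_after, modes_after = summarize(updated)

print(f"Mean:   {mean_before} -> {mean_after}")      # Mean:   16.0 -> 18.8
print(f"Median: {median_before} -> {median_after}")  # Median: 15.0 -> 17.0
print(f"Modes:  {modes_before} -> {modes_after}")    # Modes:  [12, 18] -> [12, 18]
```

With these made-up values the mean rises by 2.8 while the median rises by 2.0 and the modes stay the same, which mirrors the pattern described above.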
Impact of Extreme Data (Outliers)
What if one of the new data points was very extreme? For example, the th student donated , not .
Data becomes: ()
- Calculate the Extreme Mean:
The mean becomes , which is very far from both the initial mean () and the previous new mean ().
- Find the Extreme Median:
Sorted data:
Median is still . (Same as the previous new situation)
- Find the Extreme Mode:
Mode remains and .
Extreme values (called outliers) strongly affect the Mean but barely affect the Median and Mode. This is the Mean's weakness: it is sensitive to outliers, whereas the Median and Mode stay more stable when outliers appear.
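The sensitivity of the Mean to a single outlier can be checked with a short Python sketch. The data sets below are hypothetical (they are not the OSIS figures): both start from the same values, but in the second one a new value is replaced by an extreme one.

```python
import statistics

# Hypothetical values (illustrative only, not the OSIS data).
base = [8, 12, 12, 14, 16, 18, 18, 30]
typical = base + [28, 32]     # both new values are in a plausible range
extreme = base + [28, 300]    # one new value is an outlier

for label, data in (("typical", typical), ("extreme", extreme)):
    mean_value = sum(data) / len(data)
    median_value = statistics.median(data)
    modes = sorted(statistics.multimode(data))
    print(label, mean_value, median_value, modes)

# typical 18.8 17.0 [12, 18]
# extreme 45.6 17.0 [12, 18]
```

In this sketch the outlier more than doubles the mean, while the median and the modes are identical in both data sets.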
So, the Mean is a simple arithmetic average, but we need to be careful if there are data points with values very different from the rest, as they can make the Mean less representative.