What Is the Interquartile Range (IQR)?

We already know how to find the mean of data. But sometimes, the mean alone isn't enough to describe the data. Imagine two groups of friends with the same average age, but when you look closer, the age spread within each group is very different.

This is where the Interquartile Range (IQR) comes in handy! The IQR is a measure of data spread that focuses on the middle 50% of the data after it has been sorted.

Why focus on the middle? Sometimes data contains extreme values (unusually small or unusually large, called outliers) that distort other measures of spread, such as the Range, which depends only on the minimum and maximum. The IQR is more "resistant" to these extreme values because it ignores the lowest 25% and the highest 25% of the data.

The IQR formula is super simple:

$$IQR = Q_3 - Q_1$$

  • $Q_3$ is the Upper Quartile (the value marking the bottom 75% of the data).
  • $Q_1$ is the Lower Quartile (the value marking the bottom 25% of the data).

So, the IQR is the difference between the upper and lower quartiles.
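To illustrate why the IQR is called resistant, here is a minimal sketch. The two age lists are hypothetical (they are not the groups used later), and the quartiles come from Python's `statistics.quantiles` with its default settings, which may differ slightly from a hand calculation.

```python
from statistics import quantiles

def iqr(values):
    """IQR = Q3 - Q1, using the standard library's default quartile method."""
    q1, _median, q3 = quantiles(values, n=4)
    return q3 - q1

def value_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)

# Hypothetical ages, invented for this illustration only.
ages = [14, 15, 15, 16, 16, 17, 17, 18]
# Same data, but the oldest person is replaced by an outlier aged 60.
ages_with_outlier = [14, 15, 15, 16, 16, 17, 17, 60]

print(value_range(ages), iqr(ages))                            # 4 2.0
print(value_range(ages_with_outlier), iqr(ages_with_outlier))  # 46 2.0
```

A single extreme value moves the Range from 4 to 46, while the IQR stays at 2 because the middle 50% of the data did not change.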

Comparing Age Spreads

To make it clearer, let's look at an example.

There are two groups, each with 12 people. Let's look at their age data in the table below:

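| Data Point # | Group One | Group Two |
|---|---|---|
| 1 | 13 | 1 |
| 2 | 14 | 3 |
| 3 | 15 | 4 |
| 4 | 15 | 5 |
| 5 | 16 | 7 |
| 6 | 16 | 8 |
| 7 | 17 | 12 |
| 8 | 17 | 27 |
| 9 | 17 | 28 |
| 10 | 17 | 29 |
| 11 | 17 | 32 |
| 12 | 18 | 36 |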

Let's calculate some statistics for these two groups.

Calculating Mean, Q1, and Q3

  1. Mean (Average):

    If you calculate the average age for both groups, the result is exactly the same, 16 years old.

    Group 1:

    $$\frac{13+14+15+15+16+16+17+17+17+17+17+18}{12} = \frac{192}{12} = 16$$

    Group 2:

    $$\frac{1+3+4+5+7+8+12+27+28+29+32+36}{12} = \frac{192}{12} = 16$$
  2. Quartiles ($Q_1$ and $Q_3$):

    After sorting the data (as in the table above), we find the positions of the quartiles.

    Group One ($n=12$): The position of $Q_1$ is at data point $\frac{1}{4}(12+1) = 3.25$. This means $Q_1$ is between the 3rd (15) and 4th (15) data points. Since both values are the same,

    $$Q_1 = 15$$

    The position of $Q_3$ is at data point $\frac{3}{4}(12+1) = 9.75$. This means $Q_3$ is between the 9th (17) and 10th (17) data points. Since both values are the same,

    $$Q_3 = 17$$

    Group Two ($n=12$): The position of $Q_1$ is again at data point 3.25. This means $Q_1$ is between the 3rd (4) and 4th (5) data points. As a simplification, we take the average of these two data points (strict linear interpolation at position 3.25 would give 4.25 instead):

    $$Q_1 = \frac{4+5}{2} = 4.5$$

    The position of $Q_3$ is at data point 9.75. This means $Q_3$ is between the 9th (28) and 10th (29) data points. Similarly, we take their average (strict interpolation would give 28.75):

    $$Q_3 = \frac{28+29}{2} = 28.5$$
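The mean and quartile steps above can be sketched in Python. This is a minimal illustration, not a general-purpose implementation: the function names `mean` and `quartile` are chosen here for the example, and `quartile` follows the text's simplified rule of averaging the two neighbouring data points whenever the position is fractional.

```python
import math

group_one = [13, 14, 15, 15, 16, 16, 17, 17, 17, 17, 17, 18]
group_two = [1, 3, 4, 5, 7, 8, 12, 27, 28, 29, 32, 36]

def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)

def quartile(values, fraction):
    """Quartile via the (n + 1) position rule used in the text.

    fraction is 0.25 for Q1 and 0.75 for Q3. When the position falls
    between two data points, their average is returned (the text's
    simplification, not strict linear interpolation).
    Assumes at least 3 values so both neighbours exist.
    """
    data = sorted(values)
    position = fraction * (len(data) + 1)  # 1-based, e.g. 3.25 for n = 12
    lower = math.floor(position)
    if position == lower:
        return float(data[lower - 1])      # position lands exactly on a data point
    return (data[lower - 1] + data[lower]) / 2

for name, group in (("Group One", group_one), ("Group Two", group_two)):
    print(name, mean(group), quartile(group, 0.25), quartile(group, 0.75))
# Group One 16.0 15.0 17.0
# Group Two 16.0 4.5 28.5
```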

Calculating Range and IQR

Now let's calculate the measures of spread.

  1. Range:

    Range = Maximum Value - Minimum Value

    • Range Group One = 18 - 13 = 5

    • Range Group Two = 36 - 1 = 35

      Wow, the ranges are very different! Group Two's data is much more spread out when looking at the extreme values.

  2. Interquartile Range (IQR):

    IQR = $Q_3 - Q_1$

    • IQR Group One = 17 - 15 = 2

    • IQR Group Two = 28.5 - 4.5 = 24

      The IQRs are also very different!
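Both measures of spread can be checked with a short sketch. As an assumption for this example, the quartiles are taken as the medians of the lower and upper halves of the sorted data. That is a different convention from the position rule used earlier, but it gives the same quartiles for these two groups. The function names are illustrative.

```python
from statistics import median

group_one = [13, 14, 15, 15, 16, 16, 17, 17, 17, 17, 17, 18]
group_two = [1, 3, 4, 5, 7, 8, 12, 27, 28, 29, 32, 36]

def value_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)

def iqr_from_halves(values):
    """IQR with Q1 and Q3 taken as medians of the lower and upper halves.

    For an odd number of values the middle value is left out of both halves.
    Assumes at least 2 values.
    """
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])    # median of the lower half
    q3 = median(data[-half:])   # median of the upper half
    return q3 - q1

print(value_range(group_one), iqr_from_halves(group_one))  # 5 2.0
print(value_range(group_two), iqr_from_halves(group_two))  # 35 24.0
```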

Interpreting the Results

Let's summarize in a table for easy comparison:

| Group | Mean | $Q_1$ | $Q_3$ | Range | Interquartile Range (IQR) |
|---|---|---|---|---|---|
| One | 16 | 15 | 17 | 5 | 2 |
| Two | 16 | 4.5 | 28.5 | 35 | 24 |

  • Both groups have the same Mean (average) age, which is 16.
  • But their Range and IQR are very different.
  • Group One has a small Range (5) and a very small IQR (2). This means the ages of people in Group One are very close together, especially the middle 50%, whose ages only differ by 2 years ($Q_1=15, Q_3=17$). The data is clustered around the mean.
  • Group Two has a large Range (35) and a large IQR (24). This means the ages of people in Group Two are much more spread out. Even the middle 50% spans 24 years ($Q_1=4.5, Q_3=28.5$). The data is not as clustered as Group One.
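Software may report slightly different quartiles than a hand calculation, because several quartile conventions exist. As a hedged illustration, the sketch below runs Python's `statistics.quantiles` on Group Two with both of its built-in methods. Neither matches the averaging shortcut used above exactly, but the IQR stays close to 24 and the conclusion about the spread does not change.

```python
from statistics import quantiles

group_two = [1, 3, 4, 5, 7, 8, 12, 27, 28, 29, 32, 36]

# Collect (Q1, Q3, IQR) for each interpolation method the library offers.
results = {}
for method in ("exclusive", "inclusive"):
    q1, _median, q3 = quantiles(group_two, n=4, method=method)
    results[method] = (q1, q3, q3 - q1)
    print(method, q1, q3, q3 - q1)
# exclusive 4.25 28.75 24.5
# inclusive 4.75 28.25 23.5
```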

Even if the average is the same, the spread of the data can be very different. The IQR shows how spread out the middle 50% of the data is, giving a better picture of data variation than the mean or the range alone, especially when there are extreme values.