# Nakafa Learning Content

> For AI agents: use [llms.txt](https://nakafa.com/llms.txt) for the site index. Markdown versions are available by appending `.md` to content URLs or sending `Accept: text/markdown`.

URL: https://nakafa.com/en/subjects/mathematics/statistics-foundations/quartile-data-group
Source: https://raw.githubusercontent.com/nakafaai/nakafa.com/refs/heads/main/packages/contents/material/lesson/mathematics/statistics-foundations/quartile-data-group/en.mdx

Learn how to calculate quartiles for grouped data by interpolation, locating the positions of Q1, Q2, and Q3 with cumulative frequencies and class boundaries.

---

## How to Find Quartiles in Grouped Data

For single (ungrouped) data, we just sort it and locate each quartile's position directly. When the data is grouped in a frequency table (for example, test scores grouped as $$70\text{-}79$$, $$80\text{-}89$$, and so on), the method is slightly different: we no longer know the exact value of each data point, only how many data points fall in each group (class interval).

Similar to the [median for grouped data](/en/subjects/mathematics/statistics-foundations/median-mode-group-data), to find quartiles ($$Q_1$$, $$Q_2$$, $$Q_3$$), we also use **interpolation**. Essentially, we "estimate" the quartile's position within the class interval where it falls.

We determine the position of each quartile with these formulas:

- The position of $$Q_1$$ is the $$\frac{1}{4}n$$-th data point
- The position of $$Q_2$$ is the $$\frac{2}{4}n$$-th data point (or $$\frac{1}{2}n$$-th)
- The position of $$Q_3$$ is the $$\frac{3}{4}n$$-th data point

Where $$n$$ is the total number of data points.
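
The position formulas can be sketched in a few lines of Python. This is only an illustration: the function name `quartile_position` is hypothetical, and $$n = 30$$ is the total used in the shoe sales example.

```python
def quartile_position(i: int, n: int) -> float:
    """Return the position (i/4) * n of the i-th quartile among n data points."""
    if i not in (1, 2, 3):
        raise ValueError("i must be 1, 2, or 3")
    return i * n / 4


# n = 30 is the total used in the shoe sales example.
for i in (1, 2, 3):
    print(f"Q{i} position:", quartile_position(i, 30))
```

Note that the position does not have to be a whole number. It only tells us where in the cumulative count the quartile sits.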

## Steps to Find the Value of Quartiles for Grouped Data

Let's assume we have shoe sales data from Store A in a grouped frequency table format.

### Create a Cumulative Frequency Table

First, we need a frequency table with a cumulative frequency column ($$F_k$$). The cumulative frequency of a class is the sum of the frequencies from the first class up to and including that class. We need it to see which class each quartile falls into.

For example, here is the shoe sales table:

| Shoe Size | Frequency ($$f$$) | Cumulative Frequency ($$F_k$$) | Lower Boundary ($$T_b$$) | Upper Boundary ($$T_a$$) | Class Width ($$p$$) |
| :-------: | :---------------------------------: | :----------------------------------------------: | :----------------------------------------: | :----------------------------------------: | :-----------------------------------: |
| $$37\text{-}39$$ | $$2$$  | $$2$$  | $$36.5$$ | $$39.5$$ | $$3$$ |
| $$40\text{-}42$$ | $$11$$ | $$13$$ | $$39.5$$ | $$42.5$$ | $$3$$ |
| $$43\text{-}45$$ | $$10$$ | $$23$$ | $$42.5$$ | $$45.5$$ | $$3$$ |
| $$46\text{-}48$$ | $$5$$  | $$28$$ | $$45.5$$ | $$48.5$$ | $$3$$ |
| $$49\text{-}51$$ | $$2$$  | $$30$$ | $$48.5$$ | $$51.5$$ | $$3$$ |
| **Total** | $$30$$ |                                                  |                                            |                                            |                                       |

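The boundary and class width columns are computed from the class limits (the $$0.5$$ adjustment assumes whole-number limits, as in this table):

$$\text{Lower boundary} = \text{lower limit} - 0.5$$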

$$\text{Upper boundary} = \text{upper limit} + 0.5$$

$$\text{Class width} = \text{upper boundary} - \text{lower boundary}$$
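
As a rough sketch, the extra columns of the table can be derived from the class limits and frequencies alone. The variable names below are illustrative assumptions; the numbers are copied from the shoe sales table.

```python
from itertools import accumulate

# Class limits and frequencies copied from the shoe sales table.
lower_limits = [37, 40, 43, 46, 49]
upper_limits = [39, 42, 45, 48, 51]
frequencies = [2, 11, 10, 5, 2]

# A running total of the frequencies gives the F_k column.
cumulative = list(accumulate(frequencies))

# Boundaries sit 0.5 outside the limits because the limits are whole numbers.
lower_boundaries = [lo - 0.5 for lo in lower_limits]
upper_boundaries = [hi + 0.5 for hi in upper_limits]
widths = [ta - tb for tb, ta in zip(lower_boundaries, upper_boundaries)]

print(cumulative)        # [2, 13, 23, 28, 30]
print(lower_boundaries)  # [36.5, 39.5, 42.5, 45.5, 48.5]
print(widths)            # [3.0, 3.0, 3.0, 3.0, 3.0]
```

The last cumulative frequency should always equal the total $$n$$, which is a quick way to check the table.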

### Determine the Quartile Class Position

Next, let's find the position of each quartile.

The total number of data points ($$n$$) is $$30$$.

- **Position of $$Q_1$$:** the $$\frac{1}{4} \times 30 = 7.5$$-th data point.

  Look at the $$F_k$$ column. Which class contains the $$7.5$$th data point? The first class has $$F_k = 2$$ (not enough). The second class has $$F_k = 13$$ (data points $$3$$ through $$13$$ are here). So, the $$7.5$$th data point is in the $$40\text{-}42$$ class.

- **Position of $$Q_2$$ (Median):** the $$\frac{1}{2} \times 30 = 15$$-th data point.

  Look at $$F_k$$. The $$15$$th data point is in the $$43\text{-}45$$ class (because the previous $$F_k$$ was $$13$$, and this class's $$F_k$$ is $$23$$).

- **Position of $$Q_3$$:** the $$\frac{3}{4} \times 30 = 22.5$$-th data point.

  Look at $$F_k$$. The $$22.5$$th data point is also in the $$43\text{-}45$$ class (because the previous $$F_k$$ was $$13$$, and this class's $$F_k$$ is $$23$$).
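
The lookup in the $$F_k$$ column can also be expressed as a search for the first class whose cumulative frequency reaches the position. The sketch below uses Python's standard `bisect_left` for that; the helper name `quartile_class_index` is hypothetical.

```python
from bisect import bisect_left

class_labels = ["37-39", "40-42", "43-45", "46-48", "49-51"]
cumulative = [2, 13, 23, 28, 30]  # F_k column from the shoe sales table


def quartile_class_index(position: float, cumulative: list[int]) -> int:
    """Index of the first class whose cumulative frequency reaches the position."""
    return bisect_left(cumulative, position)


for name, position in [("Q1", 7.5), ("Q2", 15), ("Q3", 22.5)]:
    index = quartile_class_index(position, cumulative)
    print(name, "is in class", class_labels[index])
```

This relies on the cumulative frequencies being in increasing order, which holds for any frequency table with non-empty classes.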

### Calculate the Quartile Value using the Interpolation Formula

Once we know the class, we use this formula to estimate the quartile's value:

```math
Q_i = T_b + \left( \frac{\frac{i}{4}n - F_{kum}}{f_i} \right) p
```

Where:

- $$Q_i$$ = Value of the i-th Quartile (what we're looking for)
- $$T_b$$ = Lower boundary of the i-th quartile class
- $$n$$ = Total frequency
- $$F_{kum}$$ = Cumulative frequency **BEFORE** the i-th quartile class (the $$F_k$$ of the preceding class)
- $$f_i$$ = Frequency of the i-th quartile class
- $$p$$ = Class width
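
A minimal sketch of the interpolation formula as a Python function follows. The function and parameter names are assumptions chosen to mirror the symbols above, and `position` stands for $$\frac{i}{4}n$$. The sample inputs are the $$Q_1$$ values read from the shoe sales table.

```python
def interpolate_quartile(tb: float, position: float, f_kum: float, f: float, p: float) -> float:
    """Q_i = T_b + ((position - F_kum) / f) * p, where position = (i/4) * n."""
    if f <= 0:
        raise ValueError("the quartile class must have a positive frequency")
    return tb + (position - f_kum) / f * p


# Q1 inputs read from the shoe sales table: class 40-42, position 7.5.
q1 = interpolate_quartile(tb=39.5, position=7.5, f_kum=2, f=11, p=3)
print(q1)
```

The fraction `(position - f_kum) / f` is the share of the quartile class that lies below the quartile, so the result always stays between the class's lower and upper boundaries.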

## Finding the Shoe Sales Quartile

Let's calculate $$Q_1$$ from the table above.

1.  **Position of $$Q_1$$:** $$7.5$$th data point.
2.  **Class of $$Q_1$$:** $$40\text{-}42$$.
3.  **Let's gather the ingredients:**
    - Lower boundary of $$Q_1$$ class ($$T_b$$) is $$39.5$$
    - Total data ($$n$$) is $$30$$
    - Cumulative frequency before $$Q_1$$ class ($$F_{kum}$$) is $$2$$ (see $$F_k$$ for class $$37\text{-}39$$)
    - Frequency of $$Q_1$$ class ($$f_1$$) is $$11$$
    - Class width ($$p$$) is $$3$$
4.  **Plug into the formula:**

    <MathContainer>
      
    
    ```math
    Q_1 = T_b + \left( \frac{\frac{1}{4}n - F_{kum}}{f_1} \right) p
    ```

      
    
    ```math
    Q_1 = 39.5 + \left( \frac{7.5 - 2}{11} \right) 3
    ```

      
    
    ```math
    Q_1 = 39.5 + \left( \frac{5.5}{11} \right) 3
    ```

      
    
    ```math
    Q_1 = 39.5 + (0.5) \times 3
    ```

      
    
    ```math
    Q_1 = 39.5 + 1.5
    ```

      
    
    ```math
    Q_1 = 41
    ```

    </MathContainer>

So, the value of $$Q_1$$ is $$41$$. This means about $$25\%$$ of the shoes sold are size $$41$$ or smaller.
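
The same procedure applies to $$Q_2$$. Its position is the $$15$$th data point, which falls in the $$43\text{-}45$$ class, so $$T_b = 42.5$$, $$F_{kum} = 13$$, $$f_2 = 10$$, and $$p = 3$$:

```math
Q_2 = 42.5 + \left( \frac{15 - 13}{10} \right) 3 = 42.5 + 0.6 = 43.1
```

So the estimated median shoe size is $$43.1$$.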

## Exercise

Try calculating $$Q_3$$ from the shoe sales data in the table above.

After getting the result, compare it with the method for finding quartiles for single data learned earlier. What's the difference, and why might the results be similar or different?

### Answer Key

1.  **Position of $$Q_3$$:** $$22.5$$th data point.
2.  **Class of $$Q_3$$:** $$43\text{-}45$$.
3.  **Gather the ingredients:**

    - $$T_b = 42.5$$ (lower boundary of $$Q_3$$ class)
    - $$n = 30$$ (total data)
    - $$F_{kum} = 13$$ (see $$F_k$$ for class $$40\text{-}42$$)
    - $$f_3 = 10$$ (frequency of $$Q_3$$ class)
    - $$p = 3$$ (class width)

4.  **Plug into the formula:**

    <MathContainer>
      
    
    ```math
    Q_3 = T_b + \left( \frac{\frac{3}{4}n - F_{kum}}{f_3} \right) p
    ```

      
    
    ```math
    Q_3 = 42.5 + \left( \frac{22.5 - 13}{10} \right) 3
    ```

      
    
    ```math
    Q_3 = 42.5 + \left( \frac{9.5}{10} \right) 3
    ```

      
    
    ```math
    Q_3 = 42.5 + (0.95) \times 3
    ```

      
    
    ```math
    Q_3 = 42.5 + 2.85
    ```

      
    
    ```math
    Q_3 = 45.35
    ```

    </MathContainer>

So, the value of $$Q_3$$ is $$45.35$$. This means about $$75\%$$ of the shoes sold are size $$45.35$$ or smaller (or $$25\%$$ are sold in sizes larger than $$45.35$$).

**Comparison with Single Data:**

Finding quartiles for grouped data uses **interpolation** because we don't know the exact value of each data point, only its range. The result is an estimated quartile value.

For single data, we can point directly to the data point that is the quartile (or take the average of two data points), so the result reflects the actual values rather than an estimate. Quartiles for grouped data still give a good overview of large datasets that are only available in grouped form.