# Nakafa Learning Content

URL: https://nakafa.com/en/subjects/mathematics/statistics-foundations/percentile-data-group
Source: https://raw.githubusercontent.com/nakafaai/nakafa.com/refs/heads/main/packages/contents/material/lesson/mathematics/statistics-foundations/percentile-data-group/en.mdx

Calculate percentiles in grouped data using interpolation formulas. Learn to find data positions and interpret percentile rankings with examples.

---

## What Are Percentiles?

You're already familiar with [quartiles](/en/subjects/mathematics/statistics-foundations/quartile-data-group), which divide data into $$4$$ equal parts, right? Well, **percentiles** are like quartiles' sibling, but they're even more detailed!

If quartiles divide data into $$4$$ chunks, percentiles divide ordered data into $$100$$ equal chunks. That's a lot, huh? Like dividing a chocolate bar into $$100$$ tiny squares.

Each chunk is separated by a percentile value. There are $$99$$ percentile values, starting from $$P_1$$, $$P_2$$, $$P_3$$, ..., up to $$P_{99}$$.

- $$P_{10}$$ ($$10$$th Percentile) means this value separates the smallest
  $$10\%$$ of the data from the remaining $$90\%$$.
- $$P_{50}$$ ($$50$$th Percentile) is exactly the same as the **Median**
  or the **Second Quartile ($$Q_2$$)**, because it divides the data right in the middle ($$50\%$$ below, $$50\%$$ above).
- $$P_{85}$$ ($$85$$th Percentile) means this value separates the smallest
  $$85\%$$ of the data from the largest $$15\%$$.

Percentiles are very useful for seeing the position of a specific value relative to the entire dataset, like test-score rankings in one group or a child's growth compared to peers of the same age.

## How to Find Percentile Values for Grouped Data

Just like finding quartiles for grouped data, we also use **interpolation** to find the value of a percentile ($$P_i$$) when the data is grouped.

The steps are very similar:

### Find the Percentile Class Position

First, we determine which data point corresponds to the i-th percentile. The formula is:

```math
\text{Position of } P_i = \text{the } \frac{i}{100} \times n \text{-th data point}
```

- $$i$$ = Which percentile are we looking for? (e.g., $$10, 50, 85$$)
- $$n$$ = Total number of data points

Once we have the position, we look at the cumulative frequency table ($$F_k$$) to find out which class interval this percentile falls into.
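
These two steps (compute the position, then scan the cumulative frequencies) can be sketched in a few lines of Python. The function names and the frequency list below are made up for illustration and are not part of the lesson's data.

```python
from itertools import accumulate


def percentile_position(i, n):
    """Position of P_i in the ordered data: (i / 100) * n."""
    return i * n / 100


def find_percentile_class(frequencies, position):
    """Index of the first class whose cumulative frequency reaches the position."""
    for index, cumulative in enumerate(accumulate(frequencies)):
        if cumulative >= position:
            return index
    raise ValueError("position is larger than the total frequency")


# Hypothetical class frequencies (illustration only), n = 30
frequencies = [5, 8, 12, 5]
n = sum(frequencies)

position = percentile_position(40, n)                     # position of P_40
class_index = find_percentile_class(frequencies, position)
print(position, class_index)
```

In this sketch, a position that lands exactly on a class's cumulative frequency is assigned to that class, so the interpolation would return that class's upper boundary.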

### Calculate the Percentile Value using the Interpolation Formula

Once we know the class, we use this magic interpolation formula:

```math
P_i = T_b + \left( \frac{\frac{i}{100}n - F_{kum}}{f_i} \right) p
```

Where:

- $$P_i$$ = Value of the i-th Percentile (what we're looking for)
- $$T_b$$ = Lower boundary of the i-th percentile class
- $$i$$ = Which percentile (e.g., $$10, 85$$)
- $$n$$ = Total frequency
- $$F_{kum}$$ = Cumulative frequency **BEFORE** the i-th percentile
  class (the $$F_k$$ of the previous class, or $$0$$ if the percentile falls in the first class)
- $$f_i$$ = Frequency of the i-th percentile class
- $$p$$ = Class width

Notice that the formula is very similar to the quartile formula. The only difference is the $$\frac{i}{100}n$$ part (quartiles use $$\frac{i}{4}n$$).
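
If you'd like to see the interpolation formula as code, here is a minimal Python sketch. The function name `grouped_percentile`, its parameters, and the sample data are assumptions made for illustration. It also assumes every class has the same width $$p$$.

```python
def grouped_percentile(i, lower_boundaries, frequencies, class_width):
    """Estimate P_i for grouped data: T_b + ((i/100 * n - F_kum) / f_i) * p."""
    n = sum(frequencies)
    position = i * n / 100          # (i / 100) * n
    cumulative_before = 0           # F_kum: cumulative frequency before the class
    for lower, freq in zip(lower_boundaries, frequencies):
        # The percentile class is the first non-empty class whose
        # cumulative frequency reaches the position.
        if freq > 0 and cumulative_before + freq >= position:
            return lower + (position - cumulative_before) / freq * class_width
        cumulative_before += freq
    raise ValueError("i must be between 0 and 100")


# Hypothetical grouped data (illustration only)
lower_boundaries = [9.5, 19.5, 29.5, 39.5]   # T_b of each class
frequencies = [5, 8, 12, 5]                  # f of each class
p40 = grouped_percentile(40, lower_boundaries, frequencies, class_width=10)
print(p40)  # 19.5 + ((12 - 5) / 8) * 10 = 28.25
```

Swapping `i * n / 100` for `i * n / 4` would turn the same sketch into the quartile version.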

## Finding Math Test Scores

For example, let's say we have the math test scores of $$40 \text{ students}$$:

| Test Score | Frequency ($$f$$) | Cumulative Frequency ($$F_k$$) | Lower Boundary ($$T_b$$) | Class Width ($$p$$) |
| :--------: | :---------------------------------: | :----------------------------------------------: | :----------------------------------------: | :-----------------------------------: |
| $$61\text{-}70$$  | $$4$$  | $$4$$  | $$60.5$$ | $$10$$ |
| $$71\text{-}80$$  | $$10$$ | $$14$$ | $$70.5$$ | $$10$$ |
| $$81\text{-}90$$  | $$16$$ | $$30$$ | $$80.5$$ | $$10$$ |
| $$91\text{-}100$$ | $$10$$ | $$40$$ | $$90.5$$ | $$10$$ |
| **Total**  | $$40$$ |                                                  |                                            |                                       |
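
If you only have the score ranges and their frequencies, the other three columns can be derived from them. Here is a short Python sketch, assuming whole-number scores so that each lower boundary sits $$0.5$$ below the lower class limit (the variable names are illustrative):

```python
from itertools import accumulate

# Class limits and frequencies taken from the table above
classes = [(61, 70), (71, 80), (81, 90), (91, 100)]
frequencies = [4, 10, 16, 10]

cumulative = list(accumulate(frequencies))               # F_k column
lower_boundaries = [low - 0.5 for low, _ in classes]     # T_b column
widths = [high - low + 1 for low, high in classes]       # p column

for row in zip(classes, frequencies, cumulative, lower_boundaries, widths):
    print(row)
```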

We want to find the value of the $$85$$th Percentile ($$P_{85}$$).

1.  **Find the Position of $$P_{85}$$:**

    The position of $$P_{85}$$ is the $$\frac{85}{100} \times 40 = \frac{3400}{100} = 34$$-th data point.

2.  **Determine the Class of $$P_{85}$$:**

    Look at the $$F_k$$ column. Where is the $$34$$th data point? The $$81\text{-}90$$ class has $$F_k = 30$$ (not enough). The $$91\text{-}100$$ class has $$F_k = 40$$ (data points $$31$$ through $$40$$ are here). So, the $$P_{85}$$ class is $$91\text{-}100$$.

3.  **Gather Ingredients for the Formula:**

    - $$T_b$$ (Lower boundary of class $$91\text{-}100$$) = $$90.5$$
    - $$i = 85$$
    - $$n = 40$$
    - $$F_{kum}$$ (Cumulative frequency before class $$91\text{-}100$$) = $$30$$
    - $$f_{85}$$ (Frequency of class $$91\text{-}100$$) = $$10$$
    - $$p$$ (Class width) = $$10$$

4.  **Calculate $$P_{85}$$:**

    <MathContainer>
      
    
    ```math
    P_{85} = T_b + \left( \frac{\frac{85}{100}n - F_{kum}}{f_{85}} \right) p
    ```

      
    
    ```math
    P_{85} = 90.5 + \left( \frac{34 - 30}{10} \right) 10
    ```

      
    
    ```math
    P_{85} = 90.5 + \left( \frac{4}{10} \right) 10
    ```

      
    
    ```math
    P_{85} = 90.5 + 4
    ```

      
    
    ```math
    P_{85} = 94.5
    ```

    </MathContainer>

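So, the $$85$$th Percentile value is $$94.5$$. This means about $$85\%$$ of the students scored $$94.5$$ or less, and about $$15\%$$ scored above $$94.5$$. It's an estimate, because interpolation assumes the scores are spread evenly within each class.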
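
One way to double-check this interpretation is to run the interpolation backwards: start from the score $$94.5$$ and estimate how many students fall at or below it. The Python sketch below does that under the same even-spread assumption. The function name `estimated_count_at_or_below` is made up for illustration.

```python
def estimated_count_at_or_below(value, lower_boundaries, frequencies, class_width):
    """Estimate how many data points are <= value, assuming an even spread in each class."""
    count = 0.0
    for lower, freq in zip(lower_boundaries, frequencies):
        upper = lower + class_width
        if value >= upper:
            count += freq                                    # whole class is below
        elif value > lower:
            count += (value - lower) * freq / class_width    # part of the class
    return count


# Data from the math test score table
lower_boundaries = [60.5, 70.5, 80.5, 90.5]
frequencies = [4, 10, 16, 10]

count = estimated_count_at_or_below(94.5, lower_boundaries, frequencies, 10)
share = count * 100 / sum(frequencies)
print(count, share)  # 34.0 85.0
```

The estimate lands on the $$34$$th data point, which is $$85\%$$ of $$40$$ students, so it agrees with the position we started from.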

## Exercise

Try calculating the value of the $$20$$th Percentile ($$P_{20}$$) from the math test score data above!

### Answer Key

1.  **Position of $$P_{20}$$:**

    The position of $$P_{20}$$ is the $$\frac{20}{100} \times 40 = \frac{800}{100} = 8$$-th data point.

2.  **Class of $$P_{20}$$:**

    Look at $$F_k$$. The $$8$$th data point is in the $$71\text{-}80$$ class (because the previous class's $$F_k$$ is $$4$$, and this class's $$F_k$$ is $$14$$).

3.  **Formula Ingredients:**

    - $$T_b = 70.5$$
    - $$i = 20$$
    - $$n = 40$$
    - $$F_{kum} = 4$$
    - $$f_{20} = 10$$
    - $$p = 10$$

4.  **Calculate $$P_{20}$$:**

    <MathContainer>
      
    
    ```math
    P_{20} = T_b + \left( \frac{\frac{20}{100}n - F_{kum}}{f_{20}} \right) p
    ```

      
    
    ```math
    P_{20} = 70.5 + \left( \frac{8 - 4}{10} \right) 10
    ```

      
    
    ```math
    P_{20} = 70.5 + \left( \frac{4}{10} \right) 10
    ```

      
    
    ```math
    P_{20} = 70.5 + 4
    ```

      
    
    ```math
    P_{20} = 74.5
    ```

    </MathContainer>

The $$20$$th Percentile value is $$74.5$$.
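
To check your hand calculation, you can run the same steps on the test score table in Python. This is only a sketch: the helper name `percentile` is an assumption, and it hard-codes the table's columns.

```python
# Columns from the math test score table
lower_boundaries = [60.5, 70.5, 80.5, 90.5]   # T_b
frequencies = [4, 10, 16, 10]                 # f
class_width = 10                              # p
n = sum(frequencies)                          # 40 students


def percentile(i):
    """Estimate P_i for the test score table by interpolation."""
    position = i * n / 100
    before = 0                                # F_kum so far
    for lower, freq in zip(lower_boundaries, frequencies):
        if before + freq >= position:
            return lower + (position - before) * class_width / freq
        before += freq
    raise ValueError("i must be between 0 and 100")


results = {i: percentile(i) for i in (20, 50, 85)}
print(results)  # {20: 74.5, 50: 84.25, 85: 94.5}
```

The values for $$P_{20}$$ and $$P_{85}$$ match the worked answers, and $$P_{50}$$ gives the estimated median of the same data.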
