# Nakafa Learning Content

URL: https://nakafa.com/en/subjects/mathematics/statistics-foundations/variance-standard-deviation-data-group
Source: https://raw.githubusercontent.com/nakafaai/nakafa.com/refs/heads/main/packages/contents/material/lesson/mathematics/statistics-foundations/variance-standard-deviation-data-group/en.mdx

Learn to calculate variance and standard deviation for grouped frequency data, using formulas based on class intervals and a worked example.

---

## Calculating Spread for Grouped Data

How do we measure the spread of data presented in a grouped frequency table? For example, consider data on phone battery duration grouped into intervals of hours ($$6\text{-}10 \text{ hours}$$, $$11\text{-}15 \text{ hours}$$, etc.).

Since we don't know the exact value of each data point within a class interval (e.g., in the $$11\text{-}15 \text{ hours}$$ class, we don't know if the duration was exactly $$11 \text{ hours}$$, $$12 \text{ hours}$$, or something else), we need to make an assumption.

The most common assumption is that all data within a class interval are evenly distributed. Therefore, we can **represent** all data in that class using the **midpoint** ($$x_i$$) of that class.
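
As a minimal sketch of this idea, the midpoint of a class can be computed as the average of its lower and upper limits. The function name `midpoint` and the variable names are illustrative choices, and the two classes are the ones mentioned at the start of this section.

```python
# Two of the class intervals named in the text: 6-10 hours and 11-15 hours.
class_limits = [(6, 10), (11, 15)]


def midpoint(lower, upper):
    """Return the class midpoint: the average of the lower and upper limits."""
    return (lower + upper) / 2


# Each class is represented by a single value, its midpoint x_i.
midpoints = [midpoint(lower, upper) for lower, upper in class_limits]
print(midpoints)  # [8.0, 13.0]
```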

## Formulas for Variance and Standard Deviation of Grouped Data

Using the midpoint ($$x_i$$) and frequency ($$f$$) of each class, the formulas differ slightly from those for ungrouped data:

1.  **Variance ($$\sigma^2$$)**
    The commonly used (and easier to compute) formula is the computational formula adapted for grouped data:

    
    
    ```math
    \sigma^2 = \frac{\sum (f \cdot x_i^2)}{\sum f} - \left( \frac{\sum (f \cdot x_i)}{\sum f} \right)^2
    ```

    This formula essentially calculates the average of the squared midpoints weighted by frequency, minus the square of the average midpoint weighted by frequency (the mean of the grouped data).

2.  **Standard Deviation ($$\sigma$$)**
    Just like with ungrouped data, the standard deviation is the square root of the variance:

    
    
    ```math
    \sigma = \sqrt{\sigma^2}
    ```
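
The two formulas above could be sketched in Python roughly as follows. The function names `grouped_variance` and `grouped_std` are hypothetical, and the small data set (midpoints 2, 4, 6 with frequencies 1, 2, 1) is made up purely for illustration; it is not part of the lesson's example.

```python
import math


def grouped_variance(midpoints, frequencies):
    """Variance of grouped data using the computational formula.

    Computes sum(f * x^2) / sum(f) - (sum(f * x) / sum(f))^2.
    """
    n = sum(frequencies)
    mean_of_squares = sum(f * x**2 for x, f in zip(midpoints, frequencies)) / n
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / n
    return mean_of_squares - mean**2


def grouped_std(midpoints, frequencies):
    """Standard deviation: the square root of the grouped variance."""
    return math.sqrt(grouped_variance(midpoints, frequencies))


# Illustrative (made-up) data: midpoints 2, 4, 6 with frequencies 1, 2, 1.
example_midpoints = [2, 4, 6]
example_frequencies = [1, 2, 1]
print(grouped_variance(example_midpoints, example_frequencies))  # 2.0
```

Here the weighted mean of the squares is 72 / 4 = 18 and the weighted mean is 16 / 4 = 4, so the variance is 18 - 16 = 2.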

## Calculating Variance and Standard Deviation of Phone Battery Duration

Suppose a study on phone battery duration yielded the following data:

| Battery duration (hours) | Frequency ($$f$$) |
| :----------------------: | :---------------------------------: |
| $$6\text{-}10$$  | $$2$$  |
| $$11\text{-}15$$ | $$10$$ |
| $$16\text{-}20$$ | $$18$$ |
| $$21\text{-}25$$ | $$45$$ |
| $$26\text{-}30$$ | $$5$$  |

Let's determine the variance and standard deviation for this battery duration data.

### Create a Helper Table

We need to calculate the midpoint ($$x_i$$) for each class, then compute $$f \cdot x_i$$ and $$f \cdot x_i^2$$.

| Battery duration (hours) |   Midpoint, $$x_i$$    |  Frequency, $$f$$   |     $$f \cdot x_i$$      |      $$f \cdot x_i^2$$      |
| :----------------------: | :--------------------------------------: | :-----------------------------------: | :----------------------------------------: | :-------------------------------------------: |
| $$6\text{-}10$$  |  $$\frac{6+10}{2}=8$$  | $$2$$  |   $$2 \times 8 = 16$$    |   $$2 \times 8^2 = 128$$    |
| $$11\text{-}15$$ | $$\frac{11+15}{2}=13$$ | $$10$$ |  $$10 \times 13 = 130$$  |  $$10 \times 13^2 = 1690$$  |
| $$16\text{-}20$$ | $$\frac{16+20}{2}=18$$ | $$18$$ |  $$18 \times 18 = 324$$  |  $$18 \times 18^2 = 5832$$  |
| $$21\text{-}25$$ | $$\frac{21+25}{2}=23$$ | $$45$$ | $$45 \times 23 = 1035$$  | $$45 \times 23^2 = 23805$$  |
| $$26\text{-}30$$ | $$\frac{26+30}{2}=28$$ | $$5$$  |  $$5 \times 28 = 140$$   |  $$5 \times 28^2 = 3920$$   |
|        **Total**         |                                          | **$$\sum f = 80$$** | **$$\sum fx_i = 1645$$** | **$$\sum fx_i^2 = 35375$$** |
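
For readers who want to double-check the helper table, here is a short sketch that rebuilds its columns and totals from the class limits and frequencies above. The variable names are illustrative assumptions.

```python
# (lower limit, upper limit, frequency) for each class in the table above.
classes = [(6, 10, 2), (11, 15, 10), (16, 20, 18), (21, 25, 45), (26, 30, 5)]

rows = []
for lower, upper, f in classes:
    x = (lower + upper) / 2  # class midpoint x_i
    rows.append((x, f, f * x, f * x**2))  # x_i, f, f*x_i, f*x_i^2

# Column totals, matching the "Total" row of the helper table.
total_f = sum(row[1] for row in rows)
total_fx = sum(row[2] for row in rows)
total_fx2 = sum(row[3] for row in rows)

print(total_f, total_fx, total_fx2)  # 80 1645.0 35375.0
```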

### Calculate Variance

Plug the total values from the table into the variance formula:

```math
\sigma^2 = \frac{\sum (f \cdot x_i^2)}{\sum f} - \left( \frac{\sum (f \cdot x_i)}{\sum f} \right)^2
```

```math
\sigma^2 = \frac{35375}{80} - \left( \frac{1645}{80} \right)^2
```

```math
\sigma^2 = 442.1875 - (20.5625)^2
```

```math
\sigma^2 = 442.1875 - 422.81640625
```

```math
\sigma^2 \approx 19.37
```

So, the variance of the battery duration data is approximately $$19.37$$ (in units of hours squared).
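
As a sanity check, the same variance can be computed from the definitional form, $$\frac{\sum f (x_i - \bar{x})^2}{\sum f}$$, and compared with the computational formula. The sketch below uses Python's `fractions.Fraction` so the comparison is exact; the variable names and the symbol $$\bar{x}$$ for the grouped mean are assumptions introduced here.

```python
from fractions import Fraction

# Midpoints and frequencies from the helper table.
midpoints = [8, 13, 18, 23, 28]
frequencies = [2, 10, 18, 45, 5]

n = sum(frequencies)
mean = Fraction(sum(f * x for x, f in zip(midpoints, frequencies)), n)

# Computational formula: mean of squares minus square of the mean.
computational = Fraction(sum(f * x * x for x, f in zip(midpoints, frequencies)), n) - mean**2

# Definitional formula: frequency-weighted mean of squared deviations.
definitional = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies)) / n

print(computational == definitional)  # True
print(float(computational))  # 19.37109375
```

Both forms give the same exact value, which rounds to the $$19.37$$ found above.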

### Calculate Standard Deviation

Take the square root of the variance:

```math
\sigma = \sqrt{19.37} \approx 4.4
```

The standard deviation of the phone battery duration is approximately $$4.4 \text{ hours}$$. In other words, battery durations typically deviate from the mean (which is $$\frac{1645}{80} \approx 20.56 \text{ hours}$$) by about $$4.4 \text{ hours}$$.

In general, the smaller the standard deviation, the more tightly the battery durations cluster around the mean.
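
One way to cross-check the final result is to apply the midpoint assumption literally: repeat each midpoint as many times as its frequency, then pass the expanded list to `statistics.pstdev` from Python's standard library. This is only a sketch; `pstdev` is the population standard deviation, which divides by the number of data points just as the formula in this lesson divides by $$\sum f$$.

```python
import statistics

# Midpoints and frequencies from the helper table.
midpoints = [8, 13, 18, 23, 28]
frequencies = [2, 10, 18, 45, 5]

# Treat every observation in a class as if it equaled the class midpoint.
expanded = [x for x, f in zip(midpoints, frequencies) for _ in range(f)]

sigma = statistics.pstdev(expanded)  # population standard deviation
print(len(expanded))  # 80
print(round(sigma, 2))  # 4.4
```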