Weighted average
A weighted average is an average in which each value is scaled by a weight that reflects its importance. It equals the sum of each value multiplied by its weight, divided by the sum of the weights.
Simple case
This collection of four integers is called a multiset because the integer [math]\displaystyle{ 3 }[/math] appears more than once:
- [math]\displaystyle{ \{1,3,3,5\} }[/math]
The straight average (or arithmetic mean) of these four variables is the sum divided by [math]\displaystyle{ 4 }[/math]:
- [math]\displaystyle{ \text{arithmetic mean}=\frac{1+3+3+5}{4}=3=\overline x }[/math]
In the above equation, we used the overline to express the fact that this is the average over four variables. These variables can also be expressed using subscripts:
- [math]\displaystyle{ x_j=\{x_1,x_2,x_3,x_4\}=\{1,3,3,5\} }[/math]
The same answer results from a weighted average over the three distinct values, provided the integer [math]\displaystyle{ 3 }[/math] is given a weight of [math]\displaystyle{ 2 }[/math]:
- [math]\displaystyle{ \overline x=\frac{(\widehat{\,1}\times1)+(\widehat{\,2}\times3)+(\widehat{\,1}\times5)}{4}=3 }[/math]
Here the hat [math]\displaystyle{ \widehat\text{...} }[/math] is used to mark weight values, which represent how many times each of the three values [math]\displaystyle{ (1,3,5) }[/math] appears. While it is not common to put hats over numbers in calculations, this [math]\displaystyle{ (\widehat{\,1},\widehat{\,2},\widehat{\,1}) }[/math] representation allows us to see how the weights influence the calculation. Because the denominator [math]\displaystyle{ 4 }[/math] is the sum of the weights, we can also write the weighted average as:
- [math]\displaystyle{ \overline x=\frac{(\widehat{\,1}\times1)+(\widehat{\,2}\times3)+(\widehat{\,1}\times5)}{\widehat{\,1}+\widehat{\,2}+\widehat{\,1}} }[/math]
In conclusion, we have shown how an arithmetic mean over four variables (1, 3, 3, 5) can be understood as a weighted average over only three variables (1, 3, 5). This is accomplished by multiplying the repeated variable [math]\displaystyle{ 3 }[/math] by the number of times it occurs (two) and dividing by the sum of the weights.
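The equivalence above can be checked numerically. The following is a minimal Python sketch, not part of the original derivation; the variable names (`values`, `counts`, `weighted_mean`) are chosen here for illustration, and the weights are taken to be the occurrence counts of each distinct value.

```python
from collections import Counter

values = [1, 3, 3, 5]

# Arithmetic mean over all four values.
arithmetic_mean = sum(values) / len(values)

# Count how often each distinct value occurs; the counts act as weights.
counts = Counter(values)  # 1 occurs once, 3 occurs twice, 5 occurs once

# Weighted average over the three distinct values (1, 3, 5).
weighted_mean = sum(w * x for x, w in counts.items()) / sum(counts.values())

print(arithmetic_mean, weighted_mean)  # 3.0 3.0
```

Both calculations give 3, because the sum of the counts is the number of values in the original multiset.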
Summation notation
To understand how to express this using summation notation, we place a tilde [math]\displaystyle{ \widetilde\text{...} }[/math] over variables that involve the weighted average. First we define:
- [math]\displaystyle{ \begin{array}{ll} x_j = \{1,3,3,5\}&\!\!\text{with } N= 4\text{ terms}\\ \widetilde{x_k} = \{1,3,5\}&\!\!\text{with } \widetilde{N} = 3\text{ terms}\\ w_k = \{1,2,1\}&\!\!\text{as } 3\text{ weights} \end{array} }[/math]
Note that the sum of the three weights equals [math]\displaystyle{ 4 }[/math]:
- [math]\displaystyle{ \sum_{k=1}^{\widetilde N}w_k=\sum_{k=1}^{3}w_k=1+2+1=N=4 }[/math]
The ordinary average (with four terms) equals the weighted average (with three terms):
- [math]\displaystyle{ \overline x=\frac 1N\sum_{j=1}^{N}x_j=\frac 1N\sum_{k=1}^{\widetilde N}w_k\cdot\widetilde{x_k} }[/math]
Simplify this by replacing [math]\displaystyle{ N }[/math] with the sum over all the weights:
- [math]\displaystyle{ \overline x=\frac{\sum_k w_k\widetilde{x_k}}{\sum_k w_k} }[/math]
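The general formula (sum of weights times values, divided by the sum of the weights) can be sketched as a small Python function. The function name `weighted_average` and its error handling are assumptions made for this illustration; the inputs are the three distinct values and three weights defined earlier in this section.

```python
def weighted_average(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


x_tilde = [1, 3, 5]  # the three distinct values
w = [1, 2, 1]        # how many times each value occurs

print(sum(w))                        # 4
print(weighted_average(x_tilde, w))  # 3.0
```

Dividing by the sum of the weights, rather than by the number of terms, is what lets the same function handle weights that are not simple counts.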
Application to exam scores
Given two school classes, one with 20 students, and one with 30 students, the grades in each class on a test were:
- Morning class = 62, 67, 71, 74, 76, 77, 78, 79, 79, 80, 80, 81, 81, 82, 83, 84, 86, 89, 93, 98
- Afternoon class = 81, 82, 83, 84, 85, 86, 87, 87, 88, 88, 89, 89, 89, 90, 90, 90, 90, 91, 91, 91, 92, 92, 93, 93, 94, 95, 96, 97, 98, 99
The straight average for the morning class is 80 and the straight average of the afternoon class is 90. The straight average of 80 and 90 is 85, the mean of the two class means. But this does not account for the difference in number of students in each class, so the value of 85 does not reflect the average student grade (independent of class). The average student grade can be calculated by averaging all the grades, without regard to classes (add all the grades up and divide by the total number of students):
- [math]\displaystyle{ \bar x=\frac{4300}{50}=86. }[/math]
Alternatively, the same result is obtained by weighting each class mean by the number of students in that class (a weighted mean of the class means):
- [math]\displaystyle{ \bar x=\frac{20\times80+30\times90}{20+30}=86. }[/math]
Thus, the weighted mean makes it possible to find the average student grade in the case where only the class means and the number of students in each class are available.
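The exam-score calculation can be reproduced with a short Python sketch. The grade lists are copied from the data above; the variable names are chosen here for illustration only.

```python
morning = [62, 67, 71, 74, 76, 77, 78, 79, 79, 80,
           80, 81, 81, 82, 83, 84, 86, 89, 93, 98]
afternoon = [81, 82, 83, 84, 85, 86, 87, 87, 88, 88,
             89, 89, 89, 90, 90, 90, 90, 91, 91, 91,
             92, 92, 93, 93, 94, 95, 96, 97, 98, 99]

# Straight average of each class.
mean_morning = sum(morning) / len(morning)        # 80.0
mean_afternoon = sum(afternoon) / len(afternoon)  # 90.0

# Mean of the two class means, ignoring class sizes.
mean_of_means = (mean_morning + mean_afternoon) / 2  # 85.0

# Average over all 50 students, ignoring class membership.
all_grades = morning + afternoon
pooled_mean = sum(all_grades) / len(all_grades)  # 86.0

# Weighted mean of the class means, weighted by class size.
weighted_mean = (
    len(morning) * mean_morning + len(afternoon) * mean_afternoon
) / (len(morning) + len(afternoon))  # 86.0

print(mean_of_means, pooled_mean, weighted_mean)  # 85.0 86.0 86.0
```

The weighted mean of the class means matches the pooled average, while the unweighted mean of the class means does not.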