Grouped data


Grouped data is a term used in statistics and data analysis. Raw data can be organized by grouping similar measurements together and counting how many fall into each group. The resulting frequency table is also called grouped data.[1]

Example

Suppose a group of 20 students was given a simple math question, and the time each student took to answer it was recorded. The times are listed below:

20 25 24 33 13
26 8 19 31 11
16 21 17 11 34
14 15 21 18 17
Table 1: Time taken (in seconds) to answer a simple math question

The shortest time was 8 seconds, and the longest was 34 seconds. One way to analyze these times is to group nearby values together. To keep the analysis fair, each group covers the same number of seconds. We can then count how many students fall into each group. For example, if we organize the times into 5-second ranges:

Time taken          Frequency
5 to 9 seconds      1 student
10 to 14 seconds    4 students
15 to 19 seconds    6 students
20 to 24 seconds    4 students
25 to 29 seconds    2 students
30 to 34 seconds    3 students
Table 2: Frequency distribution of the time taken (in seconds) to answer a simple math question
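
The counting behind Table 2 can be sketched in a few lines of Python. This is only an illustration: the names `times`, `WIDTH` and `class_start` are made up for this example and do not come from the source. It assumes the times are whole seconds and that each class starts at a multiple of 5.

```python
from collections import Counter

# Times from Table 1, in seconds.
times = [20, 25, 24, 33, 13,
         26, 8, 19, 31, 11,
         16, 21, 17, 11, 34,
         14, 15, 21, 18, 17]

WIDTH = 5  # assumed class width in seconds


def class_start(t, width=WIDTH):
    """Return the lower limit of the class that contains time t."""
    return (t // width) * width


# Count how many students fall into each class.
counts = Counter(class_start(t) for t in times)

for start in sorted(counts):
    print(f"{start} to {start + WIDTH - 1} seconds: {counts[start]}")
```

Running it prints one line per class, with the same frequencies as Table 2.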


Another way to group the data is to sort the students into categories based on their performance. Suppose there are three types of students:

  • Smart (5 to 14 seconds)
  • Normal (15 to 24 seconds)
  • Below average (25 or more seconds)

Then the grouped data looks like this:

Type of student    Frequency
Smart              5
Normal             10
Below average      5
Table 3: Frequency distribution of the three types of students
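
Grouping by category works the same way as grouping by equal ranges, except that the groups no longer need to be the same width. The sketch below is one possible way to reproduce Table 3. The helper `category` is hypothetical, and rejecting times below 5 seconds is an assumption, since the three categories above do not cover them.

```python
from collections import Counter

# Times from Table 1, in seconds.
times = [20, 25, 24, 33, 13,
         26, 8, 19, 31, 11,
         16, 21, 17, 11, 34,
         14, 15, 21, 18, 17]


def category(t):
    """Map a time in seconds to one of the three student types."""
    if t < 5:
        # Assumption: the categories start at 5 seconds.
        raise ValueError("time is below the lowest category")
    if t <= 14:
        return "Smart"
    if t <= 24:
        return "Normal"
    return "Below average"


# Count how many students fall into each category.
category_counts = Counter(category(t) for t in times)

for name in ("Smart", "Normal", "Below average"):
    print(name, category_counts[name])
```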

Mean of grouped data

An estimate, [math]\displaystyle{ \bar{x} }[/math], of the mean can be calculated from grouped data.

[math]\displaystyle{ \bar{x}=\frac{\sum{f*\,x}}{\sum{f}} . }[/math]
Here x is the midpoint of a class interval and f is the frequency of that class. Both sums are taken over all the classes.

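Note that this estimated mean may be different from the sample mean of the ungrouped data, because every value in a class is treated as if it were equal to the class midpoint. The mean of the grouped data in the example above can be calculated as follows. Each class is treated as running from its lower limit up to the lower limit of the next class (for example, from 5 up to, but not including, 10 seconds), so the midpoint of the first class is 7.5: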

Class interval      Frequency (f)   Midpoint (x)   f*x
5 to 9 seconds      1               7.5            7.5
10 to 14 seconds    4               12.5           50
15 to 19 seconds    6               17.5           105
20 to 24 seconds    4               22.5           90
25 to 29 seconds    2               27.5           55
30 to 34 seconds    3               32.5           97.5
TOTAL               20                             405


Therefore, the estimated mean of the grouped data, in seconds, is

[math]\displaystyle{ \bar{x}=\frac{\sum{f*\,x}}{\sum{f}} = \frac{405}{20} = 20.25 }[/math]
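
The same calculation can be checked with a short Python sketch, which also compares the estimate with the mean of the raw times. The names `classes`, `WIDTH` and `grouped_mean` are invented for this example. It assumes each midpoint is the lower limit of the class plus half the class width, which gives the midpoints used in the table above.

```python
# Times from Table 1, in seconds.
times = [20, 25, 24, 33, 13,
         26, 8, 19, 31, 11,
         16, 21, 17, 11, 34,
         14, 15, 21, 18, 17]

# (lower limit, frequency) for each class in Table 2.
classes = [(5, 1), (10, 4), (15, 6), (20, 4), (25, 2), (30, 3)]
WIDTH = 5  # class width in seconds


def grouped_mean(classes, width):
    """Estimate the mean as sum(f * x) / sum(f), with x the class midpoint."""
    total_fx = sum(f * (lower + width / 2) for lower, f in classes)
    total_f = sum(f for _, f in classes)
    return total_fx / total_f


estimate = grouped_mean(classes, WIDTH)
raw_mean = sum(times) / len(times)

print(estimate)  # 20.25, the estimate from the grouped data
print(raw_mean)  # 19.7, the mean of the ungrouped times
```

The two results are close but not equal, which shows why the grouped mean is only an estimate.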

Notes

  1. Newbold et al., 2009, pages 14 to 17

References

  • Newbold, P., W. Carlson and B. Thorne (2009) Statistics for Business and Economics, Seventh edition, Pearson Education. ISBN 9780135072486.
