Measures of central tendency provide only a partial description of a data set. For example, you may have scored higher than the average on a test, but how well did you do in comparison to the others? If the scores are closely clustered, then you probably did quite well, but if the scores are spread out, there are probably several scores much higher than the average. Thus, we need to complete the description with a measure of variability, or spread.
The range of a quantitative data set is equal to the largest measurement minus the smallest measurement.
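The definition of the range can be sketched in a few lines of Python. The function name `data_range` and the test scores are hypothetical, chosen only for illustration.

```python
def data_range(values):
    """Return the range: largest measurement minus smallest measurement."""
    if not values:
        raise ValueError("the range is undefined for an empty data set")
    return max(values) - min(values)


# Hypothetical test scores, used only to illustrate the calculation.
scores = [62, 75, 78, 81, 94]
print(data_range(scores))  # 94 - 62 = 32
```

Because the range uses only the two most extreme measurements, it says nothing about how the remaining values are distributed between them.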
Another approach is to measure how far, on average, the measurements deviate from the mean. This is done by calculating the variance and the standard deviation.
Remember that it is usually not possible to collect data for everyone in the population, so we use sample measurements to make inferences about the population measurements. The measures of variability are denoted by the following symbols:
Sample | Population |
Sample Variance \(s^2\) | Population Variance \(\sigma^2\) |
Sample Standard Deviation \(s\) | Population Standard Deviation \(\sigma\) |
The sample variance for a sample of \(n\) measurements is equal to the sum of the squared deviations from the mean, divided by the degrees of freedom \((n-1)\). \[s^2=\frac{\sum\limits_{i=1}^{n} \left(x_i - \bar{x}\right)^2}{n-1}\] The formula can also be written in an equivalent form that avoids computing each deviation separately: \[s^2 = \frac{ \sum\limits_{i=1}^n x_{i}^{2} - \frac{ \left(\sum\limits_{i=1}^{n} x_i \right)^2 }{n}}{n-1 }\]
The sample standard deviation, \(s\), is the positive square root of the sample variance, \(s^2\): \[s=\sqrt{s^2}\]
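The two variance formulas and the standard deviation can be checked against each other with a short Python sketch. The function names and the data set are hypothetical. The final comparison uses `statistics.variance` from the Python standard library, which also divides by \(n-1\).

```python
import math
import statistics


def sample_variance(values):
    """Definitional formula: sum of squared deviations from the mean over (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("the sample variance needs at least two measurements")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """Equivalent form: (sum of x^2 - (sum of x)^2 / n) over (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("the sample variance needs at least two measurements")
    sum_x = sum(values)
    sum_x_squared = sum(x * x for x in values)
    return (sum_x_squared - sum_x ** 2 / n) / (n - 1)


def sample_std(values):
    """Sample standard deviation: positive square root of the sample variance."""
    return math.sqrt(sample_variance(values))


# Hypothetical measurements, used only to illustrate the calculation.
data = [2, 4, 4, 4, 5, 5, 7, 9]

s2 = sample_variance(data)
print(s2, sample_variance_shortcut(data), sample_std(data))

# Both formulas should agree with each other and with the standard library.
assert math.isclose(s2, sample_variance_shortcut(data))
assert math.isclose(s2, statistics.variance(data))
```

The second form is convenient for hand calculation, but in floating-point arithmetic it can lose precision when the measurements are large relative to their spread, so the definitional form is usually the safer choice in code.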
Example:
For the data set \(1, 2, 3, 4, 5\), find the variance. First, the sample mean is \(\bar{x} = 3\)