Friday, March 19, 2010

Standard deviation is not the same as range of variation

Research results are often expressed as means and standard deviations. For example, the study reviewed in this post states that the weight of participants in a 12-week weight loss study changed from 87.9 plus or minus 15.4 kg (at baseline, or before the 12-week intervention) to 81.7 plus or minus 16.2 kg (after the 12-week intervention).

The 87.9 and 81.7 are the average weights (a.k.a. “mean” weights), in kilograms, before and after the 12-week intervention. However, the 15.4 and 16.2 are NOT the ranges of variation in weights around the means before and after the 12-week intervention. They are standard deviations: the distances on either side of the means that together encompass approximately 68 percent of all of the values measured (see figure below, from www.electrical-res.com).

In the figure above, the means minus and plus the standard deviations would be the “mean(x) – s” and “mean(x) + s” points on the horizontal axis of histograms of weights plotted before and after the 12-week intervention. At baseline those points are 87.9 – 15.4 = 72.5 kg and 87.9 + 15.4 = 103.3 kg; after the intervention they are 81.7 – 16.2 = 65.5 kg and 81.7 + 16.2 = 97.9 kg. This assumes that the distributions of weights are normal, or quasi-normal (i.e., similar to a bell-shaped, or normal, curve), which is a common assumption in this type of research.
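The arithmetic behind those points, and the 68 percent figure, can be checked with a few lines of Python. This is only a sketch: it takes the means and standard deviations reported in the study and assumes a perfectly normal distribution of weights, which real data will only approximate. The function and variable names are made up for illustration.

```python
from statistics import NormalDist


def one_sd_interval(mean, sd):
    """Return the (mean - s, mean + s) points, rounded to 0.1 kg."""
    return (round(mean - sd, 1), round(mean + sd, 1))


# Means and standard deviations as reported in the study (kg).
baseline_interval = one_sd_interval(87.9, 15.4)
final_interval = one_sd_interval(81.7, 16.2)

# Assumption: baseline weights follow an exact normal curve.
baseline = NormalDist(mu=87.9, sigma=15.4)

# Fraction of values expected between mean - s and mean + s.
coverage = baseline.cdf(87.9 + 15.4) - baseline.cdf(87.9 - 15.4)

print("Baseline interval (kg):", baseline_interval)
print("Final interval (kg):", final_interval)
print("Share of values within one standard deviation:", round(coverage, 3))
```

The share comes out the same for any mean and standard deviation, as long as the distribution is normal.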

The larger the standard deviation, the wider the variation in the measures, and the flatter the associated histogram (the bell-shaped curve). This property has a number of interesting implications, some of which will be discussed in other posts.
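One way to see the "flatter" part is to compare the heights of two normal curves at their peaks. The sketch below assumes exact normal curves with the means and standard deviations reported in the study; the variable names are illustrative only.

```python
from statistics import NormalDist

# Assumption: both weight distributions are exactly normal.
baseline = NormalDist(mu=87.9, sigma=15.4)
final = NormalDist(mu=81.7, sigma=16.2)

# A normal curve is tallest at its mean, so this is the peak height.
peak_baseline = baseline.pdf(baseline.mean)
peak_final = final.pdf(final.mean)

print("Peak height, s = 15.4:", peak_baseline)
print("Peak height, s = 16.2:", peak_final)
print("Larger s gives the lower peak:", peak_final < peak_baseline)
```

The area under each curve is always 1, so a curve that spreads wider has to be lower at its peak.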

Sometimes another measure of dispersion, the variance, is reported instead of the standard deviation. The variance is the standard deviation squared.
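The relationship between the two measures is easy to verify. In the sketch below, the five weights are hypothetical values chosen for illustration, not data from the study; only the 15.4 and 16.2 come from the study. Note that the variance is expressed in squared units (here, kg squared), which is one reason the standard deviation is often easier to interpret.

```python
from statistics import stdev, variance

# Hypothetical sample of weights in kg (not from the study).
sample_kg = [70.0, 80.0, 90.0, 100.0, 110.0]

s = stdev(sample_kg)       # sample standard deviation, in kg
var = variance(sample_kg)  # sample variance, in kg squared

print("Standard deviation:", s)
print("Variance:", var)
print("Standard deviation squared:", s ** 2)

# Variances implied by the standard deviations reported in the study.
reported_variances = {"baseline": 15.4 ** 2, "after": 16.2 ** 2}
print("Implied variances (kg squared):", reported_variances)
```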

The reason why standard deviations are reported instead of ranges of variation is that outliers (unusually high or low values) can dramatically widen the ranges, because a range depends only on the single lowest and the single highest values. The standard deviation is much less sensitive to outliers, although it is not immune to them.
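A small simulation can illustrate the difference in sensitivity. Everything in it is an assumption made for the sake of the example: 200 hypothetical participants whose weights are spread evenly along a normal curve with the baseline mean and standard deviation, plus one invented outlier weighing 200 kg.

```python
from statistics import NormalDist, stdev

n = 200  # hypothetical number of participants
model = NormalDist(mu=87.9, sigma=15.4)

# Deterministic, evenly spaced quantiles of the normal curve,
# standing in for measured weights (kg).
weights = [model.inv_cdf((i + 0.5) / n) for i in range(n)]

# The same sample plus one invented outlier.
with_outlier = weights + [200.0]


def value_range(values):
    """Range of variation: highest value minus lowest value."""
    return max(values) - min(values)


# How many times larger each measure becomes once the outlier is added.
range_growth = value_range(with_outlier) / value_range(weights)
sd_growth = stdev(with_outlier) / stdev(weights)

print("Range grows by a factor of", round(range_growth, 2))
print("Standard deviation grows by a factor of", round(sd_growth, 2))
```

With a sample this size, a single extreme value stretches the range far more than it stretches the standard deviation. With very small samples the gap narrows, since one value then carries more weight in the standard deviation.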