Thursday, February 12, 2009

Measuring Dispersion in Sample Data

Various tools can be used to measure dispersion in sample data. Because a sample is unlikely to contain the absolute lowest and highest values in the population, it tends to underestimate the actual dispersion.

The formula for sample variance is similar to the formula for population variance and is written as:

s^2 = 1/(n-1) * sum((x_i - mean)^2)

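The only difference is that we divide by n-1 instead of n. Dividing by n would tend to underestimate the population variance, because the deviations are measured from the sample mean rather than from the true population mean.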
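As a rough illustration of the formula, here is a minimal Python sketch. The function names and the example data are made up for this post; the only library call used is statistics.variance from the standard library, which also divides by n-1.

```python
import statistics


def sample_variance(values):
    """Sample variance: squared deviations from the mean, divided by n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def population_variance(values):
    """Same sum of squared deviations, but divided by n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


# Hypothetical sample, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print("divide by n-1:", sample_variance(data))
print("divide by n:  ", population_variance(data))
print("statistics.variance:", statistics.variance(data))
```

For any sample with at least two values that are not all equal, the n-1 version comes out larger than the n version, which is the correction described above.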

Other tools for measuring dispersion in a sample are quartiles and percentiles.

The median can be thought of as the 50th percentile, since 50% of the values fall below it and 50% fall above it. Likewise, for the 75th percentile, 75% of the values fall below it and 25% fall above it, and so on. The spread between percentiles, such as the distance between the 25th and 75th percentiles, can give a good estimate of dispersion in the sample.
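The percentile idea can be sketched in a few lines of Python. There are several conventions for computing percentiles; this sketch assumes linear interpolation between the two closest sorted values, which is only one of them. The percentile function and the example data are hypothetical names used for illustration.

```python
def percentile(values, p):
    """Return the p-th percentile (0-100) using linear interpolation.

    Assumption: the position in the sorted data is (n-1) * p / 100.
    Other conventions give slightly different answers on small samples.
    """
    if not values:
        raise ValueError("percentile needs at least one value")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * p / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


# Hypothetical sample, chosen only for illustration.
sample = [7, 1, 9, 3, 5, 2, 8, 4, 6]

q1 = percentile(sample, 25)
median = percentile(sample, 50)
q3 = percentile(sample, 75)

print("25th percentile:", q1)
print("median:", median)
print("75th percentile:", q3)
print("spread between 25th and 75th:", q3 - q1)
```

The last line prints the distance between the 25th and 75th percentiles, often called the interquartile range, which is one simple way to summarize dispersion from percentiles.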
