Distribution of Measured Values


…from CheLabWiki, an online resource for chemical-engineering laboratories located at www.chelabwiki.org; Site Revision #320; 6 January 2009.


Figure 1. Standard deviation s measures the width of a Gaussian distribution.

To explore the range over which a measured value is distributed, we repeat measurements. When fluctuations in measurements are dominated by random events that are mutually independent, repeated values measured for y describe a Gaussian distribution about their mean value ym:


(1)
f(y) \ = \ \frac {1}{s\sqrt{2\pi}} \exp \left [ \frac {-(y - y_m)^2}{2s^2} \right]


This is sometimes called the normal distribution, because it is the one that normally occurs. In (1) the quantity s is called the standard deviation; it measures the width of the distribution f(y), as in Figure 1. For N repeated measurements, yi (i = 1, 2, ... , N), it is computed by


(2)
s \ = \ \frac {1}{\sqrt{N-1}} \sqrt{ \sum_{i=1}^{N} (y_i - y_m)^2 }
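
As an illustration, equation (2) can be evaluated directly from a list of repeated measurements. The sketch below is a minimal example, not part of the original text: the function name sample_std and the readings are hypothetical, and the result is cross-checked against statistics.stdev from the Python standard library, which uses the same N - 1 divisor.

```python
import math
import statistics


def sample_std(values):
    """Standard deviation s of repeated measurements, following equation (2)."""
    n = len(values)
    if n < 2:
        # With N = 1 the formula reduces to 0/0, so s is undefined.
        raise ValueError("need at least two measurements")
    y_m = sum(values) / n  # mean value ym
    sum_sq = sum((y - y_m) ** 2 for y in values)  # sum of squared deviations
    return math.sqrt(sum_sq) / math.sqrt(n - 1)


# Hypothetical repeated readings of one quantity (illustrative values only).
readings = [9.8, 10.1, 10.0, 9.9, 10.2]

s = sample_std(readings)
s_library = statistics.stdev(readings)  # standard-library cross-check
print(f"s = {s:.4f}, statistics.stdev = {s_library:.4f}")
```

For these readings the mean is 10.0 and the squared deviations sum to 0.10, so s is the square root of 0.10/4, roughly 0.158.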


Physically, the sum of the squares of the deviations in (2) is proportional to the "noise" in the N measurements, while N is proportional to the "strength" of the signal represented by the average ym; therefore, the standard deviation in (2) is inversely proportional to the signal-to-noise ratio:[1]


(3)
s \ \propto \ \frac {\mbox{noise}}{\mbox{signal strength}}


Figure 2. When repeated measurements are distributed about their mean according to a Gaussian, then 68.3% of the measured values lie within one standard deviation of their mean, while 95.5% lie within two standard deviations.

Consequently, large values of s imply a relatively large amount of noise: measured values are widely scattered about their mean. We attempt to increase the signal-to-noise ratio by increasing N. Note that if we make only one measurement (N = 1), then (2) gives s = 0/0; that is, a single measurement gives us no information about the width of the distribution.
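
The N = 1 case can be checked numerically. The sketch below is an illustration under stated assumptions: the helper sum_sq_and_dof and the single reading are hypothetical. It shows that both the sum of squared deviations and N - 1 are zero for a single measurement, and that statistics.stdev from the Python standard library declines to return a value in that case.

```python
import statistics


def sum_sq_and_dof(values):
    """Return the sum of squared deviations and N - 1 from equation (2)."""
    n = len(values)
    y_m = sum(values) / n  # mean value ym
    sum_sq = sum((y - y_m) ** 2 for y in values)
    return sum_sq, n - 1


# A single hypothetical measurement (N = 1).
single = [4.2]

sum_sq, dof = sum_sq_and_dof(single)
print(f"sum of squares = {sum_sq}, N - 1 = {dof}")  # both are zero

# The standard library treats the one-point case as an error.
try:
    statistics.stdev(single)
    stdev_defined = True
except statistics.StatisticsError:
    stdev_defined = False
print(f"stdev defined for N = 1: {stdev_defined}")
```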

If we integrate portions of the Gaussian, we obtain the fraction of values that lie within a specified distance from the mean. In particular, we find that, as in Figure 2,

68.3% of the measured values lie within one standard deviation (±1s) of the mean ym,
95.5% lie within two standard deviations (±2s) of ym, and
99.7% lie within three standard deviations (±3s) of ym.
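
These percentages can be reproduced by integrating equation (1) from ym - ks to ym + ks. For a Gaussian that integral has the closed form erf(k/√2), where erf is the error function. The sketch below is a minimal check using math.erf from the Python standard library; the function name fraction_within is hypothetical.

```python
import math


def fraction_within(k):
    """Fraction of a Gaussian lying within +/- k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


# Evaluate the fraction for one, two, and three standard deviations.
fractions = {k: fraction_within(k) for k in (1, 2, 3)}

for k, frac in fractions.items():
    print(f"within ±{k}s: {100 * frac:.2f}%")
```

To two decimals the results are 68.27%, 95.45%, and 99.73%, consistent with the rounded values listed above.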

Note in Figure 2 that the maximum in the Gaussian distribution occurs at the mean value; that is, the mean is the most probable value—the value most likely to be measured.


Reference

  1. J. M. Haile, Analysis of Data, Macatea Productions, Central, SC, 2003. ISBN 0-9728602-0-7.