Most of the time the population mean differs from the mean of each
individual sample taken from the population.
Consider the following example, in which the absorbance of a solution
containing a known concentration of substance A was determined with a U.V./Visible
spectrometer. The absorbance of the solution was measured three times during
each experiment, and the average value, the standard deviation s and 2s were
calculated (see Table I.1).
Table I.1: Absorbance values measured for a solution A
using a U.V./Visible spectrometer. Three consecutive measurements were recorded
during each experiment.
The mean values from Table I.1 are plotted in Fig. 1 below, with 2s (2 *
standard deviation s) shown for each mean. As you can see, the mean value
calculated for the absorbance of A in each experiment (a sample of the
population) differs from the mean of the population (red line). Note also
that the interval around each mean (the solid line above and below the mean in
Fig. 1, which marks the range of values the mean can take with a certain
probability) does not always contain the population mean
(see experiments 5, 9, 10 and 13).
Fig. 1: Absorbance values obtained by measuring a solution of substance A with known concentration C using a U.V./Visible spectrometer. Each point shown on the graph is the mean of three consecutive measurements of the solution of substance A with concentration C.
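The behaviour described above can be reproduced with a small simulation. The sketch below is only an illustration: the population mean, the noise level, the number of experiments and the random seed are all assumed values, not the data of Table I.1, and the helper names are hypothetical.

```python
import random
import statistics

# Hypothetical values chosen for illustration only; NOT the data of Table I.1.
POPULATION_MEAN = 0.500   # assumed "true" absorbance of the solution
POPULATION_SD = 0.010     # assumed spread of single measurements
N_EXPERIMENTS = 15        # assumed number of experiments
N_REPLICATES = 3          # three consecutive measurements per experiment


def simulate_experiments(seed=1):
    """Return one (mean, s) pair per simulated experiment."""
    rng = random.Random(seed)
    results = []
    for _ in range(N_EXPERIMENTS):
        readings = [rng.gauss(POPULATION_MEAN, POPULATION_SD)
                    for _ in range(N_REPLICATES)]
        results.append((statistics.mean(readings), statistics.stdev(readings)))
    return results


def misses(results):
    """1-based experiment numbers whose mean +/- 2s excludes the population mean."""
    return [i for i, (m, s) in enumerate(results, start=1)
            if not (m - 2 * s <= POPULATION_MEAN <= m + 2 * s)]


results = simulate_experiments()
for i, (m, s) in enumerate(results, start=1):
    print(f"experiment {i:2d}: mean = {m:.4f}, 2s = {2 * s:.4f}")
print("intervals that miss the population mean:", misses(results))
```

Changing the seed gives a different set of means, and possibly a different set of intervals that miss the population mean.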
From the above discussion the following questions arise:
How can we assess how accurately a sample mean estimates the population mean?
Within which boundaries does the true value of the population mean lie?
Such boundaries are called confidence limits, and the range between them is
the confidence interval. A confidence interval gives us the range of values
that the population mean can take with a certain degree of confidence (the
confidence level), usually 90%, 95% or 99%.
Most of the time we look at 95% confidence intervals, but all of them have a
similar interpretation: they are limits constructed such that a certain
percentage of the time (95% in this case) the value of the population mean
will fall within these limits.
How can we calculate confidence intervals?
In order to calculate the confidence interval, we need to
know the limits within which 95% of means will fall. If we assume a normal
distribution with mean = 0 and s = 1, we can use the z-scores with values
between -1.96 and +1.96 (remember that 95% of z-scores fall
between these two values). Remember also that we can convert values to z-scores
using the formula:
z = (x - x̅) / s    (1)
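As a quick numerical check of the statement that 95% of z-scores lie between -1.96 and +1.96, the following sketch uses `NormalDist` from Python's standard `statistics` module. The function name `z_score` is a hypothetical helper that simply restates equation (1).

```python
from statistics import NormalDist


def z_score(x, mean, s):
    """Equation (1): distance of x from the mean, in units of s."""
    return (x - mean) / s


# Standard normal distribution: mean = 0, s = 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Fraction of the distribution lying between z = -1.96 and z = +1.96.
fraction_inside = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)
print(f"fraction of z-scores between -1.96 and +1.96: {fraction_inside:.4f}")
# prints: fraction of z-scores between -1.96 and +1.96: 0.9500
```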
If we know that the upper limit corresponds to z = +1.96, then from (1) we
get:
(x - x̅) / s = 1.96, so x = x̅ + 1.96 * s (this is the upper limit)
and
(x - x̅) / s = -1.96, so x = x̅ - 1.96 * s (this is the lower limit)
Therefore, the confidence interval can easily be calculated once the
mean and the standard deviation s are known. The general form of
the confidence interval is given below:
x = x̅ ± z_critical * s    (2)
where x is the upper or lower limit that the population mean can take at a certain degree of confidence, x̅ is the mean of the measurements, z_critical is the critical value of z from statistical tables (see Table I.2) at the chosen confidence level (usually 95%) and s is the standard deviation of the measurements.
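Equation (2) translates directly into a few lines of code. The sketch below is a minimal illustration: the function name `confidence_limits` and the numbers passed to it are hypothetical, and s is used exactly as it is passed in. If the standard deviation of the mean of n measurements is intended, the caller would pass s divided by the square root of n.

```python
def confidence_limits(mean, s, z_critical=1.96):
    """Equation (2): return (lower, upper) = mean -/+ z_critical * s.

    The default z_critical = 1.96 corresponds to the 95% confidence level.
    """
    half_width = z_critical * s
    return mean - half_width, mean + half_width


# Hypothetical numbers, for illustration only.
lower, upper = confidence_limits(mean=0.500, s=0.010)
print(f"95% limits: {lower:.4f} to {upper:.4f}")
# prints: 95% limits: 0.4804 to 0.5196
```

Passing a different `z_critical` (for example 2.58 for 99%) widens or narrows the interval accordingly.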
Confidence Level (%) | z-critical value
99 | 2.58
95 | 1.96
90 | 1.645
50 | 0.675
Table I.2: Critical values of z at different confidence levels
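The critical values above need not be read from printed tables; they can also be computed from the inverse of the standard normal cumulative distribution. The following sketch uses `NormalDist.inv_cdf` from Python's standard library. The function name `z_critical` is a hypothetical helper, and a two-sided interval is assumed.

```python
from statistics import NormalDist


def z_critical(confidence_percent):
    """Two-sided critical z value for a confidence level given in percent."""
    # Half of the excluded probability lies in each tail of the distribution.
    tail = (1 - confidence_percent / 100) / 2
    return NormalDist().inv_cdf(1 - tail)


for level in (99, 95, 90, 50):
    print(f"{level}% -> {z_critical(level):.3f}")
```

The computed values agree with the table to within rounding in the last digit.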