Confidence intervals are ranges of values, computed from sample data, that estimate a population parameter.
For example, it is usually impractical to collect every data value needed to calculate the population mean \(\mu\) or standard deviation \(\sigma\) exactly. Instead, we can estimate these parameters using confidence intervals.
Confidence interval for a population mean with normal distribution:
\(\sigma\) is known: \[\mu=\bar{x} \pm (z_{\alpha/2})\left( \frac{\sigma}{\sqrt{n}}\right)\]
\(\sigma\) is unknown (large sample, with \(s\) used in place of \(\sigma\)): \[\mu\approx \bar{x} \pm (z_{\alpha/2})\left( \frac{s}{\sqrt{n}}\right)\]
where \(z_{\alpha/2}\) is the z-value corresponding to an area \(\alpha/2\) in the upper tail of a standard normal distribution.
Conditions Required:
The confidence level is the probability that the interval estimator, applied to a random sample, encloses the population parameter. For example, a 95% confidence level means that if we repeatedly drew samples and built an interval from each one, about 95% of those intervals would contain the population mean.
Common Misinterpretations:
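It does not mean that the population mean has a 95% probability of falling inside one particular computed interval. The population mean is a fixed value; once an interval has been computed, it either contains the mean or it does not.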
It also does not mean that the confidence interval contains 95% of the data values.
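The repeated-sampling meaning of the confidence level can be checked with a small simulation. The sketch below is illustrative only: the population values `mu=50`, `sigma=10`, the sample size, the number of trials, and the function name `coverage` are all assumptions chosen for the demonstration, and it uses the known-\(\sigma\) z-interval.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu=50.0, sigma=10.0, n=30, trials=2000, confidence=0.95, seed=0):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample from the (hypothetical) normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # Does this particular interval enclose the fixed population mean?
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

print(f"Share of intervals containing mu: {coverage():.3f}")
```

The printed share should land close to 0.95. Each individual interval either contains \(\mu\) or misses it; the 95% describes how often the procedure succeeds over many samples.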
Commonly used confidence levels.
Confidence Level \(100(1-\alpha)\%\) | \(\alpha\) | \(\alpha/2\) | \(z_{\alpha/2}\) |
90% | 0.10 | 0.05 | 1.645 |
95% | 0.05 | 0.025 | 1.960 |
99% | 0.01 | 0.005 | 2.575 |
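The z-values in this table can also be computed instead of looked up. A minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is an assumption made for this example.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2}: the value with area alpha/2 in the upper tail."""
    alpha = 1 - confidence
    # inv_cdf takes the area to the LEFT of z, which is 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

For the 99% level the computed value rounds to 2.576; the 2.575 in the table is a common textbook rounding, and the difference is negligible for most purposes.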
Example: A random sample of 100 observations from a normally distributed population has a sample mean of 83.2 and a sample standard deviation of 6.4. Find 95% and 99% confidence intervals for \(\mu\).
Solution:
Since the sample is large (n=100) and the population is normally distributed, we can calculate the confidence interval for the population mean using the z-table. The population standard deviation is unknown, so we use the sample standard deviation (s=6.4) in its place.
95% confidence interval. \(z_{0.05/2}=1.960\)
\begin{align} \mu &\approx \bar{x} \pm (z_{0.05/2})\left( \frac{s}{\sqrt{n}}\right) \\ &\approx 83.2 \pm 1.960\left( \frac{6.4}{\sqrt{100}}\right) \\ &\approx 83.2 \pm 1.2544 \\ &\approx (81.9456,84.4544)\end{align}
We can be 95% confident that the interval (81.9456, 84.4544) contains \(\mu\).
99% confidence interval. \(z_{0.01/2}=2.575\)
\begin{align} \mu &\approx \bar{x} \pm (z_{0.01/2})\left( \frac{s}{\sqrt{n}}\right) \\ &\approx 83.2 \pm 2.575\left( \frac{6.4}{\sqrt{100}}\right) \\ &\approx 83.2 \pm 1.648 \\ &\approx (81.552,84.848)\end{align}
We can be 99% confident that the interval (81.552, 84.848) contains \(\mu\).
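The two calculations above can be reproduced in a few lines. This is a sketch rather than a definitive implementation; the function name `z_interval` and its parameters are assumptions introduced for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(mean, sd, n, confidence):
    """Return (lower, upper) for mean ± z_{alpha/2} * sd / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    margin = z * sd / sqrt(n)                # margin of error
    return mean - margin, mean + margin

# Values from the example: n = 100, sample mean 83.2, s = 6.4.
for level in (0.95, 0.99):
    low, high = z_interval(83.2, 6.4, 100, level)
    print(f"{level:.0%} interval: ({low:.4f}, {high:.4f})")
```

Because the code uses an unrounded z-value, the 99% interval differs from the hand calculation in the third decimal place (the hand calculation uses the table value 2.575).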
For small samples (commonly taken as \(n<30\)) where \(\sigma\) is unknown, the z-statistic is no longer accurate, because the sample standard deviation \(s\) is a less reliable estimate of \(\sigma\). In that case we use the t-statistic, whose distribution has heavier tails than the standard normal distribution and approaches it as \(n\) grows.
Confidence Interval for small samples:
If \(\sigma\) is known, you can still use the z-statistic.
If \(\sigma\) is unknown: \[\mu \approx \bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)\]
where \(t_{\alpha/2}\) is the t-value corresponding to an area \(\alpha/2\) in the upper tail of Student's t-distribution with \((n-1)\) degrees of freedom.
Conditions to use t-statistic in confidence interval:
Example: Suppose you have selected a random sample of \(n=13\) measurements from a distribution that is approximately normal. The sample gave \(\bar{x}=53.4\) g and \(s=8.6\) g. Find the 98% confidence interval for \(\mu\).
Solution:
Since the sample is small (\(n=13\)) and \(\sigma\) is not given, we have to use the t-statistic.
98% confidence interval, degrees of freedom = 12, \(\alpha/2=0.02/2=0.01\), \(t_{0.01} = 2.681\) from the t-table
\begin{align} \mu &\approx \bar{x} \pm t_{0.02/2}\left(\frac{s}{\sqrt{n}}\right) \\ &\approx 53.4 \pm 2.681\left(\frac{8.6}{\sqrt{13}}\right) \\ &\approx 53.4 \pm 6.39475 \\ &\approx (47.00525,59.79475) \end{align}
We can be 98% confident that the interval (47.00525 g, 59.79475 g) contains \(\mu\).
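The same arithmetic can be sketched in code. Python's standard library has no inverse t-distribution, so this example takes the critical value from the t-table as an argument; the function name `t_interval` and the parameter `t_crit` are assumptions made for illustration. If SciPy is available, `scipy.stats.t.ppf(1 - alpha/2, df)` is one way to obtain the critical value instead of a table.

```python
from math import sqrt

def t_interval(mean, s, n, t_crit):
    """Return (lower, upper) for mean ± t_crit * s / sqrt(n).

    t_crit is t_{alpha/2} with n - 1 degrees of freedom, supplied by the
    caller (for example, read from a t-table).
    """
    margin = t_crit * s / sqrt(n)
    return mean - margin, mean + margin

# Values from the example: n = 13, mean 53.4 g, s = 8.6 g, t_{0.01} = 2.681.
low, high = t_interval(53.4, 8.6, 13, t_crit=2.681)
print(f"98% interval: ({low:.5f} g, {high:.5f} g)")
```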
Sometimes your data is expressed as a proportion or fraction of successes in the sample, \(\hat{p}\), which estimates the population proportion \(p\).
Sampling Distribution of \(\hat{p}\)
Large-Sample Confidence Interval for p:
\[p \approx \hat{p} \pm z_{\alpha/2}\sigma_{\hat{p}} = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{pq}{n}} \approx \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}\]
where \(\hat{p}=\frac{x}{n}\) and \(\hat{q}=1-\hat{p}\)
Conditions Required for a Valid Large-Sample Confidence Interval for p:
Example: A random sample of size \(n=196\) yielded \(\hat{p}=0.64\). Construct a 95% confidence interval for p.
Solution:
\(\hat{p}=0.64, \hat{q}=1-0.64=0.36, z_{0.05/2}=1.96\)
\begin{align} p &\approx \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}\\ &\approx 0.64 \pm 1.96 \sqrt{\frac{(0.64)(0.36)}{196}} \\ &\approx 0.64 \pm 0.0672 \\ &\approx (0.5728,0.7072) \end{align}
We can be 95% confident that the interval (0.5728, 0.7072) contains p.
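The proportion interval follows the same pattern and can be sketched as below. The function name `proportion_interval` and its parameters are assumptions introduced for this example.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence):
    """Return (lower, upper) for p_hat ± z_{alpha/2} * sqrt(p_hat*q_hat/n)."""
    q_hat = 1 - p_hat
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    margin = z * sqrt(p_hat * q_hat / n)      # z times the estimated std. error
    return p_hat - margin, p_hat + margin

# Values from the example: n = 196, p_hat = 0.64, 95% confidence.
low, high = proportion_interval(0.64, 196, 0.95)
print(f"95% interval: ({low:.4f}, {high:.4f})")
```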
Statistics by Matthew Cheung. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.