Chapter 7 : Confidence Intervals 

In this chapter we will learn about confidence intervals for the population mean μ and population proportion p. We will also look at the formula for finding the sample size (as it is used by polling companies).

[7.1] z-based CI for μ

Section [7.1] covers confidence intervals for the population mean μ when σ is known. Chapter 9 in Visual Statistics illustrates these concepts nicely. We will look at a few cases, e.g.,

A 95% confidence interval ("Take New Sample") and,

Replication of 100 CIs where approximately 95% contain the true mean μ. ("Show 4 Displays > Control Panel > Replicate Experiment".)
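
The replication experiment can also be imitated outside Visual Statistics. The following is only a minimal sketch in Python, not the software's own method: the function names z_interval and coverage, the seed, and the population values μ = 50, σ = 10, n = 25 are all assumptions made for illustration. It builds the interval x̄ ± z·σ/√n for each replicated sample and counts how many intervals contain the true mean.

```python
import random
from statistics import NormalDist, mean

def z_interval(sample, sigma, confidence=0.95):
    """z-based CI for the mean when the population sigma is known."""
    n = len(sample)
    # Upper alpha/2 point of the standard normal (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / n ** 0.5
    xbar = mean(sample)
    return xbar - margin, xbar + margin

def coverage(mu, sigma, n, replications=100, confidence=0.95, seed=1):
    """Count how many replicated CIs contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(replications):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lo, hi = z_interval(sample, sigma, confidence)
        if lo <= mu <= hi:
            hits += 1
    return hits

# Hypothetical population: mu = 50, sigma = 10, samples of size 25
print(coverage(mu=50, sigma=10, n=25), "of 100 intervals contain mu")
```

The count varies from run to run (change the seed to see this), but over many replications it should settle near 95%.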

[7.2] t-based CI for μ and Student's t-distribution

When the population variance is not known, we estimate it using the sample variance. The CI is then found using the t-distribution: the standardized sample mean (x̄ - μ)/(s/√n) is no longer normally distributed but follows a t-distribution, which has mean zero and a variance that depends on its "degrees of freedom" (its only parameter, equal to n - 1 here).

Student's t-distribution was developed by William S. Gosset, who published his work under the pen name "Student".

Here is a link to this distribution. Note that the t-distribution's density is lower than the standard normal's near the centre and higher in the tails, but the former (t) approaches the latter (z) as the sample size increases. For example, z_0.05 = 1.645, i.e., there is a probability of .05 above 1.645. From the t-table we see that for df higher than 30, t_0.05 is also close to 1.645 (it is 1.697 at df = 30 and falls toward 1.645 as df grows).

t-table : This table is similar to the z-table and gives the t-value above which there is a probability of α.
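
As a rough illustration of how a t-table value enters the interval x̄ ± t·s/√n, here is a minimal Python sketch. The function name t_interval and the ten data values are hypothetical and are not taken from the course files. The critical value is typed in from the t-table (2.262 for α/2 = 0.025 and df = 9) rather than computed.

```python
from statistics import mean, stdev

def t_interval(sample, t_crit):
    """t-based CI for the mean; t_crit is the t-table value for df = n - 1."""
    n = len(sample)
    xbar = mean(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    margin = t_crit * s / n ** 0.5
    return xbar - margin, xbar + margin

# Hypothetical sample of n = 10 observations (illustration only)
sample = [12.1, 11.4, 13.0, 12.6, 11.9, 12.3, 12.8, 11.7, 12.4, 12.0]

T_025_DF9 = 2.262  # t-table: alpha/2 = 0.025, df = 9
Z_025 = 1.960      # z-table: alpha/2 = 0.025

t_lo, t_hi = t_interval(sample, T_025_DF9)
z_lo, z_hi = t_interval(sample, Z_025)  # what we would get by wrongly using z

print("95% CI using t:", round(t_lo, 3), "to", round(t_hi, 3))
print("95% CI using z:", round(z_lo, 3), "to", round(z_hi, 3))
```

Because 2.262 > 1.960, the t-based interval is wider. This is the price paid for estimating σ from a small sample.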

 

Internet exercise: Pb. 7.78, p. 235

Albuquerque home prices : This data set contains a sample of 117 home prices (Feb. 15 to Apr. 30, 1993) obtained from here. Selling prices are recorded in units of $100.

The Excel file : This file can be found on the CD. The sample mean is $106,200, and the 90% CI is from $100,440 to $112,100.

 

[7.3] Calculation of the Sample Size (Poll Results)

Nanos Poll
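
The standard sample size formulas are n = z²·p(1 - p)/E² for a proportion and n = (z·σ/E)² for a mean, where E is the desired margin of error. The sketch below evaluates them in Python. The function names are hypothetical, and the choice p = 0.5 is the usual conservative assumption when nothing is known about p. It is not a description of how any particular polling company works.

```python
from math import ceil
from statistics import NormalDist

def sample_size_proportion(margin, confidence=0.95, p=0.5):
    """Smallest n giving the requested margin of error for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)  # always round up

def sample_size_mean(margin, sigma, confidence=0.95):
    """Smallest n giving the requested margin of error for a mean, sigma known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)

# A margin of +/- 3 percentage points at 95% confidence
print(sample_size_proportion(0.03))  # 1068
```

This is why polls reported as accurate to within about 3 percentage points, 19 times out of 20, typically interview roughly a thousand people.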

[7.4] Confidence Intervals for the Proportion p

We will do this with the participation of the class: we will use an inflatable globe to estimate the proportion of the globe's surface that is covered by water.

Here is the video of this experiment I recorded in Section C02 (November 4, 2010, Thursday).
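
The interval for a proportion is p̂ ± z·√(p̂(1 - p̂)/n). Here is a minimal Python sketch of that calculation. The function name and the counts (28 "water" outcomes in 40 trials) are made up for illustration and are not the results recorded in class.

```python
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Large-sample (normal approximation) CI for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin

# Hypothetical counts: 28 "water" outcomes in 40 trials
lo, hi = proportion_interval(28, 40)
print("95% CI for p:", round(lo, 3), "to", round(hi, 3))
```

The normal approximation behind this formula is usually considered adequate only when both n·p̂ and n·(1 - p̂) are reasonably large.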

[7.5] Comparing Two Population Means: Variances are known.

We will compare the effectiveness of the Atkins diet (also see this link) to a government-recommended conventional diet based on the food pyramid by examining two samples in this Excel file. Here, we have one sample from each population, of sizes n1 = 33 and n2 = 30, and we estimate a 95% CI for the difference μ1 - μ2.

The business decision problem here is this: A fitness centre (such as Curves, or BallyFitness) is considering providing dietary counselling by licensing the Atkins diet in a promotion (for a licensing fee of $200,000). Should they pay for the licence, or just recommend the food pyramid?
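
With known variances, the interval for the difference is (x̄1 - x̄2) ± z·√(σ1²/n1 + σ2²/n2). The Python sketch below shows the arithmetic only. The sample sizes 33 and 30 come from the text above, but the means and variances are placeholders and are not the values in the Excel file. The function name is also hypothetical.

```python
from statistics import NormalDist

def diff_means_z_interval(xbar1, xbar2, var1, var2, n1, n2, confidence=0.95):
    """CI for mu1 - mu2 when both population variances are known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = (var1 / n1 + var2 / n2) ** 0.5  # standard error of xbar1 - xbar2
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se

# Placeholder inputs (NOT the course data): mean weight loss and known variances
lo, hi = diff_means_z_interval(xbar1=10.0, xbar2=8.0,
                               var1=33.0, var2=30.0,
                               n1=33, n2=30)
print("95% CI for mu1 - mu2:", round(lo, 2), "to", round(hi, 2))
```

If the resulting interval contains zero, the samples do not show a difference between the two diets at that confidence level, which bears directly on whether the licensing fee is worth paying.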

[7.6] Comparing Two Population Means: Variances are not known, but equal.

When the variances are not known (but assumed equal), we can construct the same CI using the t-distribution with a pooled estimate of the common variance. Later, we will test the hypothesis that the variances are in fact equal.
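
The pooled-variance interval is (x̄1 - x̄2) ± t·√(sp²(1/n1 + 1/n2)), where sp² = ((n1 - 1)s1² + (n2 - 1)s2²)/(n1 + n2 - 2) and t has n1 + n2 - 2 degrees of freedom. A minimal Python sketch follows. The function name and both small samples are hypothetical, and the critical value 2.228 is read from the t-table for α/2 = 0.025 and df = 10.

```python
from statistics import mean, variance

def pooled_t_interval(sample1, sample2, t_crit):
    """CI for mu1 - mu2 assuming equal (unknown) population variances.

    t_crit is the t-table value for df = n1 + n2 - 2.
    """
    n1, n2 = len(sample1), len(sample2)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    se = (sp2 * (1 / n1 + 1 / n2)) ** 0.5
    diff = mean(sample1) - mean(sample2)
    return diff - t_crit * se, diff + t_crit * se

# Two hypothetical samples of size 6 each, so df = 6 + 6 - 2 = 10
diet_a = [9.5, 11.2, 8.7, 10.4, 12.1, 9.9]
diet_b = [7.8, 9.1, 8.3, 6.9, 9.6, 8.0]
lo, hi = pooled_t_interval(diet_a, diet_b, t_crit=2.228)
print("95% CI for mu1 - mu2:", round(lo, 2), "to", round(hi, 2))
```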