Data Analysis and Statistical Inference

In-class problems on confidence intervals

Answers to conceptual questions on confidence intervals

Decide whether the following statements are true or false. Explain your reasoning.

Problems:

a) For a given standard error, lower confidence levels produce wider confidence intervals.

False. To get higher confidence, we need to make the interval
wider. This is evident in the multiplier, which increases with
the confidence level.
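
As a rough illustration of the multiplier's behavior, the sketch below
computes normal-curve multipliers for a few confidence levels. It
assumes the multiplier comes from the normal curve rather than the
t-curve; the function names and the standard error of 2.0 are
hypothetical choices made for this example.

```python
from statistics import NormalDist


def z_multiplier(confidence):
    """Two-sided normal-curve multiplier for a level such as 0.95."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)


def interval_width(standard_error, confidence):
    """Full width of the interval: 2 x multiplier x SE."""
    return 2 * z_multiplier(confidence) * standard_error


# Same (hypothetical) standard error, different confidence levels.
for level in (0.80, 0.90, 0.95, 0.99):
    print(level, round(z_multiplier(level), 3),
          round(interval_width(2.0, level), 3))
```

With the standard error held fixed, the printed widths grow as the
confidence level grows.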

b) If you increase sample size, the width of confidence intervals will increase.

False. Increasing the sample size decreases the
width of confidence intervals, because it decreases the standard error.

False. 95% confidence means that we used a procedure
that works 95% of the time to get this interval. That is, 95% of all
intervals produced by the procedure will contain their
corresponding parameters. For any one particular interval,
the true population percentage is either inside the interval or
outside the interval. In this case, it is either in between
350 and 400, or it is not in between 350 and 400. Hence, the
probability that the population percentage is in between those two
exact numbers is either zero or one.

True, as long as we're talking about a CI for a population
percentage. The standard error of a sample percentage has the
square root of the sample size in the denominator. Hence, increasing
the sample size by a factor of 4 multiplies the standard error by
1/2, so the interval will be half as wide (assuming the sample
percentage stays about the same).
This also works approximately for population averages as long as the
multiplier from the t-curve doesn't change much when increasing the
sample size (which it won't if the original sample size is large).
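
The square-root relationship can be checked numerically. This is a
minimal sketch, assuming the usual normal-approximation interval for a
percentage and the same sample percentage at both sample sizes; the
values 0.40, 250, and 1000 are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat, n, confidence=0.95):
    """Normal-approximation CI for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)  # sqrt(n) sits in the denominator
    return p_hat - z * se, p_hat + z * se


def width(interval):
    low, high = interval
    return high - low


# Hypothetical sample proportion of 0.40 at n = 250 and n = 4 * 250.
w_small = width(proportion_ci(0.40, 250))
w_large = width(proportion_ci(0.40, 1000))
ratio = w_large / w_small
print(ratio)
```

The ratio of the two widths is one half, up to floating-point rounding.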

e) Assuming the central limit theorem applies, confidence intervals
are always valid.

By "valid," we mean that the confidence interval procedure has a 95%
chance of producing an interval that contains the population parameter.

False. The central limit theorem is needed for confidence intervals
to be valid. However, it is also necessary that the data be
collected from random samples. Confidence intervals will not
remedy poorly collected data.

f) The statement, "the 95% confidence interval for the population
mean is (350, 400)" means that 95% of the population values are
between 350 and 400.

False. The confidence interval is a range of plausible values for the
population average. It does not provide a range for 95% of
the data values from the population. To find the percentage of
values in the population between 350 and 400, we need to look at a
histogram of the data values and determine what percentage of
observations are between 350 and 400.
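
To make the distinction concrete, here is a small simulation sketch.
The population (normal with mean 375 and SD 100), the sample size of
400, and the function names are all assumptions invented for this
example; they do not come from the problem.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(sample, confidence=0.95):
    """Normal-approximation CI for the population average."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = stdev(sample) / sqrt(len(sample))
    centre = mean(sample)
    return centre - z * se, centre + z * se


def fraction_inside(values, interval):
    """Share of individual values that fall inside the interval."""
    low, high = interval
    return sum(low <= v <= high for v in values) / len(values)


rng = random.Random(1)
# Hypothetical population of 20,000 individual values.
population = [rng.gauss(375, 100) for _ in range(20_000)]
sample = rng.sample(population, 400)

ci = mean_ci(sample)
share = fraction_inside(population, ci)
print(ci, share)
```

The interval is narrow because it targets the average, so only a small
share of the individual population values lands inside it, nowhere
near 95%.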

g) If you take large random samples over and over again from the same
population, and make 95% confidence intervals for the population
average, about 95% of the intervals should contain the population
average.

True. This is what the 95% confidence level means.
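
The repeated-sampling idea can be sketched as a simulation. The
population (normal with mean 50 and SD 10), the sample size, the number
of trials, and the seed are all hypothetical choices; the interval uses
the normal-curve multiplier, which is a reasonable stand-in for the
t-curve at this sample size.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def coverage(n, confidence, trials, rng, mu=50.0, sigma=10.0):
    """Share of simulated intervals that contain the true average mu."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        centre = mean(sample)
        se = stdev(sample) / sqrt(n)
        if centre - z * se <= mu <= centre + z * se:
            hits += 1
    return hits / trials


rate = coverage(n=200, confidence=0.95, trials=2000,
                rng=random.Random(42))
print(rate)
```

The printed rate should land close to 0.95, though it will vary a
little with the seed and the number of trials.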

h) If you take large random samples over and over again from the same
population, and make 95% confidence intervals for the population
average, about 95% of the intervals should contain the sample
average.

False. The confidence interval is a range for the population average,
not for the sample average. In fact, every confidence interval
contains its corresponding sample average, because CIs are of the
form: sample avg. +/- multiplier x SE. So, the sample average is
right in the middle of the CI.

i) For a confidence interval to be valid, it is necessary that the
distribution of the variable of interest follows a normal curve.

False. It is necessary that the distribution of the sample average follows a
normal curve. The data values of the variable, however, need not
follow a normal curve, because if the sample size is large enough the
central limit theorem for the sample average will apply.
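
A quick way to see this is to build intervals from clearly non-normal
data. The sketch below assumes an exponential population with rate 1
(true average 1.0), which is strongly right-skewed; the sample size,
trial count, and seed are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, median, stdev


def skewed_coverage(n, trials, rng, confidence=0.95):
    """Share of intervals containing the true average of skewed data."""
    true_mean = 1.0  # average of an exponential distribution, rate 1
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        centre = mean(sample)
        se = stdev(sample) / sqrt(n)
        if centre - z * se <= true_mean <= centre + z * se:
            hits += 1
    return hits / trials


rng = random.Random(7)
# The raw data are skewed: the average sits well above the median.
raw_values = [rng.expovariate(1.0) for _ in range(5000)]
print("mean vs. median of raw data:", mean(raw_values), median(raw_values))

skewed_rate = skewed_coverage(n=200, trials=2000, rng=rng)
print(skewed_rate)
```

Even though the individual values are far from normal, the coverage
rate with samples of 200 should come out near 95%. With much smaller
samples, it would be expected to fall further short.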

j) A 95% confidence interval obtained
from a random sample of 1000 people has a better chance of containing
the population percentage than a 95% confidence interval obtained from
a random sample of 500 people.

False. All 95% confidence intervals have the property that they come
from a procedure that has a 95% chance of yielding an interval that
contains the true value. The confidence interval method automatically
accounts for sample size in the standard error. A 95% CI
with n=1000 will be narrower than a 95% CI with n=500, but both CIs
will have 95% confidence of containing the population percentage.
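
The comparison can be sketched by simulation. This assumes a true
population percentage of 30% and the usual normal-approximation
interval; the percentage, trial count, seed, and function name are
hypothetical choices for this example.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate(n, trials, rng, p=0.30, confidence=0.95):
    """Return (coverage rate, average width) over repeated samples."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    total_width = 0.0
    for _ in range(trials):
        # Draw n people; each is a "success" with probability p.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        se = sqrt(p_hat * (1 - p_hat) / n)
        low, high = p_hat - z * se, p_hat + z * se
        if low <= p <= high:
            hits += 1
        total_width += high - low
    return hits / trials, total_width / trials


rng = random.Random(2024)
cover_500, width_500 = simulate(500, 2000, rng)
cover_1000, width_1000 = simulate(1000, 2000, rng)
print(cover_500, width_500)
print(cover_1000, width_1000)
```

Both coverage rates should sit near 95%; what changes with the larger
sample is the average width, which is smaller.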

k) If you go through life making 99% confidence intervals for all
sorts of population means, about 1% of the time the intervals won't
cover their respective population means.

True. Since about 99% of the intervals should contain the
corresponding population mean, about 1% of them will not.