ECMT1010 · Introduction To Economic Statistics
Confidence Intervals & the Bootstrap
Weeks 3–4 move from describing a sample to estimating a parameter with uncertainty attached. The key objects are the sampling distribution; its standard deviation, the standard error (SE); and the confidence interval, statistic ± 2·SE. Alongside them sits the bootstrap, the simulation trick that estimates the SE from a single sample by resampling with replacement. The topic is examined as MCQ and short-answer: build a 95% CI from a bootstrap SE, interpret it correctly, and know when the bootstrap fails.
What this chapter covers
- 1. The sampling distribution: how a statistic varies across all possible samples
- 2. Standard error (SE) = the SD of the sampling distribution; bigger SE = less reliable
- 3. The interval estimate = statistic ± margin of error (ME)
- 4. The 95% confidence interval = statistic ± 2·SE, and the general statistic ± z*·SE
- 5. Correct vs incorrect interpretation of a confidence level (the procedure, not the parameter)
- 6. The bootstrap: resample WITH replacement to build a bootstrap distribution and read off the SE
- 7. Larger n → smaller SE → narrower CI (but more resamples do not shrink the SE)
- 8. The percentile method and when the bootstrap fails (skewed/spiky distributions)
A 95% bootstrap confidence interval for a proportion
- (1 mark) Compute the point estimate: p̂ = 185/500 = 0.370.
- (2 marks) Apply the 95% rule, statistic ± 2·SE: CI = 0.370 ± 2(0.021) = 0.370 ± 0.042.
- (1 mark) Write the interval: (0.328, 0.412), i.e. 32.8% to 41.2%.
- (1 mark) Interpret: we are 95% confident that the interval 32.8% to 41.2% contains the true proportion of commuters who would switch.
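The arithmetic in the steps above can be checked with a few lines of Python. This is only a sketch of the calculation: the counts (185 of 500) and the bootstrap SE (0.021) are taken from the worked solution, while the variable names are illustrative.

```python
# Figures from the worked solution: 185 of 500 in the sample, bootstrap SE = 0.021.
successes, n = 185, 500
se = 0.021

p_hat = successes / n          # point estimate
margin = 2 * se                # 95% rule: margin of error = 2·SE
lower, upper = p_hat - margin, p_hat + margin

print(f"p_hat  = {p_hat:.3f}")
print(f"95% CI = ({lower:.3f}, {upper:.3f})")
```

The code stops at the interval on purpose: the interpretation sentence still has to be written in words.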
Key terms
- Sampling distribution
- The distribution of a statistic (like x̄ or p̂) across all possible samples of a given size. For statistics such as x̄ and p̂ it is centred on the parameter when sampling is random, and it becomes roughly bell-shaped for large n.
- Standard error (SE)
- The standard deviation of the sampling distribution — a measure of how much the statistic bounces from sample to sample. A larger SE means a less reliable estimate; SE shrinks as the sample size grows.
- Confidence interval
- An interval estimate of the form statistic ± z*·SE (or ±2·SE for 95%). Over repeated sampling it captures the true parameter a stated proportion of the time.
- Margin of error (ME)
- The ± part of a confidence interval, ME = z*·SE (= 2·SE for 95%). It sets the half-width of the interval and shrinks with a smaller SE, i.e. with a larger sample.
- Bootstrap
- A simulation method that estimates the SE from one sample by resampling it WITH replacement many times, computing the statistic each time, and taking the standard deviation of those bootstrap statistics.
- Percentile method
- Building a CI directly from the bootstrap distribution by reading off the lower and upper percentiles (e.g. the 2.5th and 97.5th for 95%). It is only valid when the bootstrap distribution is smooth and roughly symmetric.
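The percentile method defined above can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the raw data are assumed to be 185 ones and 315 zeros (matching the counts in the worked example), and the number of resamples (2000) and the seed are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)                 # fixed seed so the run is repeatable
sample = [1] * 185 + [0] * 315         # assumed raw data: 185 "yes" out of 500

# Build the bootstrap distribution of the sample proportion.
boot_props = []
for _ in range(2000):
    resample = rng.choices(sample, k=len(sample))   # WITH replacement, same size
    boot_props.append(sum(resample) / len(resample))

# quantiles(..., n=40) returns 39 cut points at 2.5%, 5%, ..., 97.5%.
cuts = statistics.quantiles(boot_props, n=40)
lower, upper = cuts[0], cuts[-1]       # 2.5th and 97.5th percentiles

print(f"95% percentile interval: ({lower:.3f}, {upper:.3f})")
```

For a proportion from a sample this large, the bootstrap distribution is close to bell-shaped, so the percentile interval should sit close to the statistic ± 2·SE interval.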
Confidence Intervals & the Bootstrap FAQ
What exactly does '95% confident' mean?
It is a statement about the method, not about this one interval. If you repeated the whole sampling-and-interval procedure many times, about 95% of the intervals you produced would contain the true parameter. The parameter is a fixed (unknown) number, so it is either in your particular interval or it is not; there is no probability attached to it. Always phrase the interpretation as 'we are 95% confident the interval contains…'.
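The long-run reading of '95% confident' can be illustrated by simulation. The sketch below repeats the whole procedure (draw a sample, bootstrap an SE, form statistic ± 2·SE) many times and counts how often the interval contains the truth. Everything numeric here is a hypothetical choice made for the illustration: the true proportion 0.37, the sample size 100, and the repetition counts, which are kept small so the script runs quickly.

```python
import random
import statistics

TRUE_P = 0.37        # hypothetical true proportion, known only because we simulate
N = 100              # size of each original sample
N_STUDIES = 400      # how many times the whole procedure is repeated
N_RESAMPLES = 200    # bootstrap resamples per study (small, for speed)

rng = random.Random(1)

def proportion(xs):
    return sum(xs) / len(xs)

hits = 0
for _ in range(N_STUDIES):
    # One "study": a fresh random sample from the population.
    sample = [1 if rng.random() < TRUE_P else 0 for _ in range(N)]
    p_hat = proportion(sample)
    # Bootstrap SE: SD of the proportions of resamples drawn with replacement.
    boot = [proportion(rng.choices(sample, k=N)) for _ in range(N_RESAMPLES)]
    se = statistics.stdev(boot)
    # Does this study's interval contain the fixed true value?
    if p_hat - 2 * se <= TRUE_P <= p_hat + 2 * se:
        hits += 1

coverage = hits / N_STUDIES
print(f"{coverage:.1%} of the intervals contained the true proportion")
```

The printed share should be in the region of 95%, though not exactly 95% in any one run. Each individual interval either contains 0.37 or does not; the 95% describes the procedure.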
How does the bootstrap work and why is it useful?
The bootstrap treats your single sample as a stand-in for the population: you draw a new sample of the same size from it WITH replacement, compute the statistic, and repeat thousands of times. The spread (standard deviation) of those bootstrap statistics estimates the standard error, which you then use in statistic ± 2·SE. It is useful because it gets you an SE — and hence a CI — without needing a formula or a normality assumption.
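The resampling loop described above might look like this in Python. It is a sketch under stated assumptions: the helper name `bootstrap_se`, the default of 2000 resamples, and the data (185 ones and 315 zeros, matching the counts in the worked example) are all illustrative rather than taken from the course materials.

```python
import random
import statistics

def proportion(xs):
    return sum(xs) / len(xs)

def bootstrap_se(sample, stat, n_resamples=2000, seed=0):
    """Estimate the SE of `stat` by resampling `sample` with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    boot_stats = []
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=n)   # same size, WITH replacement
        boot_stats.append(stat(resample))
    # The SD of the bootstrap statistics is the bootstrap SE.
    return statistics.stdev(boot_stats), boot_stats

sample = [1] * 185 + [0] * 315                # assumed raw data
se, boot_stats = bootstrap_se(sample, proportion)
p_hat = proportion(sample)

print(f"bootstrap SE ≈ {se:.4f}")
print(f"95% CI ≈ ({p_hat - 2 * se:.3f}, {p_hat + 2 * se:.3f})")
```

Passing the statistic in as a function means the same loop works for a mean, a median, or any other statistic, which is the sense in which the bootstrap needs no formula.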
Does taking more bootstrap resamples make the interval narrower?
No. More resamples make your estimate of the SE more precise (the bootstrap distribution gets smoother), but they do not change the underlying SE, so the interval does not get narrower. The only way to genuinely shrink the SE — and the interval — is to collect a larger original sample n, because SE falls roughly like 1/√n.
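A quick experiment makes the distinction concrete. The sketch below compares the bootstrap SE from 500 and from 5000 resamples of the same sample, then from a sample four times as large. The samples are hypothetical, built to have the same proportion (0.37) at n = 100 and n = 400.

```python
import random
import statistics

def bootstrap_se(sample, n_resamples, seed):
    """SD of bootstrap proportions from resampling with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    props = [sum(rng.choices(sample, k=n)) / n for _ in range(n_resamples)]
    return statistics.stdev(props)

small = [1] * 37 + [0] * 63       # hypothetical sample, n = 100
large = [1] * 148 + [0] * 252     # hypothetical sample, n = 400, same proportion

se_few = bootstrap_se(small, 500, seed=3)      # n = 100, 500 resamples
se_many = bootstrap_se(small, 5000, seed=4)    # n = 100, 5000 resamples
se_big_n = bootstrap_se(large, 500, seed=5)    # n = 400, 500 resamples

print(f"n=100,  500 resamples: SE ≈ {se_few:.4f}")
print(f"n=100, 5000 resamples: SE ≈ {se_many:.4f}")
print(f"n=400,  500 resamples: SE ≈ {se_big_n:.4f}")
```

The first two values should be nearly equal, because extra resamples only steady the estimate. The third should be about half the size, because quadrupling n halves 1/√n.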
When does the bootstrap fail?
The percentile and ±2·SE methods assume the bootstrap distribution is smooth and roughly symmetric. For statistics whose sampling distribution is skewed or spiky (a correlation near ±1, a median, or any statistic from a very small sample), the bootstrap distribution can be lopsided or gappy and the resulting CI is unreliable. The rule: always plot the bootstrap distribution first, and only trust the interval if it looks bell-shaped.
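The 'spiky' failure is easy to see for a median. In the sketch below, the eleven data values are invented for illustration (small and right-skewed). With an odd sample size, every bootstrap median must equal one of the original values, so the bootstrap distribution is a handful of spikes rather than a smooth bell. The bootstrap means from the same data are shown for contrast.

```python
import random
import statistics
from collections import Counter

rng = random.Random(2)
sample = [3, 5, 6, 8, 9, 12, 15, 21, 30, 47, 80]   # hypothetical, n = 11, skewed
B = 2000                                           # number of bootstrap resamples
n = len(sample)

medians = [statistics.median(rng.choices(sample, k=n)) for _ in range(B)]
means = [statistics.mean(rng.choices(sample, k=n)) for _ in range(B)]

print("distinct bootstrap medians:", len(set(medians)))
print("distinct bootstrap means:  ", len(set(means)))

# Crude text histogram of the bootstrap medians: one '#' per 20 resamples.
for value, count in sorted(Counter(medians).items()):
    print(f"{value:>4} {'#' * (count // 20)}")
```

The text histogram stands in for the plot you would draw first: a few tall, unevenly spaced bars is the signal not to trust a percentile or ±2·SE interval here.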
Exam move
Anchor everything to one chain: sample → statistic → sampling distribution → SE → CI. Be able to state in words what the SE is (the SD of the sampling distribution) and how the bootstrap estimates it (resample with replacement, take the SD of the bootstrap statistics). Drill the mechanical CI in seconds — point estimate, ±2·SE, write the interval — and then spend your effort on the interpretation sentence, because that is where students drop marks: say it is about the long-run procedure, never a probability for the fixed parameter. Learn the two 'gotcha' MCQs cold: larger n shrinks the SE but more resamples do not, and the bootstrap needs a smooth, symmetric distribution to be trusted, so always plot it first.