ECMT1010 · Introduction To Economic Statistics
Inference for Means: One-, Two-Sample & Paired
Week 9 does for quantitative data what the previous chapter did for proportions, but with the t-distribution (used whenever σ is unknown). It covers the CI and HT for one mean (t = (x̄ − μ₀)/(s/√n), df = n − 1), the difference in two means (df = min(n₁, n₂)), and paired data, where you test the per-pair differences. It is examined as short-answer — the decisive choice is independent vs paired, and the recurring exam point is why pairing raises power.
What this chapter covers
- 1. The t-distribution and degrees of freedom; why t (not z) when σ is unknown
- 2. CI for one mean: x̄ ± t*(df, α/2)·(s/√n), df = n − 1
- 3. HT for one mean: t = (x̄ − μ₀)/(s/√n) ~ t(n − 1)
- 4. Difference in two independent means: SE = √(s₁²/n₁ + s₂²/n₂), t = (x̄₁ − x̄₂)/SE
- 5. The conservative df rule for two means: df = min(n₁, n₂)
- 6. Paired data: form per-pair differences d, then run a one-sample t on d̄
- 7. Independent vs paired: how to tell which design you have from the question
- 8. Why pairing removes between-subject variability, shrinks the SE and raises power
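The one-mean formulas in items 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration, not the unit's official method: the sample values, the null value μ₀ = 10 and the helper names `one_sample_t` and `ci_one_mean` are all made up for the example, and the critical value 2.262 is the standard table entry for 95% confidence with df = 9.

```python
import math
import statistics


def one_sample_t(data, mu0):
    """Return (x_bar, se, t, df) for a one-sample t-test of H0: mu = mu0."""
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)          # sample SD, divisor n - 1
    se = s / math.sqrt(n)               # standard error s / sqrt(n)
    t = (x_bar - mu0) / se              # t = (x_bar - mu0) / (s / sqrt(n))
    return x_bar, se, t, n - 1


def ci_one_mean(x_bar, se, t_star):
    """Return the interval x_bar +/- t_star * se."""
    margin = t_star * se
    return x_bar - margin, x_bar + margin


# Hypothetical sample of n = 10 observations (illustration only).
sample = [12.1, 9.8, 11.4, 10.6, 13.0, 10.2, 11.9, 9.5, 12.4, 11.1]

x_bar, se, t_stat, df = one_sample_t(sample, mu0=10.0)

# t* read from a t table: 95% confidence (alpha/2 = 0.025), df = 9.
T_STAR_95_DF9 = 2.262
low, high = ci_one_mean(x_bar, se, T_STAR_95_DF9)

print(f"x_bar = {x_bar:.3f}, SE = {se:.3f}, t = {t_stat:.3f}, df = {df}")
print(f"95% CI for mu: ({low:.3f}, {high:.3f})")
```

The critical value is passed in rather than computed, which mirrors the exam workflow of reading t* from the table at the stated df.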
A paired t-test and why pairing helps
- (2 marks) Recognise the paired design: each rep gives one new − old difference, so set up a one-sample t-test on the differences. Let δ = the population mean difference. H₀: δ = 0 versus Hₐ: δ > 0.
- (1 mark) Compute the mean difference: d̄ = (3 − 1 + 5 + 2 + 0 + 6 + 1)/7 = 16/7 ≈ 2.286.
- (2 marks) Compute the SD of the differences: deviations from 2.286 are 0.714, −3.286, 2.714, −0.286, −2.286, 3.714, −1.286; their squares sum to ≈ 0.51 + 10.80 + 7.37 + 0.08 + 5.22 + 13.80 + 1.65 ≈ 39.43; s_d = √(39.43/6) = √6.572 ≈ 2.564.
- (1 mark) Compute the test statistic: t = d̄/(s_d/√n) = 2.286/(2.564/√7) = 2.286/0.969 ≈ 2.36, df = 6.
- (1 mark) Decision and conclusion: the one-sided 5% critical value is t(6) ≈ 1.943; since 2.36 > 1.943, reject H₀. There is significant evidence the new method's mean error is higher. Pairing helps because it removes the between-rep variability, shrinking the SE and raising the test's power.
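The arithmetic in the marked solution can be checked with a short Python sketch. The seven differences and the critical value 1.943 are taken from the solution above; the variable names are only illustrative.

```python
import math
import statistics

# Per-rep (new - old) differences, as listed in the worked solution.
diffs = [3, -1, 5, 2, 0, 6, 1]

n = len(diffs)
d_bar = statistics.mean(diffs)      # mean difference, 16/7
s_d = statistics.stdev(diffs)       # sample SD of the differences (divisor n - 1)
se = s_d / math.sqrt(n)             # standard error of d_bar
t_stat = d_bar / se                 # H0: delta = 0, so t = d_bar / SE
df = n - 1

# One-sided 5% critical value for df = 6, from the t table.
T_CRIT = 1.943
reject_h0 = t_stat > T_CRIT

print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, SE = {se:.3f}")
print(f"t = {t_stat:.2f} on df = {df}; reject H0: {reject_h0}")
```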
Key terms
- t-distribution: A bell-shaped distribution with fatter tails than N(0,1), used for inference about means when the population SD σ is unknown. Its shape is set by the degrees of freedom; as df grows it approaches the normal.
- Degrees of freedom (df): The parameter that fixes the t-distribution: df = n − 1 for one mean, df = min(n₁, n₂) for two independent means (the unit's conservative rule), and df = n − 1 for paired differences, where n is the number of pairs.
- CI for one mean: An interval x̄ ± t*(df, α/2)·(s/√n) estimating the population mean μ, using the t critical value for the chosen confidence level and df = n − 1.
- One-sample t-test: A test of H₀: μ = μ₀ using t = (x̄ − μ₀)/(s/√n) on n − 1 degrees of freedom, compared against the t-table critical value for the relevant tail.
- Two independent means: A comparison of x̄₁ and x̄₂ from separate groups, with SE = √(s₁²/n₁ + s₂²/n₂) and t = (x̄₁ − x̄₂)/SE on df = min(n₁, n₂).
- Paired data: Two measurements on the same subject (before/after, or two methods on one item). You analyse the per-pair differences with a one-sample t, which removes between-subject variability and raises power.
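As a rough sketch of the two-independent-means entry, the snippet below computes the SE, the t statistic and the df for two small groups. The data and the helper name `two_sample_t` are hypothetical. The df line follows the rule as written in these notes, min(n₁, n₂); some textbooks state the conservative rule as min(n₁ − 1, n₂ − 1), so it is worth confirming against the unit's formula sheet.

```python
import math
import statistics


def two_sample_t(group1, group2):
    """Return (se, t, df) for the difference in two independent means."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)      # s1^2, divisor n1 - 1
    var2 = statistics.variance(group2)      # s2^2, divisor n2 - 1
    se = math.sqrt(var1 / n1 + var2 / n2)   # SE = sqrt(s1^2/n1 + s2^2/n2)
    t = (statistics.mean(group1) - statistics.mean(group2)) / se
    df = min(n1, n2)                        # rule as stated in these notes (assumption)
    return se, t, df


# Hypothetical data: two separate groups with no natural matching.
group_a = [5, 7, 6, 9, 8]
group_b = [4, 6, 5, 5]

se, t_stat, df = two_sample_t(group_a, group_b)
print(f"SE = {se:.4f}, t = {t_stat:.3f}, df = {df}")
```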
Inference for Means: One-, Two-Sample & Paired FAQ
When do I use t instead of z?
Use t whenever you are doing inference about a mean and the population standard deviation σ is unknown — which is essentially always with real data, because you only have the sample SD s. The t-distribution has fatter tails than z to account for the extra uncertainty from estimating σ, and it depends on the degrees of freedom. In ECMT1010 the means chapter is entirely t-based; z is reserved for proportions and for large-sample normal approximations.
How do I tell whether data are paired or independent?
Ask whether each value in one group is naturally matched to a specific value in the other. If the two measurements come from the same subject or item — before-and-after, two treatments on the same person, twins, matched pairs — the data are paired, and you analyse the per-pair differences. If the two groups are separate sets of subjects with no natural matching, the data are independent and you compare the two means directly. Getting this wrong changes the entire test.
Why does pairing increase power?
Because it removes the variation between subjects. When you take a difference within each subject, any baseline difference between subjects (one person naturally forecasts higher than another, say) cancels out, leaving only the effect you care about plus noise. That makes the standard error of the mean difference smaller, the t-statistic larger for the same true effect, and so the test more likely to detect a real difference — that is higher power.
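A small numerical illustration of this point, using made-up before/after values for five subjects whose baselines differ a lot: the same numbers are analysed once as paired differences and once (incorrectly, for this design) as two independent groups. The data and variable names are assumptions for the sketch only.

```python
import math
import statistics

# Hypothetical measurements on the same five subjects (large baseline spread).
before = [10, 20, 30, 40, 50]
after = [12, 23, 31, 44, 52]
n = len(before)

# Paired analysis: one-sample t on the per-subject differences.
diffs = [a - b for a, b in zip(after, before)]
d_bar = statistics.mean(diffs)
se_paired = statistics.stdev(diffs) / math.sqrt(n)
t_paired = d_bar / se_paired

# Independent analysis of the same numbers: the SE now carries the
# between-subject variability that the differencing cancelled out.
se_indep = math.sqrt(statistics.variance(after) / n + statistics.variance(before) / n)
t_indep = (statistics.mean(after) - statistics.mean(before)) / se_indep

print(f"mean difference = {d_bar:.2f}")
print(f"paired:      SE = {se_paired:.3f}, t = {t_paired:.3f}")
print(f"independent: SE = {se_indep:.3f}, t = {t_indep:.3f}")
```

Both analyses estimate the same mean difference; only the standard error changes, which is why the paired t statistic is so much larger here.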
What degrees of freedom do I use for two means?
For two independent means, ECMT1010 uses the conservative rule df = min(n₁, n₂) — the smaller of the two sample sizes — so you can read the t table by hand. For paired data you are really running a one-sample test on the differences, so df = n − 1 where n is the number of pairs. For a single mean it is also df = n − 1. State the df explicitly, because the marker checks you read the table at the right line.
Exam move
First decide the structure of the data before you write any formula: one mean, two independent means, or paired. The paired-versus-independent call is the highest-value decision in this chapter and the S2 2024 exam famously runs both tests on the same data and then asks why they differ — so practise collapsing paired data to a single column of differences and stating the 'pairing removes between-subject variability, shrinks the SE, raises power' line. Always write the df explicitly (n − 1, or min(n₁, n₂)) because the marker checks you used the right table line. Run every test through the full ritual — parameter, H₀/Hₐ, formula, substitution, t value, table critical value with df, decision, context sentence — and use t (not z) whenever σ is unknown.