MAST20034 · Critical Thinking With Data
Frameworks for Inference
Week 7 is where statistical inference appears — but, true to the subject, it is taught as interpretation, never calculation (there is no calculator on the exam). The question is: how can one sample speak for a whole population, and how confidently? You meet estimation — a point estimate versus a confidence interval — underpinned conceptually by the sampling distribution and the Central Limit Theorem (why sample means cluster normally around the truth). The high-value, high-trap content is interpreting a CI correctly and knowing precisely what it does not mean (it is not a 95% probability for this one interval, nor a range for individual values). You then walk the NHST reasoning chain — null vs alternative, the P-value and its common misreadings (it is not the probability the null is true), Type I vs Type II error and power — and close on the assumptions behind these tools and the frequentist vs Bayesian framing. Exam answers here are pure interpretation: read the output, state what it licenses, and avoid the classic misreads.
What this chapter covers
- 7.1 Estimation: point estimate vs confidence interval
- 7.2 The sampling distribution and the Central Limit Theorem (conceptually)
- 7.3 Interpreting a confidence interval, and what it does NOT mean
- 7.4 The NHST reasoning chain and the P-value (with the common misreadings)
- 7.5 Type I vs Type II error and statistical power
- 7.6 Assumptions, and frequentist vs Bayesian framing
Interpreting a confidence interval correctly, mark by mark
- +1 Correct interpretation: we are 95% confident the interval [2, 8] captures the true mean effect, meaning that if we repeated the study many times, about 95% of the intervals so constructed would contain the true value.
- +1 Name the error: the student commits the classic probability-of-the-parameter misreading. Once computed, the interval either does or does not contain the fixed true value, so no probability attaches to this one interval.
- +1 Apply the substance: because the whole interval lies above 0, the effect is statistically significant at the 5% level and positive (the drug plausibly lowers mean blood pressure by 2–8 mmHg).
- +1 Add the second guard: the CI is a range for the mean effect, NOT a range within which 95% of individual patients' responses fall. Don't confuse the parameter with the data.
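The second guard can be checked with a small simulation. What follows is a minimal Python sketch, not part of the subject's materials: the 400 patients, the mean of 5 mmHg and the SD of 30 mmHg are invented so that the interval has roughly the width of [2, 8], and `mean_ci_95` and `share_inside` are illustrative helper names.

```python
import random
import statistics
from math import sqrt


def mean_ci_95(sample):
    """Normal-approximation 95% CI for the mean: xbar +/- 1.96 * s / sqrt(n)."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / sqrt(n)
    return xbar - 1.96 * standard_error, xbar + 1.96 * standard_error


def share_inside(values, interval):
    """Fraction of individual values that fall inside the interval."""
    low, high = interval
    return sum(low <= v <= high for v in values) / len(values)


rng = random.Random(7)  # fixed seed so the run is repeatable
# Hypothetical data: 400 patients' reductions in blood pressure (mmHg).
reductions = [rng.gauss(5, 30) for _ in range(400)]

ci = mean_ci_95(reductions)
share = share_inside(reductions, ci)
print(f"95% CI for the mean reduction: [{ci[0]:.1f}, {ci[1]:.1f}]")
print(f"Share of individual patients inside that interval: {share:.0%}")
```

Under these assumptions only a small minority of individual patients land inside the interval, because the interval describes uncertainty about the mean, not the spread of the data.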
Key terms
- Confidence interval
- A range of plausible parameter values from a procedure that, over repeated sampling, captures the true value a stated fraction (e.g. 95%) of the time. Not a probability statement about one computed interval, and not a range for individual data points.
- Sampling distribution
- The distribution of a statistic (e.g. the sample mean) over all possible samples of a given size. It is the conceptual engine of inference: its spread (standard deviation) is the standard error.
- Central Limit Theorem
- The result that the sampling distribution of the mean approaches normal as sample size grows, whatever the population's shape (provided its variance is finite). It is why normal-based intervals and tests work approximately for means even on non-normal data, as long as the sample is reasonably large.
- P-value
- The probability, IF the null hypothesis were true, of data at least as extreme as observed. NOT the probability the null is true, NOT the probability the result was due to chance, and NOT a measure of effect size.
- Type I / Type II error and power
- Type I = rejecting a true null (false positive, rate α); Type II = failing to reject a false null (false negative, rate β). Power = 1 − β, the chance of detecting a real effect; it rises with sample size, effect size and α.
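The sampling distribution and the Central Limit Theorem are easier to picture once you watch them happen. The sketch below is an illustration only: it assumes an exponential population (strongly right-skewed), and the helper names `skewness` and `simulate_sample_means`, the sample sizes and the number of simulated studies are all arbitrary choices.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: near 0 for a symmetric shape, positive for a long right tail."""
    centre = statistics.fmean(values)
    spread = statistics.pstdev(values)
    return statistics.fmean([((v - centre) / spread) ** 3 for v in values])


def simulate_sample_means(n, studies, rng):
    """Means of `studies` independent samples of size n from a skewed population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(studies)
    ]


rng = random.Random(42)  # fixed seed so the run is repeatable
skew_by_n = {}
spread_by_n = {}
for n in (1, 5, 30):
    means = simulate_sample_means(n, studies=4000, rng=rng)
    skew_by_n[n] = skewness(means)
    spread_by_n[n] = statistics.pstdev(means)  # spread of the means = standard error
    print(f"n={n:>2}: skewness {skew_by_n[n]:.2f}, standard error {spread_by_n[n]:.2f}")
```

As n grows, the skewness of the simulated means shrinks towards 0 (the shape becomes more normal) and their spread narrows (the standard error falls), even though the population itself stays skewed.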
Frameworks for Inference FAQ
What does a 95% confidence interval actually mean?
That the method produces intervals which capture the true parameter 95% of the time across repeated samples. For your one interval, the parameter is fixed and either inside or not — so you say ‘95% confident’, not ‘95% probability’.
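The 'repeated samples' idea can be sketched in a few lines of Python. This is a hedged illustration, not exam material: the true mean of 5, the SD of 30, the sample size of 100 and the 2000 repeated studies are invented, and the interval uses the normal approximation (1.96 standard errors).

```python
import random
import statistics
from math import sqrt


def ci_95(sample):
    """Normal-approximation 95% CI for the mean: xbar +/- 1.96 * s / sqrt(n)."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / sqrt(n)
    return xbar - 1.96 * standard_error, xbar + 1.96 * standard_error


def coverage_rate(true_mean, sd, n, studies, rng):
    """Repeat the study many times and count how often the CI captures the truth."""
    captured = 0
    for _ in range(studies):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        low, high = ci_95(sample)
        if low <= true_mean <= high:  # each single interval either captures it or not
            captured += 1
    return captured / studies


rate = coverage_rate(true_mean=5.0, sd=30.0, n=100, studies=2000, rng=random.Random(1))
print(f"Share of intervals that captured the true mean: {rate:.1%}")
```

The 95% describes the long-run share printed at the end. Inside the loop, every individual interval is simply a hit or a miss.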
What is a P-value not?
It is not the probability the null is true, not the probability the result is due to chance, and not the size or importance of an effect. It is only the chance of data at least this extreme, assuming the null is true; pair it with an effect size and CI for meaning.
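One way to see the 'assuming the null' condition at work is a permutation test, which builds the null world by shuffling group labels. The sketch below is an assumption-laden illustration: the two small groups are invented numbers, `permutation_p_value` is a hypothetical helper name, and the P-value is taken as the plain proportion of shuffles at least as extreme as the observed difference.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, shuffles, rng):
    """Two-sided P-value for a difference in means under 'labels do not matter'."""
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(shuffles):
        rng.shuffle(pooled)  # simulate a world in which the null is true
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / shuffles


# Hypothetical outcome scores for two groups of eight.
treated = [12, 9, 11, 14, 8, 13, 10, 15]
control = [7, 9, 6, 10, 5, 8, 11, 7]

p_value = permutation_p_value(treated, control, shuffles=5000, rng=random.Random(3))
print(f"P-value: {p_value:.4f}")
```

Every shuffle is generated with the null taken as true, so the result cannot be the probability that the null is true. It only says how rarely a null world produces a difference this large.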
What's the difference between Type I and Type II error?
Type I rejects a true null (a false alarm, rate α, e.g. 0.05); Type II misses a real effect (a false negative, rate β). Power = 1 − β. Lowering α to avoid false alarms tends to raise β unless you increase the sample size.
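The trade-off between α, β and sample size can be sketched by simulation. This is a minimal illustration under stated assumptions: a two-sided z-test of a zero mean with a known SD of 1, a hypothetical true effect of 0.5, and invented sample sizes; `rejection_rate` is an illustrative helper name.

```python
import random
import statistics
from math import sqrt
from statistics import NormalDist


def rejection_rate(true_effect, n, alpha, studies, rng, sd=1.0):
    """Share of simulated studies in which a two-sided z-test rejects H0: mean = 0."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(studies):
        sample = [rng.gauss(true_effect, sd) for _ in range(n)]
        z = statistics.fmean(sample) / (sd / sqrt(n))
        if abs(z) > critical:
            rejections += 1
    return rejections / studies


rng = random.Random(11)  # fixed seed so the run is repeatable
# Null true: every rejection is a Type I error, so the rate should sit near alpha.
type_one_rate = rejection_rate(0.0, n=20, alpha=0.05, studies=2000, rng=rng)
# Null false: the rejection rate is the power (1 - beta).
power_n20 = rejection_rate(0.5, n=20, alpha=0.05, studies=2000, rng=rng)
power_n80 = rejection_rate(0.5, n=80, alpha=0.05, studies=2000, rng=rng)
power_strict = rejection_rate(0.5, n=20, alpha=0.01, studies=2000, rng=rng)

print(f"Type I rate (alpha=0.05, n=20): {type_one_rate:.3f}")
print(f"Power (alpha=0.05, n=20):       {power_n20:.3f}")
print(f"Power (alpha=0.05, n=80):       {power_n80:.3f}")
print(f"Power (alpha=0.01, n=20):       {power_strict:.3f}")
```

In this setup the stricter α lowers power at the same n (β rises), while the larger sample raises power without touching α.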
Exam move
Because the exam tests interpretation, drill the correct readings and the named misreads side by side: the CI ‘procedure-over-repeats’ line and its two traps, and the P-value ‘assuming the null’ line and its three traps. Put a clean Type I / Type II / power table on your notes sheet, including which lever (α, effect size, n) moves which error. Understand the CLT conceptually (means go normal as n grows) without any formula. For any output prompt, answer in three beats: read the output, state what it licenses, then name the misread you are avoiding.