DATA1001 · Foundations Of Data Science
Hypothesis Testing
This is the capstone, and the course's whole back half collapses into one idea: every test is a standardised distance, (OV−EV)/SE, read against a reference curve; only the EV, the SE and the curve change. Wrapped around that engine is HATPC (Hypotheses, Assumptions, Test statistic, P-value, Conclusion), the exam scaffold that graders reward line by line. The chapter teaches the engine once, then walks the test zoo: the z-test for a proportion; the one-sample, paired and two-sample t-tests (with their assumption checks); chi-square for goodness-of-fit and independence; and the regression-slope test. Each is the same calculation with a different EV, SE and curve, with chi-square using a related squared form summed over cells. It closes on the literacy that earns the last marks: reading a p-value correctly (it is not the probability that the null is true) and distinguishing Type I from Type II errors. Internalise the engine and the scaffold and fresh exam numbers cannot surprise you.
What this chapter covers
- 01 HATPC: the exam scaffold graders reward line by line
- 02 The universal engine: (OV − EV)/SE against a reference curve
- 03 The test zoo: z for a proportion, one-sample / paired / two-sample t
- 04 Chi-square (goodness-of-fit, independence) and the regression-slope test
- 05 P-value literacy and Type I vs Type II errors
Worked example: a one-sample t-test by HATPC
- H (+1 mark): H₀: μ = 500; H₁: μ ≠ 500 (two-sided).
- A (+1 mark): the 16 bottles are a random sample and fill volumes are roughly Normal; with σ unknown and n small, use the t-distribution with n − 1 = 15 df.
- T (+2 marks): OV = 496; EV = 500; SE = s/√n = 6/√16 = 1.5; t = (496 − 500)/1.5 = −2.67.
- P (+1 mark): with 15 df, two-sided |t| = 2.67 gives p ≈ 0.018.
- C (+1 mark): 0.018 < 0.05, so reject H₀. There is evidence that the mean fill differs from 500 ml, and the sample points to under-filling (mean about 496 ml, 4 ml below target).
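The T and P lines above can be checked numerically. The following is a minimal sketch using only the Python standard library; because the standard library has no t-distribution, the helper names `t_pdf` and `t_two_sided_p` are illustrative, and the p-value is obtained by integrating the t density with Simpson's rule rather than by any official course method.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_two_sided_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-sided p-value: area outside [-|t|, |t|], via Simpson's rule (steps must be even)."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 1 - 2 * area_zero_to_t  # the density is symmetric about 0

# Summary statistics from the worked example
n, xbar, s, mu0 = 16, 496.0, 6.0, 500.0

se = s / math.sqrt(n)                      # standard error of the mean
t_stat = (xbar - mu0) / se                 # (OV - EV) / SE
p_value = t_two_sided_p(t_stat, df=n - 1)  # reference curve: t with 15 df

print(f"SE = {se:.2f}, t = {t_stat:.2f}, p = {p_value:.3f}")
# prints: SE = 1.50, t = -2.67, p = 0.018
```

In an exam the p-value comes from tables or software; the point of the sketch is only that the three ingredients (OV, EV, SE) plus the right curve reproduce the answer.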
Key terms
- HATPC: The course's exam scaffold for any test: Hypotheses, Assumptions, Test statistic, P-value, Conclusion. Graders reward each line, so writing all five, especially the assumptions and the in-context conclusion, banks marks that a bare answer misses.
- The (OV−EV)/SE engine: The single test statistic behind every method: the observed value minus its expected value under the null, divided by the standard error. The z, t and slope tests are this standardised distance with a different EV, SE and reference curve; chi-square uses a related form, summing (observed − expected)²/expected over the cells.
- P-value: The probability, computed assuming the null is true, of a test statistic at least as extreme as the one observed. Small p means the data are surprising under the null. It is not the probability the null is true, and not the probability the result happened by chance.
- Type I and Type II error: A Type I error is rejecting a true null (a false positive); its probability is α, the significance level. A Type II error is failing to reject a false null (a false negative); its probability is β, and power = 1 − β. For a fixed n and effect size, lowering α raises β.
- Reference curve: The distribution the test statistic is compared against to get a p-value: the standard Normal for z-tests, the t-distribution (with its degrees of freedom) for t-tests, and the chi-square distribution for chi-square tests. Choosing the right curve is part of choosing the right method.
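To see what "a Type I error has probability α" means in practice, here is a small simulation sketch. It assumes a two-sided z-test with known σ, and the numbers (null mean 500, σ = 6, n = 16, 2000 trials, the seed) are illustrative choices, not course data. Every sample is drawn with the null true, so each rejection is a false positive.

```python
import math
import random
from statistics import NormalDist

random.seed(1001)  # arbitrary seed so the run is repeatable

ALPHA = 0.05
MU0, SIGMA, N = 500.0, 6.0, 16  # null mean, known sigma, sample size (illustrative)
TRIALS = 2000

z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)  # about 1.96 for a two-sided test
se = SIGMA / math.sqrt(N)

rejections = 0
for _ in range(TRIALS):
    # The null is true by construction: data really do come from mean MU0.
    sample = [random.gauss(MU0, SIGMA) for _ in range(N)]
    z = (sum(sample) / N - MU0) / se  # (OV - EV) / SE
    if abs(z) > z_crit:
        rejections += 1  # a Type I error

type_one_rate = rejections / TRIALS
print(f"Rejected a true null in {type_one_rate:.1%} of trials (alpha = {ALPHA:.0%})")
```

The rejection rate should land near 5%, up to simulation noise. Setting `ALPHA` lower makes false positives rarer, at the cost of missing more real effects when the null is false.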
Hypothesis Testing FAQ
What is HATPC and why use it?
HATPC is the five-step scaffold for any hypothesis test: Hypotheses, Assumptions, Test statistic, P-value, Conclusion. DATA1001 grades it line by line, so writing every step — including the assumption check and a conclusion stated in context — earns marks that a one-line "reject H₀" leaves on the table. It also stops you skipping the assumptions, which is where wrong-method answers come from.
Why is every test "the same"?
Because every test is the standardised distance (OV−EV)/SE: the observed statistic minus what you'd expect under the null, in units of the standard error. The proportion z-test, the one-sample/paired/two-sample t-tests, the slope test and (in a related form) chi-square all compute that ratio — only the EV, the SE and the reference curve differ. Learn the engine once and you've learned the whole test zoo.
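One way to picture "same engine, different settings" is a single function fed different EVs and SEs. The sketch below is illustrative: `standardised_distance` is a made-up name, the t example reuses the bottle numbers from the worked example, and the proportion example (60 successes in 100 trials against p₀ = 0.5) is hypothetical.

```python
import math
from statistics import NormalDist

def standardised_distance(ov: float, ev: float, se: float) -> float:
    """The engine: observed value minus expected value, in standard-error units."""
    return (ov - ev) / se

# One-sample t: OV = sample mean, EV = mu0, SE = s / sqrt(n)
t_stat = standardised_distance(496, 500, 6 / math.sqrt(16))

# Proportion z (hypothetical data): OV = p-hat, EV = p0, SE = sqrt(p0 (1 - p0) / n)
n, successes, p0 = 100, 60, 0.5
z_stat = standardised_distance(successes / n, p0, math.sqrt(p0 * (1 - p0) / n))

# Reference curve for z is the standard Normal; two-sided tail area
z_p = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(f"t = {t_stat:.2f} (read against t with 15 df)")
print(f"z = {z_stat:.2f}, two-sided p = {z_p:.4f} (read against the standard Normal)")
```

Only the arguments and the curve change between the two calls; the arithmetic is identical.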
How do I read a p-value without the classic mistakes?
A p-value is the probability of data at least as extreme as yours if the null were true. A small p-value means the data are surprising under the null, so you have evidence against it. It is not the probability that the null is true, and it is not the probability your result was "due to chance". A large p-value is not proof the null is true; it just means the data are consistent with it. State the conclusion in context, with the size and direction of the effect.
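The definition can be made concrete by counting. In this sketch the data are hypothetical (15 heads in 20 coin tosses, null: the coin is fair), and the p-value is computed exactly by adding up the null probabilities of every outcome at least as far from the expected count as the observed one.

```python
from math import comb

n, observed, p0 = 20, 15, 0.5  # hypothetical: 15 heads in 20 tosses, H0: fair coin
expected = n * p0              # EV under the null: 10 heads
gap = abs(observed - expected)

def null_prob(k: int) -> float:
    """Probability of exactly k heads if the null is true (binomial)."""
    return comb(n, k) * p0**k * (1 - p0) ** (n - k)

# "At least as extreme" = at least as far from the expected count, in either direction
p_value = sum(null_prob(k) for k in range(n + 1) if abs(k - expected) >= gap)

print(f"p = {p_value:.3f}")
# prints: p = 0.041
```

Notice that the whole calculation lives inside the assumption that the null is true. That is why the result cannot be read as the probability that the null is true.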
When do I use z versus t, and what about chi-square?
Use z when the standard error is known under the null or the sample is large (a proportion test, where the null value fixes the SE, or a mean with known σ). Use t when σ is estimated from the sample, which matters most when n is small; a one-sample or paired test has n−1 degrees of freedom, and each t-test comes with its matching assumption checks. Use chi-square for counts in a table: goodness-of-fit (do observed counts match expected?) or independence (are two categorical variables associated?). The slope test is just a t-test on the regression slope.
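For counts, the statistic sums (observed − expected)²/expected over the categories. Below is a minimal goodness-of-fit sketch with hypothetical counts across three equally likely categories. It leans on a convenient special case, flagged in the comments: with exactly 2 degrees of freedom the chi-square upper tail has the closed form exp(−x/2), so no tables or extra libraries are needed. For other df you would use tables or statistical software.

```python
import math

observed = [30, 50, 40]  # hypothetical counts in three categories
total = sum(observed)
expected = [total / len(observed)] * len(observed)  # H0: all categories equally likely

# Chi-square statistic: squared distances, each scaled by its expected count
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1  # goodness-of-fit: categories minus 1
if df != 2:
    raise ValueError("the closed-form tail below is only valid for df = 2")

# Special case: chi-square with 2 df has upper-tail probability exp(-x / 2)
p_value = math.exp(-chi2 / 2)

print(f"chi-square = {chi2:.2f}, df = {df}, p = {p_value:.3f}")
# prints: chi-square = 5.00, df = 2, p = 0.082
```

Here p ≈ 0.082 is above 0.05, so these hypothetical counts are consistent with equal shares; that is not proof the shares are equal.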
Exam move
Make HATPC and the (OV−EV)/SE engine muscle memory: write all five letters every time, because the assumptions line and the in-context conclusion are where marks are banked. For each method, know three things — what goes in EV, what goes in SE, and which reference curve (z, t with its df, or chi-square) — and recognise that proportion, t, slope and chi-square tests are the same machine with different settings. Spend real effort on p-value literacy and Type I/II errors, since the final marks reward the interpretation, not the arithmetic: a small p-value is evidence against the null, never the probability the null is true, and every conclusion must name the effect in the context of the question.