University of Melbourne · S1 2026 · FACULTY OF SCIENCE

MAST90105 · Methods Of Mathematical Statistics

50% final exam · hurdle · 14 chapters · 3-page Bible

Chapter 9 of 10 · MAST90105

Hypothesis Testing and Neyman–Pearson

A hypothesis test pits a null H₀ against an alternative H_a and uses the data to decide between them, accepting two kinds of error: a Type I error (rejecting a true null, with probability α, the significance level) and a Type II error (failing to reject a false null, with probability β). The test’s power is 1 − β, its chance of detecting a real effect. The chapter’s theoretical centrepiece is the Neyman–Pearson lemma: for a simple-versus-simple test, the most powerful test at level α rejects when the likelihood ratio exceeds a threshold. This optimality result motivates likelihood-ratio tests and the standard statistics that follow. Those standard tests are then assembled from the sampling distributions built earlier: the one-sample t for a mean (σ unknown), the χ² test for a variance, and the large-sample z test for a proportion. Throughout, the chapter ties testing to intervals (a level-α two-sided test of H₀: θ = θ₀ rejects exactly when θ₀ falls outside the (1−α) CI) and keeps the p-value interpretation honest.

In this chapter

What this chapter covers

9.1 Null vs alternative; the logic of a test
9.2 Type I / Type II errors, α, β and power
9.3 The Neyman–Pearson lemma and the most powerful test
9.4 Likelihood-ratio tests
9.5 The standard tests: one-sample t, variance χ², proportion z
9.6 The test–CI duality and the p-value

Worked example

Worked example: a one-sample t-test for a mean

Q [6 marks]. A machine should fill bottles to 500 ml. A sample of n = 25 gives X̄ = 496 ml with S = 10 ml. Test H₀: μ = 500 against H_a: μ ≠ 500 at the 5% level. (t_0.025,24 = 2.064.)

+1 Hypotheses and test. H₀: μ = 500 vs H_a: μ ≠ 500 (two-sided). With σ unknown, use the one-sample t-statistic with n−1 = 24 df.
+1 Standard error. S/√n = 10/√25 = 10/5 = 2.
+1 Test statistic. t = (X̄ − μ₀)/(S/√n) = (496 − 500)/2 = −2.0.
+1 Critical value. Two-sided 5% with 24 df: reject if |t| > t_0.025,24 = 2.064 (provided table).
+1 Decide. |−2.0| = 2.0 < 2.064, so we do NOT reject H₀ at the 5% level.
+1 Interpret. The evidence that the mean fill differs from 500 ml is not quite significant at 5%: the observed 4 ml shortfall is within sampling noise for this n and S. (Equivalently, 500 lies just inside the 95% CI.)

t = −2.0, so |t| = 2.0 does not exceed the critical value 2.064, and H₀: μ = 500 is not rejected at the 5% level. The recipe: state the hypotheses, form the standardised statistic, compare to the table quantile, and translate the decision back into context.
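The recipe above can be sketched in a few lines of Python. This is a minimal sketch using only the standard library; the function name `one_sample_t` is an illustrative choice, and the critical value 2.064 is taken from the question's table rather than computed.

```python
import math

def one_sample_t(xbar, mu0, s, n, t_crit):
    """Two-sided one-sample t-test from summary statistics.

    t_crit is the table quantile t_{alpha/2, n-1}; it is supplied by the
    caller (here from the provided table), not computed.
    """
    se = s / math.sqrt(n)         # standard error S / sqrt(n)
    t = (xbar - mu0) / se         # standardised test statistic
    reject = abs(t) > t_crit      # two-sided decision rule
    return se, t, reject

# Numbers from the worked example: n = 25, X-bar = 496, S = 10, mu0 = 500.
se, t, reject = one_sample_t(xbar=496, mu0=500, s=10, n=25, t_crit=2.064)
print(f"SE = {se}, t = {t}, reject H0: {reject}")
# prints: SE = 2.0, t = -2.0, reject H0: False
```

The function returns the intermediate quantities as well as the decision, mirroring the marks awarded for each step.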

Glossary

Key terms

Type I and Type II errors: A Type I error rejects a true null (probability α, the significance level); a Type II error fails to reject a false null (probability β). Lowering α raises β for fixed n — the fundamental trade-off a test balances.
Power: 1 − β, the probability a test correctly rejects a false null, that is, its sensitivity to a real effect. Power rises with sample size, effect size and α, and falls with variability. For a simple null against a simple alternative, the Neyman–Pearson test maximises it at fixed α.
Neyman–Pearson lemma: For a simple null versus a simple alternative, the most powerful level-α test rejects when the likelihood ratio L(θ₁)/L(θ₀) exceeds a constant chosen to give size α. It is the optimality theorem behind likelihood-ratio tests and the usual statistics.
Likelihood-ratio test (LRT): A general test that rejects for small values of the ratio of the maximised likelihood under H₀ to the maximised likelihood overall. Under H₀ and standard regularity conditions, −2·ln of that ratio is approximately χ² in large samples, with degrees of freedom equal to the number of parameters H₀ restricts (Wilks). It extends Neyman–Pearson to composite hypotheses.
p-value: The probability, computed under H₀, of a test statistic at least as extreme as the one observed. Reject when it is below α. It is not the probability that H₀ is true — a routinely mis-stated definition.
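To make the p-value definition concrete, here is a minimal sketch for the large-sample z test for a proportion, using `statistics.NormalDist` from the Python standard library. The data (116 successes in 200 trials, null value p₀ = 0.5) are hypothetical numbers chosen for illustration, and `proportion_z_pvalue` is an illustrative name.

```python
import math
from statistics import NormalDist

def proportion_z_pvalue(successes, n, p0):
    """Two-sided large-sample z test of H0: p = p0."""
    p_hat = successes / n
    se0 = math.sqrt(p0 * (1 - p0) / n)   # standard error computed under H0
    z = (p_hat - p0) / se0
    # P(|Z| >= |z|) under H0: a statistic at least as extreme as observed.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data, for illustration only.
z, p_value = proportion_z_pvalue(successes=116, n=200, p0=0.5)
alpha = 0.05
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject at 5%: {p_value < alpha}")
```

The standard error uses p₀, not the sample proportion, because the p-value is computed under H₀.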

FAQ

Hypothesis Testing and Neyman–Pearson FAQ

What is the difference between the significance level and the power?

The significance level α is the probability of a Type I error — rejecting a true null — which you fix in advance, commonly at 5%. Power is 1 − β, the probability of correctly rejecting a false null. You control α by the cut-off; power then depends on the true effect size, the sample size and the variability. For a fixed n there is a trade-off: shrinking α reduces power.
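The trade-off can be made concrete with a minimal sketch. It assumes a simpler setting than the t-test, a one-sided z test of H₀: μ = μ₀ against H_a: μ > μ₀ with σ known, so that power has the closed form 1 − Φ(z_α − δ√n/σ), where δ = μ₁ − μ₀ is the true shift. The numbers (δ = 4, σ = 10, n = 25) and the name `z_test_power` are illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1

def z_test_power(delta, sigma, n, alpha):
    """Power of the one-sided z test (known sigma) against a shift delta."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)   # cut-off on the z scale
    shift = delta * math.sqrt(n) / sigma      # mean of Z when the shift is real
    return 1 - STD_NORMAL.cdf(z_alpha - shift)

# Shrinking alpha at fixed n lowers power; raising n restores it.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}, n = 25: power = {z_test_power(4, 10, 25, alpha):.3f}")
print(f"alpha = 0.01, n = 100: power = {z_test_power(4, 10, 100, 0.01):.3f}")
```

With δ = 0 the function returns α itself, which is a quick sanity check that the cut-off gives the intended size.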

What does the Neyman–Pearson lemma give me?

It identifies the best possible test for a simple null against a simple alternative: among all tests with significance level α, the one that rejects when the likelihood ratio exceeds a threshold has the highest power. The lemma itself covers only the simple-versus-simple case, but it motivates the standard t, χ² and z tests, which are (or approximate) likelihood-ratio tests, and it is why ‘reject when the likelihood ratio is large’ is the organising principle of the chapter.
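As an illustration of deriving a rejection region from the likelihood ratio, consider a standard textbook case (an assumed example, not taken from the course notes): X₁, …, Xₙ i.i.d. N(μ, σ²) with σ² known, testing H₀: μ = μ₀ against H_a: μ = μ₁ with μ₁ > μ₀. Here z_α denotes the upper-α quantile of the standard normal.

```latex
\begin{align*}
\frac{L(\mu_1)}{L(\mu_0)}
  &= \exp\!\left\{ -\frac{1}{2\sigma^2} \sum_{i=1}^{n}
       \left[ (x_i-\mu_1)^2 - (x_i-\mu_0)^2 \right] \right\} \\
  &= \exp\!\left\{ \frac{n(\mu_1-\mu_0)}{\sigma^2}\,\bar{x}
       - \frac{n(\mu_1^2-\mu_0^2)}{2\sigma^2} \right\}.
\end{align*}
% Since \mu_1 > \mu_0, the ratio is increasing in \bar{x}, so
% "ratio > k" is equivalent to "\bar{x} > c" for some constant c.
% Choosing c to give size \alpha under H_0, where \bar{X} \sim N(\mu_0, \sigma^2/n):
\[
  \text{reject } H_0 \iff \bar{x} > \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}} .
\]
```

The cut-off does not depend on μ₁, so the same region is most powerful against every alternative μ₁ > μ₀.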

How are confidence intervals and tests related?

They are two views of the same inference. A two-sided level-α test of H₀: θ = θ₀ rejects exactly when θ₀ falls outside the (1−α) confidence interval for θ. So you can read a test off an interval and vice versa, which is a fast way to check your work — if the null value sits inside the 95% CI, a 5% two-sided test does not reject.
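The duality can be checked numerically. This is a minimal sketch reusing the summary statistics from the worked example (X̄ = 496, S = 10, n = 25, table value 2.064); the candidate null values and the helper names are illustrative.

```python
import math

def t_confidence_interval(xbar, s, n, t_crit):
    """(1 - alpha) CI for the mean: xbar +/- t_crit * S / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

def t_test_rejects(xbar, mu0, s, n, t_crit):
    """Two-sided level-alpha t-test decision for H0: mu = mu0."""
    return abs((xbar - mu0) / (s / math.sqrt(n))) > t_crit

xbar, s, n, t_crit = 496, 10, 25, 2.064
lo, hi = t_confidence_interval(xbar, s, n, t_crit)
print(f"95% CI: ({lo:.3f}, {hi:.3f})")

# For each candidate null value, the test rejects exactly when the
# value lies outside the interval.
for mu0 in (490, 492, 496, 500, 501):
    outside = not (lo <= mu0 <= hi)
    assert t_test_rejects(xbar, mu0, s, n, t_crit) == outside
    print(mu0, "rejected" if outside else "not rejected")
```

The null value 500 sits just inside the interval, matching the worked example's decision not to reject.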

Study strategy

Exam move

Fix the four-line test template in memory — hypotheses, standardised statistic, table critical value, decision plus interpretation — and run it for the three workhorse tests (mean t, variance χ², proportion z) until the right statistic jumps out from the cue. Understand the Neyman–Pearson lemma well enough to derive a simple-versus-simple rejection region from the likelihood ratio, since that is the chapter’s theory mark. Keep α, β and power straight and be precise about the p-value (it is computed under H₀, and is not P(H₀ true)). Use the test–CI duality to cross-check decisions, and remember the provided table supplies the quantiles but never the setup.
