Australian National University · S1 2026 · FACULTY OF SCIENCE

STAT7038 · Regression Modelling



50% final exam · hurdle · 14 Chapters · 5-page Bible


Chapter 2 of 7 · STAT7038

Regression Inference

Once the line is fitted, inference cross-examines it. The chapter opens with the ANOVA decomposition: total variation in y splits exactly into the part the line explains and the part it leaves, SST = SSR + SSE, with degrees of freedom (n−1) = 1 + (n−2). From that identity come the two headline summaries — the overall F-test F = MSR/MSE of 'is the line worth anything?' and R² = SSR/SST, the share of variation explained. You then test individual coefficients with a t-test, t = b_j/se(b_j) on n−2 df, and build confidence intervals the same way. The chapter's signature distinction is the confidence interval for the mean response vs the prediction interval for a new observation: both centre on ŷ_h, but the PI carries an extra '+1' under the root and is always wider. It closes by teaching you to read the R output — mapping each cell of summary() and anova() to a formula, because the exam hands you the printout rather than R itself.
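
The bookkeeping behind the ANOVA identity can be checked numerically. The snippet below is a minimal sketch in Python, used here as a plain calculator rather than the course's R workflow. It rebuilds the table from summary statistics quoted elsewhere in this chapter (b₁ = 3.833, S_xx = 42, residual standard error 0.8975, n = 8) and assumes the simple-regression identity SSR = b₁²·S_xx. The variable names are illustrative.

```python
# Rebuild the simple-regression ANOVA table from summary statistics.
# Inputs are the study-hours values quoted in this chapter; names are illustrative.
b1 = 3.833          # fitted slope
s_xx = 42.0         # sum of squared deviations of x about its mean
resid_se = 0.8975   # residual standard error (sigma-hat)
n = 8

df_r, df_e = 1, n - 2        # regression df and error df
df_t = df_r + df_e           # total df, equals n - 1

mse = resid_se ** 2          # MSE = sigma-hat squared
sse = df_e * mse             # SSE = (n - 2) * MSE
ssr = b1 ** 2 * s_xx         # SSR = b1^2 * S_xx (simple regression only)
sst = ssr + sse              # the ANOVA identity

msr = ssr / df_r
f_stat = msr / mse           # overall F on (1, n - 2) df
r_squared = ssr / sst        # share of variation explained

print(f"SSR={ssr:.2f}  SSE={sse:.2f}  SST={sst:.2f}")
print(f"F={f_stat:.1f} on {df_r} and {df_e} df,  R^2={r_squared:.4f}")
```

Because the slope is rounded to three decimals, F lands near 766 rather than matching a printout to the last digit.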

In this chapter

What this chapter covers

01 The ANOVA identity SST = SSR + SSE and the df bookkeeping
02 The overall F-test F = MSR/MSE
03 R² = SSR/SST and why R² = r²_xy in SLR (F = t²)
04 t-tests for a coefficient and the five-step ritual
05 Standard errors and confidence intervals for β₀, β₁
06 CI for the mean response vs PI for a new observation: the '+1'
07 Reading summary(lm) and anova(lm): recovering MSE and n

Worked example

Worked example: CI for the mean vs PI for a new observation

Q [6 marks]. Using the study-hours fit (ŷ = 44.417 + 3.833x, σ̂ = 0.897, n = 8, x̄ = 5.5, S_xx = 42, t₆(0.975) = 2.447), predict at x_h = 7 hours. (a) Give a 95% confidence interval for the mean score of students who study 7 hours. (b) Give a 95% prediction interval for one student who studies 7 hours. (c) Explain why they differ.

[+1] Point estimate (both). ŷ_h = 44.417 + 3.833(7) = 71.25, the same centre for the CI and the PI.
[+1] (a) CI standard error. se = σ̂√(1/n + (x_h−x̄)²/S_xx) = 0.897√(1/8 + (1.5)²/42) = 0.379.
[+1] (a) CI. 71.25 ± 2.447(0.379) = (70.32, 72.18), the interval for the mean response.
[+1] (b) PI standard error. se = σ̂√(1 + 1/n + (x_h−x̄)²/S_xx) = 0.897√(1 + 1/8 + (1.5)²/42) = 0.974.
[+1] (b) PI. 71.25 ± 2.447(0.974) = (68.87, 73.63), the interval for one new observation.
[+1] (c) Why wider. The PI adds the new point's own random error ε_new (the +1 under the root), so it is wider everywhere; both are narrowest at x_h = x̄ and flare out as you move away.

ŷ_h = 71.25 for both. CI for the mean = (70.32, 72.18); PI for a new observation = (68.87, 73.63). The PI is wider because it must also cover the new point's own error — the '+1' under the root.
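
The arithmetic in the worked example can be reproduced with a few lines of Python. This is a sketch using only the numbers given in the question, with the critical value t₆(0.975) = 2.447 typed in from the table rather than computed. The variable names are illustrative.

```python
import math

# Values given in the worked example
b0, b1 = 44.417, 3.833
sigma_hat = 0.897
n, x_bar, s_xx = 8, 5.5, 42.0
t_crit = 2.447      # t_6(0.975), read from the t table
x_h = 7.0

y_hat = b0 + b1 * x_h                              # shared centre
dist_term = 1 / n + (x_h - x_bar) ** 2 / s_xx      # 1/n + (x_h - x_bar)^2 / S_xx

se_mean = sigma_hat * math.sqrt(dist_term)         # CI for the mean response
se_pred = sigma_hat * math.sqrt(1 + dist_term)     # PI: note the extra +1

ci = (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean)
pi = (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred)

print(f"y_hat = {y_hat:.2f}")
print(f"95% CI (mean):    ({ci[0]:.2f}, {ci[1]:.2f})")   # (70.32, 72.18)
print(f"95% PI (new obs): ({pi[0]:.2f}, {pi[1]:.2f})")   # (68.87, 73.63)
```

Changing x_h to 5.5 shows both intervals at their narrowest, with the PI still the wider of the two.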

Glossary

Key terms

ANOVA identity: The exact decomposition SST = SSR + SSE: total variation in y around its mean equals the variation the line explains (SSR) plus the residual variation (SSE). The degrees of freedom add the same way, (n−1) = 1 + (n−2). Kutner writes SST as SSTO; R's anova() table prints no total row, so SST is the sum of the Sum Sq column.
F-test (overall): Tests H₀: β₁ = 0 with F = MSR/MSE, which follows an F_{1, n−2} distribution under H₀. A large F (MSR far exceeding MSE) is evidence of a real linear relationship. In simple regression F equals the square of the slope's t-statistic.
Coefficient of determination (R²): R² = SSR/SST = 1 − SSE/SST, the share of variation in y the line explains, between 0 and 1. In simple regression it equals the squared correlation r²_xy. A high R² is not a goodness-of-fit certificate — pair it with the diagnostic plots.
Prediction interval: An interval for one new observation y_new at x_h, centred at ŷ_h but with an extra σ² (the '+1' under the root) for the new point's own random error. It is always wider than the confidence interval for the mean response at the same x_h.
Standard error of a coefficient: se(b₁) = √(MSE/S_xx) and se(b₀) = √(MSE(1/n + x̄²/S_xx)). The standardised statistic (b_j − β_j)/se(b_j) follows t_{n−2}, which gives both the t-test and the confidence interval.
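
The standard-error formulas in the glossary can be tried on the study-hours fit. The snippet below is a sketch, assuming the chapter's quoted values (b₀ = 44.417, b₁ = 3.833, residual standard error 0.8975, n = 8, x̄ = 5.5, S_xx = 42, t₆(0.975) = 2.447). The variable names are illustrative.

```python
import math

# Study-hours summary statistics quoted in this chapter
b0, b1 = 44.417, 3.833
resid_se = 0.8975
n, x_bar, s_xx = 8, 5.5, 42.0
t_crit = 2.447                  # t_6(0.975) from the t table

mse = resid_se ** 2
se_b1 = math.sqrt(mse / s_xx)                          # se of the slope
se_b0 = math.sqrt(mse * (1 / n + x_bar ** 2 / s_xx))   # se of the intercept

t_b1 = b1 / se_b1               # statistic for H0: beta1 = 0
ci_b1 = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)     # 95% CI for beta1

print(f"se(b1) = {se_b1:.4f}, se(b0) = {se_b0:.4f}")
print(f"t = {t_b1:.2f}, 95% CI for beta1 = ({ci_b1[0]:.3f}, {ci_b1[1]:.3f})")
```

The same se(b₁) drives both the test and the interval, which is why a 95% interval that excludes 0 agrees with rejecting H₀ at the 5% level.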

FAQ

Regression Inference FAQ

When do I use a confidence interval and when a prediction interval?

Match the wording. 'The average score for students who study 7 hours' is a CI for the mean. 'Predict the score of one student who studies 7 hours' is a PI. Both are centred at ŷ_h, but the PI adds the new observation's own error (the +1 under the root) and is always wider. Picking the wrong interval, or dropping the +1, is the single most-marked simple-regression error.

Why does F = t² in simple regression?

The overall F-test of β₁ = 0 and the two-sided t-test of β₁ = 0 are the same test, so their statistics satisfy F = t² exactly and their p-values agree (here t = 27.68, and t² ≈ 766 matches F = 766.1 up to rounding in t). R² also equals the squared correlation r²_xy. These equivalences hold only in simple regression, where there is a single slope.
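
The link works because t² = b₁²·S_xx/MSE = SSR/MSE = F when the regression has 1 df. A quick numerical check, sketched in Python with the t-statistic quoted above and n = 8 from the study-hours example, also recovers R² through the rearrangement R² = F/(F + n − 2), which follows from R² = SSR/(SSR + SSE). The variable names are illustrative.

```python
# Check the simple-regression equivalences from the slope's t-statistic alone.
t_stat = 27.68      # slope t-statistic quoted in this FAQ
n = 8               # sample size in the study-hours example
df_e = n - 2

f_stat = t_stat ** 2                      # F = t^2 holds in simple regression
r_squared = f_stat / (f_stat + df_e)      # rearranged from R^2 = SSR / (SSR + SSE)
abs_r = r_squared ** 0.5                  # |r_xy|; its sign matches the slope

print(f"F = {f_stat:.1f}, R^2 = {r_squared:.4f}, |r| = {abs_r:.4f}")
```

With several predictors the overall F tests all slopes at once, so it no longer equals any single t².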

Should I use the t-distribution or the normal for the critical value?

Use t_{n−2}, not the normal. With small n the t has heavier tails and a larger critical value, giving wider, honest intervals; as n grows t approaches the normal. The df is n−2 because two parameters were estimated. In the exam you read the critical value off the supplied t table for the right df.
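
To see how much the choice matters, the sketch below compares the normal critical value with a few two-sided 95% t critical values. The t values are typed in from a standard t table (an assumption of this example, not computed), and the normal quantile comes from Python's standard library.

```python
from statistics import NormalDist

# Standard normal 97.5% quantile, about 1.960
z_crit = NormalDist().inv_cdf(0.975)

# Two-sided 95% critical values copied from a standard t table (df -> t)
t_table = {6: 2.447, 10: 2.228, 30: 2.042, 120: 1.980}

# How much wider the t-based interval is than a normal-based one
widths = {df: t_crit / z_crit for df, t_crit in t_table.items()}

for df, ratio in widths.items():
    print(f"df={df:>3}  t={t_table[df]:.3f}  z={z_crit:.3f}  t/z={ratio:.2f}")
```

At df = 6 (n = 8) the t-based interval is roughly a quarter wider than a normal-based one, and the gap shrinks steadily as df grows.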

How do I recover MSE and n from an R printout?

The 'Residual standard error: 0.8975 on 6 degrees of freedom' line gives both: MSE = (residual SE)² = 0.806, and df_E = n−2 = 6 so n = 8. The F line 'on 1 and 6 DF' confirms it. From n you can rebuild any standard error the printout hides, which is exactly the skill the calculation questions test.
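
The recovery step is mechanical enough to script. Here is a minimal sketch in Python that pulls the two numbers out of the quoted line and applies MSE = (residual SE)² and n = df_E + 2. The regular expression assumes the line is worded exactly as quoted above.

```python
import re

line = "Residual standard error: 0.8975 on 6 degrees of freedom"

# Capture the residual SE and the error df from the quoted printout line
match = re.search(
    r"Residual standard error:\s*([\d.]+) on (\d+) degrees of freedom", line
)
resid_se = float(match.group(1))
df_e = int(match.group(2))

mse = resid_se ** 2     # MSE = (residual SE)^2
n = df_e + 2            # simple regression: two estimated parameters

print(f"MSE = {mse:.3f}, n = {n}")   # MSE = 0.806, n = 8
```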

Study strategy

Exam move

Run every test as a five-step ritual (hypotheses, statistic, critical value with df, decision, conclusion in context), because the written parts award method marks for the lines, not just the answer. Keep three chains on your sheet: SST = SSR + SSE → F and R²; se(b_j) → t → decision → CI; and x_h → CI (mean) or PI (new obs), writing the '+1' that distinguishes them. Then practise reading the R printout cold: map Estimate → b_j, Std. Error → se, t value → b/se, Pr(>|t|) → two-sided p, and recover MSE = (residual SE)² and n from the degrees of freedom. For a one-sided alternative, use the one-sided critical value and halve R's two-sided p-value, provided the estimate lies on the side the alternative claims; if it lies on the other side, the one-sided p is 1 − p/2.
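
Halving the two-sided p-value is only right when the t value points the way the alternative claims. The helper below is a sketch of that rule in Python. The function name and the "greater"/"less" labels are illustrative, and the numbers in the usage lines are hypothetical, not taken from the study-hours fit.

```python
def one_sided_p(two_sided_p: float, t_value: float, alternative: str) -> float:
    """Convert a two-sided Pr(>|t|) into a one-sided p-value.

    alternative: "greater" for H1: beta > 0, "less" for H1: beta < 0.
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    # Does the sign of t agree with the direction of the alternative?
    agrees = t_value > 0 if alternative == "greater" else t_value < 0
    return two_sided_p / 2 if agrees else 1 - two_sided_p / 2


# Hypothetical printout values: t = 2.1 with two-sided p = 0.04
p_greater = one_sided_p(0.04, 2.1, "greater")   # half of 0.04
p_less = one_sided_p(0.04, 2.1, "less")         # 1 minus half of 0.04
print(p_greater, p_less)
```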
