Australian National University · S1 2026 · FACULTY OF SCIENCE

STAT7038 · Regression Modelling

- one subject, every graph, every model, every mark

70% final exam · hurdle · 7 Chapters · 43-page Bible

Our own words - no uploaded lecturer files

Built to mirror S1 2026 · updated this semester

The Complete Exam Bible · S1 2026

Regression Modelling

— one line through a cloud of points — and everything you can infer from it

Regression Modelling teaches the classical linear model end to end — from fitting one straight line through a cloud of points to building, testing and choosing a multiple-regression model with several predictors. You will estimate coefficients by least squares, split the variation with ANOVA, test slopes with t and the model with F, separate a confidence interval for the mean from a prediction interval for a new observation, check the assumptions with residual diagnostics, read everything off an R printout, and finish with model selection. The final exam is 70% of your grade; it is open to one A4 double-sided typed sheet and supplies the calculator, the R outputs and the statistical tables — so it tests whether you can execute and interpret the method on fresh numbers, not whether you can recall it.

STAT7038 · Australian National University

Contents · the whole subject, one map

What STAT7038 covers

Seven topics, from one straight line to many predictors → one exam-ready map. Each links to its free chapter guide.

01 Simple Linear Regression: The model · LINE assumptions · least squares · b₁ = Sxy/Sxx · σ² = MSE
02 Regression Inference: ANOVA · the F-test & R² · t-tests & CIs · CI vs PI · reading R output
03 Regression Diagnostics: Residuals vs fitted · Q–Q · outlier ≠ leverage ≠ influence · transformations
04 Matrix Form & Multiple Regression: y = Xβ · (X'X)⁻¹X'y · the hat matrix · partial coefficients · adjusted R²
05 Multiple Regression Inference: Sequential vs partial SS · the nested F-test · dummies & interactions
06 Advanced Diagnostics: Studentised & PRESS residuals · Cook's D, DFFITS, DFBETAS · VIF & κ
07 Model Selection: Adjusted R² · Mallows' Cp · AIC · BIC · PRESS · stepwise · parsimony

Assessment

How STAT7038 is assessed

Component	Weight	Format
Final examination	70%	MCQ + short calculation + short written · covers Weeks 1–12 · open to one A4 double-sided typed/printed sheet; calculator, R outputs & statistical tables supplied
Assignment	15%	R-based, non-redeemable — submitted across the semester
In-tutorial quiz	10%	Redeemable (the exam mark replaces it if higher) — on simple linear regression
Online quiz	5%	Redeemable, on Canvas — confirm the exact split & dates in your subject guide
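As a rough illustration of how the redeemable rule in the table plays out, here is a minimal Python sketch. It assumes the weights shown above (70 / 15 / 10 / 5) and reads "redeemable" as "the exam percentage replaces the quiz percentage when it is higher". The function name `final_mark` and the sample marks are hypothetical, and your subject guide remains the authority on the exact rule.

```python
def final_mark(exam, assignment, tutorial_quiz, online_quiz):
    """Combine component percentages (0-100) into a final mark out of 100.

    Assumed weights from the table: exam 70, assignment 15,
    in-tutorial quiz 10, online quiz 5.
    """
    # Redeemable components: use the exam percentage if it is higher.
    tutorial_used = max(tutorial_quiz, exam)
    online_used = max(online_quiz, exam)
    # The assignment is non-redeemable, so it always counts as scored.
    return (0.70 * exam + 0.15 * assignment
            + 0.10 * tutorial_used + 0.05 * online_used)


# Hypothetical student: weak tutorial quiz, strong online quiz.
example = final_mark(exam=80, assignment=60, tutorial_quiz=50, online_quiz=90)
print(round(example, 2))
```

In this sketch the tutorial quiz is lifted to the exam percentage while the online quiz keeps its own higher score, which is why a redeemable component can never pull the total below what the exam alone would give for that slice.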

Worked example · free

The slope t-test & 95% CI — the signature SLR question, mark by mark

Q [6 marks]. A fitted regression of exam score on study hours (n = 8) gives slope b₁ = 3.833, MSE = 0.806 and S_xx = 42. Test whether the slope differs from zero at the 5% level, and give a 95% confidence interval for it.

+1 Standard error of the slope. se(b₁) = √(MSE / S_xx) = √(0.806 / 42) = 0.13853 ≈ 0.139.
+1 Hypotheses. H₀: β₁ = 0 (no linear relationship) vs H_a: β₁ ≠ 0.
+1 Test statistic. t = b₁ / se(b₁) = 3.833 / 0.13853 = 27.67, compared to t with n − 2 = 6 df (t₆).
+1 Critical value & decision. t₆(0.975) = 2.447 (supplied table); |27.67| > 2.447 ⇒ reject H₀.
+1 95% CI for β₁. 3.833 ± 2.447 × 0.13853 = 3.833 ± 0.339 = (3.49, 4.17).
+1 Conclude in context. The slope is significantly non-zero; we are 95% confident each extra study hour adds between 3.49 and 4.17 marks.

t = 27.67 on 6 df exceeds the critical 2.447, so reject H₀: the slope is significantly non-zero. A 95% CI for β₁ is 3.833 ± 2.447(0.13853) = (3.49, 4.17) marks per study hour.

Sia tip — Every test earns its marks for showing all five steps — hypotheses, statistic, critical value (with df), decision, conclusion in context. The df is n − 2 in simple regression; in the exam you read se off the R summary() and the critical value off the supplied t table.
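The arithmetic in the worked example can be checked with a few lines of Python. This is only a sketch for self-checking, not how the exam expects you to work: the inputs are the numbers given in the question, the critical value 2.447 is typed in from the t table rather than computed, and the variable names are our own.

```python
import math

# Quantities given in the question
n, b1, mse, s_xx = 8, 3.833, 0.806, 42.0
t_crit = 2.447  # t_6(0.975), read from a t table (not computed here)

df = n - 2                       # error degrees of freedom in simple regression
se_b1 = math.sqrt(mse / s_xx)    # standard error of the slope
t_stat = b1 / se_b1              # test statistic for H0: beta1 = 0

half_width = t_crit * se_b1      # margin of the 95% CI
ci = (b1 - half_width, b1 + half_width)
reject = abs(t_stat) > t_crit    # two-sided decision at the 5% level

print(f"df = {df}, se(b1) = {se_b1:.4f}, t = {t_stat:.2f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f}), reject H0: {reject}")
```

With the unrounded standard error the statistic comes out near 27.67; dividing by the rounded 0.139 instead gives about 27.58. The decision and the interval to two decimals are the same either way, but it is worth stating which value you carried.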

Glossary

Key terms

Least-squares estimator: The coefficients that minimise the sum of squared vertical distances from the points to the line. In simple regression b₁ = S_xy/S_xx and b₀ = ȳ − b₁x̄; under the LINE assumptions they are unbiased and, by Gauss–Markov, the best linear unbiased estimators (BLUE).
ANOVA decomposition: The exact split of total variation in y into the part the line explains and the part it leaves: SST = SSR + SSE, with degrees of freedom (n−1) = (p−1) + (n−p), where p counts the estimated coefficients including the intercept. It drives the F-test and R².
Confidence interval vs prediction interval: Both are centred at the fitted value ŷ_h, but a CI brackets the mean response at x_h while a PI brackets one new observation. The PI carries an extra ‘+1’ under the root (the new point’s own error), so it is always wider.
Leverage (hat value): h_ii measures how far an observation’s x-values sit from the centre of the predictor space; it is the i-th diagonal of the hat matrix H = X(X'X)⁻¹X'. High leverage (h_ii > 2p/n) is only dangerous when paired with a large residual.
Multicollinearity: Near-linear dependence among the predictors, which inflates the variances of the coefficients. Its tell-tale sign is a highly significant overall F with no individual t significant; diagnosed by the variance inflation factor, VIF = 1/(1−R²_j), flagged above 5 (serious above 10).
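Several of the glossary formulas can be tried on a small data set. The sketch below uses toy data invented purely for illustration, computes the least-squares coefficients, then the leverages using the simple-regression form h_ii = 1/n + (x_i − x̄)²/S_xx, and wraps the VIF formula in a helper. The names `leverage`, `cutoff` and `vif` are our own.

```python
# Toy data, invented for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [52, 57, 60, 63, 68, 71, 76, 79]

n = len(x)
p = 2  # coefficients in simple regression: intercept and slope
x_bar = sum(x) / n
y_bar = sum(y) / n

s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = s_xy / s_xx           # slope
b0 = y_bar - b1 * x_bar    # intercept: the line passes through (x_bar, y_bar)

# Leverage of each point; in simple regression the hat-matrix diagonal
# reduces to this closed form.
leverage = [1 / n + (xi - x_bar) ** 2 / s_xx for xi in x]
cutoff = 2 * p / n         # rule-of-thumb threshold from the glossary
flagged = [xi for xi, h in zip(x, leverage) if h > cutoff]


def vif(r2_j):
    """Variance inflation factor from the R^2 of predictor j on the others."""
    return 1.0 / (1.0 - r2_j)


print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print("leverages:", [round(h, 3) for h in leverage])
print("flagged x-values:", flagged)
print("VIF at R2_j = 0.9:", round(vif(0.9), 2))
```

A useful sanity check is that the leverages always sum to p. With evenly spaced x-values, as here, the end points carry the most leverage, yet none of them crosses the 2p/n threshold.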

FAQ

STAT7038 FAQ

Is STAT7038 hard?

It is method-dense rather than memory-heavy. Because the exam supplies the R outputs and the statistical tables, the difficulty lies in executing the procedure under time pressure: picking the right interval, the right critical value and the right test, then reading and interpreting the output correctly. Master simple linear regression and the multiple-regression half is largely the same results in matrix clothing.

How is STAT7038 assessed?

The final exam is 70% of your grade; the rest is an R-based assignment (about 15%, non-redeemable), an in-tutorial quiz (about 10%, redeemable) and an online quiz (about 5%, redeemable). Confirm this year’s exact split and dates in your subject guide, as details shift between cohorts.

What is allowed in the STAT7038 exam?

The final is open to one A4 double-sided, typed or printed notes sheet, and the paper itself supplies the calculator, the R outputs and the statistical tables (t, F, normal). So you do not waste sheet space on table values or R syntax — spend it on the boxed formulas, the five-step test ritual, the CI-vs-PI rule and the R-output map. Confirm the permitted-materials rule for your sitting in the subject guide.

Do I need to run R in the STAT7038 exam?

No — you read supplied summary() and anova() printouts rather than running R yourself. The skill being tested is mapping each cell (Estimate, Std. Error, t value, Pr(>|t|), the F-statistic) to a formula and recovering hidden quantities such as MSE and n.
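To make "recovering hidden quantities" concrete, here is a small Python sketch. It starts from values you might read off a summary() printout and backs out MSE, n and SSE. The printout values are hypothetical, chosen to sit close to the worked example on this page, and the variable names are our own.

```python
# Values as they might appear on a summary() printout (hypothetical):
#   Residual standard error: 0.898 on 6 degrees of freedom
residual_std_error = 0.898
residual_df = 6
n_coefficients = 2      # rows in the Coefficients table: intercept + slope
estimate = 3.833        # "Estimate" cell for the slope
std_error = 0.139       # "Std. Error" cell for the slope

mse = residual_std_error ** 2     # MSE is the squared residual standard error
n = residual_df + n_coefficients  # error df = n - p, so n = df + p
sse = mse * residual_df           # SSE = MSE x error df
t_value = estimate / std_error    # reproduces the "t value" cell

# In simple regression (one slope) the overall F equals the slope's t squared.
f_stat = t_value ** 2

print(f"MSE = {mse:.3f}, n = {n}, SSE = {sse:.3f}")
print(f"t = {t_value:.2f}, F = {f_stat:.1f}")
```

Because the printout rounds its cells, quantities recovered this way agree with a hand calculation only to about the precision shown.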

Is using AskSia for STAT7038 cheating?

No. AskSia is a study reference written in our own words — we host none of your lecturer’s files, and Sia teaches you the method to earn the marks; it does not complete or sit your assessments.

Study strategy

How to study for the exam

Build your one A4 sheet around the recurring chains, because every exam item is a procedure on supplied numbers. Drill four of them until they are automatic: S_xy/S_xx → b₁, b₀; SSE → MSE → se(b_j) → t → decision; SST = SSR + SSE → F, R²; and x_h → CI (mean) or PI (new obs). Show every line on the written parts — method marks are real. Keep the high-yield distinctions sharp: CI vs PI (the ‘+1’), outlier ≠ leverage ≠ influence, and sequential vs partial sums of squares. Practise reading an R printout cold, and remember the multicollinearity giveaway: a significant overall F with no significant individual t.
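Two of these chains, SST = SSR + SSE → F, R² and x_h → CI or PI, can be rehearsed numerically. The sketch below reuses the summary numbers from the worked example (n = 8, b₁ = 3.833, MSE = 0.806, S_xx = 42). The values x̄ = 4.5 and x_h = 6 are assumptions added for illustration, and SSR is obtained from the simple-regression identity SSR = b₁²·S_xx.

```python
import math

# From the worked example
n, b1, mse, s_xx = 8, 3.833, 0.806, 42.0
t_crit = 2.447   # t_6(0.975) from a t table

# Assumed for illustration (not given in the worked example)
x_bar = 4.5
x_h = 6.0

# Chain: SST = SSR + SSE -> F, R^2
ssr = b1 ** 2 * s_xx        # regression sum of squares (simple regression)
sse = mse * (n - 2)         # error sum of squares
sst = ssr + sse
f_stat = (ssr / 1) / mse    # MSR / MSE, with 1 regression df
r2 = ssr / sst
t_stat = b1 / math.sqrt(mse / s_xx)

# Chain: x_h -> CI (mean response) or PI (one new observation)
core = 1 / n + (x_h - x_bar) ** 2 / s_xx
ci_half = t_crit * math.sqrt(mse * core)
pi_half = t_crit * math.sqrt(mse * (1 + core))   # the extra '+1'

print(f"F = {f_stat:.1f}, t^2 = {t_stat ** 2:.1f}, R^2 = {r2:.4f}")
print(f"CI half-width = {ci_half:.3f}, PI half-width = {pi_half:.3f}")
```

Two checks fall out of this: F equals t² in simple regression, and the PI half-width exceeds the CI half-width at every x_h because of the '+1' under the root.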
