University of Melbourne · S1 2026 · FACULTY OF SCIENCE

MAST90139 · Statistical Modelling For Data Science

50% final exam · hurdle · 14 Chapters · 3-page Bible

Our own words - no uploaded lecturer files

Built to mirror S1 2026 · updated this semester

Chapter 1 of 8 · MAST90139

Linear Models

The normal linear model is where every generalised linear model starts — and the case they all generalise. You write the response in matrix form y = Xβ + ε, estimate β by ordinary least squares (which here coincides with maximum likelihood), and read the t and F tests straight off the R output. MAST90139 reviews it not for its own sake but as the springboard: it fixes the vocabulary (design matrix, fitted values, residuals, the hat matrix) and the four LINE assumptions — Linearity, Independence, Normality, Equal variance — precisely so you can see where they break and why binary, count and categorical responses force a GLM. Master the matrix algebra and the diagnostics here, and every later chapter is the same machinery with a new distribution and link.

In this chapter

What this chapter covers

01 · 1.1 The normal linear model in matrix form (y = Xβ + ε)
02 · The design matrix, fitted values and residuals
03 · 1.2 OLS estimation, and why OLS = MLE under normality
04 · Sampling distribution of β̂ · the t and F tests
05 · 1.3 The four LINE assumptions
06 · Residual diagnostics, and where LINE breaks
07 · Why binary / count data force a GLM

Worked example · free

Worked example: reading a linear-model summary() and testing a coefficient

Q [5 marks]. A linear model y = β₀ + β₁x + ε is fitted to n = 40 observations. R reports β̂₁ = 2.50 with standard error 0.50, residual standard error s on 38 df, and you are told x is the only predictor. (a) Test H₀: β₁ = 0 with a t-test. (b) Give a 95% confidence interval for β₁. (c) Say in one sentence what makes this an ordinary linear model rather than a GLM.

+1 (a) t-statistic: t = β̂₁ / se(β̂₁) = 2.50 / 0.50 = 5.0 on 38 df.
+1 (a) Conclude: |t| = 5.0 ≫ t_{38, 0.025} ≈ 2.02, so reject H₀; x is a significant predictor (p < 0.001).
+1 (b) 95% CI: β̂₁ ± t_{38, 0.025} × se(β̂₁) = 2.50 ± 2.02 × 0.50 = 2.50 ± 1.01 = (1.49, 3.51).
+1 (b) Read it: the interval excludes 0, consistent with the significant t-test.
+1 (c) Why ordinary: the response is continuous with constant-variance normal errors and the mean is modelled directly (identity link), so OLS = MLE; a GLM is needed only when the response is binary/count/categorical and the variance depends on the mean.

t = 5.0 on 38 df rejects H₀ (x is significant); the 95% CI is (1.49, 3.51), which excludes 0; and it is an ordinary linear model because the response is normal with constant variance modelled through the identity link — the one case where OLS and maximum likelihood agree.
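
The arithmetic in the worked example can be checked in a few lines. This is a minimal Python sketch rather than the R workflow the subject uses; the variable names are illustrative, and the critical value 2.02 is taken from the worked example (t tables), not computed, because the Python standard library has no t quantile function.

```python
# Numbers quoted in the worked example.
beta1_hat = 2.50   # estimated slope
se_beta1 = 0.50    # its standard error
n, p = 40, 2       # observations; coefficients (intercept + slope)
df = n - p         # residual degrees of freedom

# Assumption: critical value t_{38, 0.025} read from tables, as in the example.
t_crit = 2.02

# (a) t-statistic for H0: beta1 = 0.
t_stat = beta1_hat / se_beta1
reject_h0 = abs(t_stat) > t_crit

# (b) 95% confidence interval: estimate +/- critical value * standard error.
margin = t_crit * se_beta1
ci_low = beta1_hat - margin
ci_high = beta1_hat + margin

print(f"t = {t_stat:.2f} on {df} df")
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
print("reject H0" if reject_h0 else "do not reject H0")
```

The test and the interval use the same critical value, which is why a 95% interval that excludes 0 always agrees with a two-sided t-test at the 5% level.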

Sia tip — In the linear model, the t-test on a coefficient and the F-test on the whole model both come straight off summary(); the same logic reappears in every GLM, but with z / Wald and the deviance replacing t and the residual sum of squares.

Glossary

Key terms

Design matrix (X): The n × p matrix of predictor values (with a leading column of 1s for the intercept) that turns the model into the compact form y = Xβ + ε. Its columns are the covariates; β is the vector of coefficients.
Ordinary least squares (OLS): The estimator β̂ = (XᵀX)⁻¹Xᵀy that minimises the residual sum of squares. Under normal, constant-variance errors it equals the maximum-likelihood estimator; this is the property that makes the linear model the easy special case of a GLM.
Hat matrix: H = X(XᵀX)⁻¹Xᵀ, the projection that maps y onto the fitted values ŷ = Hy. Its diagonal entries are the leverages, which measure how much each observation can pull its own fitted value.
LINE assumptions: The four conditions of the linear model: Linearity of the mean, Independence of errors, Normality of errors, and Equal (constant) variance. Diagnostics check each; their failure for binary or count data is what motivates the GLM.
Identity link: The trivial link g(μ) = μ that models the mean directly. The normal linear model is the GLM with a normal random component and the identity link — which is why no transformation of the mean is needed.
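
To make the design matrix, the OLS formula and the hat matrix concrete, here is a small standard-library Python sketch for one predictor plus an intercept. The data are invented for illustration, and the 2 × 2 inverse is written out by hand only because p = 2; in practice you would let R's lm() do this.

```python
# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# Design matrix X: a leading column of 1s (intercept) and the covariate.
X = [[1.0, xi] for xi in x]

# X^T X (2 x 2) and X^T y (length 2), written out for the two-column case.
sx = sum(x)
sxx = sum(xi * xi for xi in x)
xtx = [[float(n), sx], [sx, sxx]]
xty = [sum(y), sum(xi * yi for xi, yi in zip(x, y))]

# Inverse of a 2 x 2 matrix via the adjugate formula.
det = xtx[0][0] * xtx[1][1] - xtx[0][1] * xtx[1][0]
xtx_inv = [[xtx[1][1] / det, -xtx[0][1] / det],
           [-xtx[1][0] / det, xtx[0][0] / det]]

# OLS estimate: beta_hat = (X^T X)^{-1} X^T y.
beta_hat = [sum(xtx_inv[i][j] * xty[j] for j in range(2)) for i in range(2)]


def leverage(row):
    """Leverage h_ii = x_i^T (X^T X)^{-1} x_i, a diagonal entry of H."""
    return sum(row[i] * xtx_inv[i][j] * row[j]
               for i in range(2) for j in range(2))


leverages = [leverage(row) for row in X]
fitted = [beta_hat[0] + beta_hat[1] * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

print("beta_hat:", [round(b, 4) for b in beta_hat])
print("leverages:", [round(h, 4) for h in leverages])
```

Two checks are worth remembering: the leverages sum to p (the trace of H), and with an intercept in the model the residuals sum to zero.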

FAQ

Linear Models FAQ

Why does MAST90139 start with the linear model if the course is about GLMs?

Because every GLM is the linear model with two changes: a non-normal random component and a link function on the mean. Reviewing the linear model fixes the matrix vocabulary (design matrix, hat matrix, residuals) and the LINE assumptions, so that when those assumptions fail for binary or count data you can see exactly which piece the GLM replaces and why.

Is OLS really the same as maximum likelihood?

Yes, but only under the linear model's assumptions. When errors are normal with constant variance, minimising the residual sum of squares is identical to maximising the likelihood in β, so OLS = MLE. This coincidence is special to the normal case: in a general GLM the maximum-likelihood estimate has no closed form, so it is computed by iteratively re-weighted least squares (IRLS) instead.
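
The equivalence can be read straight off the normal log-likelihood. The derivation below is a standard sketch, assuming y ~ N(Xβ, σ²I) and that XᵀX is invertible (X has full column rank); RSS(β) is shorthand for the residual sum of squares.

```latex
\begin{aligned}
\ell(\beta, \sigma^2)
  &= -\frac{n}{2}\log(2\pi\sigma^2)
     - \frac{1}{2\sigma^2}\,(y - X\beta)^{\mathsf T}(y - X\beta) \\
  &= -\frac{n}{2}\log(2\pi\sigma^2)
     - \frac{1}{2\sigma^2}\,\mathrm{RSS}(\beta), \\[4pt]
\frac{\partial \ell}{\partial \beta}
  &= \frac{1}{\sigma^2}\,X^{\mathsf T}(y - X\beta) = 0
  \;\Longrightarrow\; X^{\mathsf T}X\,\hat\beta = X^{\mathsf T}y
  \;\Longrightarrow\; \hat\beta = (X^{\mathsf T}X)^{-1}X^{\mathsf T}y .
\end{aligned}
```

For any fixed σ², β enters the log-likelihood only through −RSS(β)/(2σ²), so maximising the likelihood and minimising RSS pick the same β̂. The agreement is for β only: the maximum-likelihood estimate of σ² is RSS/n, whereas the usual unbiased estimate behind the residual standard error divides by n − p.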

What are the LINE assumptions and which one breaks first?

Linearity, Independence, Normality, Equal variance. For binary and count responses the last two fail immediately: a Bernoulli or Poisson variance depends on the mean (not constant), and the response is not normal. That is precisely the failure a GLM repairs — it lets the variance follow the mean and models a function of the mean instead of the mean itself.
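
A quick way to see the Equal-variance failure is to tabulate the variance against the mean. The Python sketch below uses the textbook variance formulas for the Bernoulli and Poisson distributions; the function names are illustrative, not from the course materials.

```python
def bernoulli_variance(p):
    """Var(Y) = p(1 - p) for Y ~ Bernoulli(p); the mean is p."""
    return p * (1.0 - p)


def poisson_variance(mu):
    """Var(Y) = mu for Y ~ Poisson(mu); the mean is mu."""
    return mu


# The variance moves with the mean, so no single constant sigma^2 can fit.
for p in (0.1, 0.5, 0.9):
    print(f"Bernoulli mean {p:.1f} -> variance {bernoulli_variance(p):.2f}")

for mu in (1.0, 4.0, 16.0):
    print(f"Poisson mean {mu:.1f} -> variance {poisson_variance(mu):.1f}")
```

Because each variance is a function of the mean, a model that assumes one common σ² for every observation cannot be right for these responses.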

Do I have to invert matrices by hand in the exam?

No. You read coefficients, standard errors and the t / F tests off the R summary(). The matrix form matters conceptually, because it is the notation every later chapter reuses, but the arithmetic the exam asks for is interpreting output, not inverting XᵀX.

Study strategy

Exam move

Treat this chapter as vocabulary and diagnostics, not as new statistics. Learn the matrix form y = Xβ + ε cold, because the design matrix, fitted values, residuals and hat matrix reappear in every GLM. Be able to read a summary(): the t-test on each coefficient, the F-test on the model, and what the residual standard error is. Above all, internalise the four LINE assumptions and the residual plots that check them — the whole rest of the course is the story of what to do when Normality and Equal-variance fail, so knowing exactly where the linear model breaks is what lets you choose the right GLM later.
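
As a rough illustration of where the numbers in a summary() come from, the Python sketch below computes the residual standard error, the slope's standard error, its t-statistic and the model F-statistic for a one-predictor fit. The data are invented, and the closed-form slope and intercept formulas apply only to simple linear regression.

```python
import math

# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n, p = len(x), 2  # p counts the intercept and the slope

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# OLS estimates for simple linear regression.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual SS
tss = sum((yi - y_bar) ** 2 for yi in y)                # total SS

# Residual standard error on n - p degrees of freedom.
s = math.sqrt(rss / (n - p))

# Standard error and t-statistic for the slope.
se_b1 = s / math.sqrt(sxx)
t_stat = b1 / se_b1

# Overall F-statistic: regression mean square over residual mean square.
f_stat = ((tss - rss) / (p - 1)) / (rss / (n - p))

print(f"residual standard error = {s:.4f} on {n - p} df")
print(f"t = {t_stat:.2f}, F = {f_stat:.2f}")
```

With a single predictor the F-statistic equals the square of the slope's t-statistic, so the two tests give the same verdict; they only diverge once the model has more than one predictor.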
