University of Melbourne · S1 2026 · FACULTY OF SCIENCE

MAST90139 · Statistical Modelling For Data Science

- one subject, every graph, every model, every mark

50% final exam · hurdle · 14 Chapters · 5-page Bible

Our own words - no uploaded lecturer files

Built to mirror S1 2026 · updated this semester

Chapter 2 of 8 · MAST90139

Generalised Linear Models

This is the spine of the whole course. A generalised linear model has three parts: a random component (the response comes from an exponential-family distribution), a systematic component (a linear predictor η = Xβ), and a link function g that joins them, g(μ) = η. Choose a distribution and a link and you have a named model: normal + identity is ordinary regression, binomial + logit is logistic regression, Poisson + log is Poisson regression. The chapter builds the machinery that runs all of them — the exponential family and its mean–variance law Var(Y) = φV(μ), the canonical links, fitting by iteratively re-weighted least squares (IRLS), and the deviance as the GLM's answer to the residual sum of squares. Learn this one engine and every later model is the same template with a new distribution and link.
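The mean–variance law can be checked numerically. The sketch below assumes the standard exponential-family facts E(Y) = b′(θ) and V(μ) = b″(θ), and uses the textbook cumulant functions b(θ) for the Poisson, Bernoulli and normal families. The function names and the test value θ = 0.3 are illustrative choices, not part of the course material.

```python
import math

def mean_and_variance_function(b, theta, h=1e-3):
    """Approximate b'(theta) and b''(theta) by central finite differences."""
    mean = (b(theta + h) - b(theta - h)) / (2 * h)
    variance_fn = (b(theta + h) - 2 * b(theta) + b(theta - h)) / (h * h)
    return mean, variance_fn

# Cumulant functions b(theta); theta is the natural parameter.
def b_poisson(theta):
    return math.exp(theta)                   # theta = log(mu)

def b_bernoulli(theta):
    return math.log(1.0 + math.exp(theta))   # theta = logit(pi)

def b_normal(theta):
    return 0.5 * theta * theta               # theta = mu

theta = 0.3  # arbitrary illustrative value
mu_pois, v_pois = mean_and_variance_function(b_poisson, theta)
pi_bern, v_bern = mean_and_variance_function(b_bernoulli, theta)
mu_norm, v_norm = mean_and_variance_function(b_normal, theta)

print(f"Poisson:   mean={mu_pois:.4f}  V={v_pois:.4f}  (V should equal mu)")
print(f"Bernoulli: mean={pi_bern:.4f}  V={v_bern:.4f}  (V should equal pi(1-pi))")
print(f"Normal:    mean={mu_norm:.4f}  V={v_norm:.4f}  (V should equal 1)")
```

Multiplying V(μ) by the dispersion φ gives the variance itself, which is why the normal family reports V = 1 here and carries σ² in φ.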

In this chapter

What this chapter covers

01 The three components: random, systematic, link
02 The exponential family of distributions
03 Canonical vs non-canonical link functions
04 The mean–variance relationship Var(Y) = φV(μ)
05 Estimation by iteratively re-weighted least squares (IRLS)
06 Deviance and the saturated model
07 Scaled deviance and the dispersion parameter φ
08 Pearson vs deviance residuals

Worked example · free

Worked example: name the GLM and its canonical link

Q [6 marks]. For each response below, name the natural GLM (distribution + canonical link) and write the link equation. (a) Whether a loan defaults (yes/no). (b) The number of insurance claims on a policy in a year. (c) A continuous, roughly normal measurement (blood pressure). State the mean–variance law in each case.

+1 (a) Binary default: response is Bernoulli/binomial. Canonical link = logit: log(π/(1−π)) = Xβ. This is logistic regression.
+1 (a) Mean–variance: Var(Y) = π(1−π), so the variance is fixed by the mean π, with φ = 1.
+1 (b) Claim count: response is a count → Poisson. Canonical link = log: log(μ) = Xβ. This is Poisson regression.
+1 (b) Mean–variance: Var(Y) = μ, so the mean equals the variance, with φ = 1 (the data are overdispersed if the observed variance exceeds μ).
+1 (c) Continuous normal: response is normal. Canonical link = identity: μ = Xβ, the ordinary linear model.
+1 (c) Mean–variance: Var(Y) = σ², constant; here V(μ) = 1 and the dispersion φ = σ² is a free parameter, not pinned to 1.

(a) binomial + logit (logistic), Var = π(1−π); (b) Poisson + log, Var = μ; (c) normal + identity (ordinary regression), Var = σ². The same three-part template — random component, linear predictor, link — produces all three.
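The naming step can be written down as a small lookup table. This is only a sketch: the keys "binary", "count" and "continuous" and the helper `name_the_glm` are hypothetical labels invented for illustration, not terms from the exam or from any library.

```python
import math

# Response type -> distribution, canonical link, link g(mu), variance function V(mu).
GLM_FAMILIES = {
    "binary": {
        "distribution": "binomial",
        "link_name": "logit",
        "link": lambda mu: math.log(mu / (1.0 - mu)),
        "variance": lambda mu: mu * (1.0 - mu),
        "phi_fixed_at_1": True,
    },
    "count": {
        "distribution": "Poisson",
        "link_name": "log",
        "link": lambda mu: math.log(mu),
        "variance": lambda mu: mu,
        "phi_fixed_at_1": True,
    },
    "continuous": {
        "distribution": "normal",
        "link_name": "identity",
        "link": lambda mu: mu,
        "variance": lambda mu: 1.0,   # Var(Y) = phi * 1 with phi = sigma^2
        "phi_fixed_at_1": False,
    },
}

def name_the_glm(response_type):
    """Return the family entry for a response type (hypothetical helper)."""
    return GLM_FAMILIES[response_type]

for kind in ("binary", "count", "continuous"):
    fam = name_the_glm(kind)
    print(f"{kind}: {fam['distribution']} + {fam['link_name']}")
```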

Sia tip: In the exam, your first reflex should always be to name the model: response type → distribution → canonical link. Get that automatic and the rest of any GLM question (deviance test, coefficient interpretation) follows a fixed script.

Glossary

Key terms

Random component: The first part of a GLM: the assumption that the response Y comes from an exponential-family distribution (normal, binomial, Poisson, gamma...). It determines the mean–variance relationship and so the weights used in fitting.
Link function: The function g that connects the mean to the linear predictor, g(μ) = η = Xβ. The canonical link sets the natural parameter θ equal to the linear predictor η (logit for binomial, log for Poisson, identity for normal), which makes Xᵀy a sufficient statistic for β; non-canonical links (e.g. probit) are allowed too.
Exponential family: The class of distributions whose density can be written exp{(yθ − b(θ))/a(φ) + c(y, φ)}. Its members share a common mean–variance structure, E(Y) = b′(θ) and Var(Y) = a(φ)b″(θ), which is exactly what lets one fitting algorithm (IRLS) handle them all.
Deviance: D = 2(ℓ_sat − ℓ_model), twice the log-likelihood gap between the saturated model (one parameter per observation, so a perfect fit) and the fitted model. When φ is not 1, this expression is the scaled deviance D/φ. The deviance is the GLM's analogue of the residual sum of squares: smaller is better, and the difference between nested models' deviances is the likelihood-ratio test statistic.
IRLS: Iteratively re-weighted least squares — the algorithm R uses to maximise a GLM likelihood. It repeatedly solves a weighted least-squares problem with weights that depend on the current fit, converging to the maximum-likelihood β̂. It reduces to ordinary least squares in the normal-identity case.
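The IRLS idea in the glossary can be sketched for one concrete case, a Poisson model with log link. The version below is a minimal illustration under stated assumptions, not a copy of R's implementation: the starting values log(y + 0.1), the stopping rule, the helper names and the tiny data set are all made up for the example. For the Poisson log-link case the working weights are w = μ and the working response is z = η + (y − μ)/μ.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def irls_poisson(X, y, max_iter=50, tol=1e-10):
    """Fit a Poisson log-link GLM by repeated weighted least squares."""
    n, p = len(X), len(X[0])
    eta = [math.log(yi + 0.1) for yi in y]   # crude start (an assumption)
    beta = [0.0] * p
    for _ in range(max_iter):
        mu = [math.exp(e) for e in eta]
        w = mu                                # weights: (dmu/deta)^2 / V(mu) = mu
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]  # working response
        xtwx = [[sum(w[i] * X[i][j] * X[i][k] for i in range(n))
                 for k in range(p)] for j in range(p)]
        xtwz = [sum(w[i] * X[i][j] * z[i] for i in range(n)) for j in range(p)]
        new_beta = solve(xtwx, xtwz)          # weighted least-squares step
        eta = [sum(X[i][j] * new_beta[j] for j in range(p)) for i in range(n)]
        converged = max(abs(nb - ob) for nb, ob in zip(new_beta, beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta

# Made-up illustrative data: intercept plus one covariate.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1, 2, 4, 7, 12]
X = [[1.0, xi] for xi in x]
beta_hat = irls_poisson(X, y)
print("beta_hat =", [round(b, 4) for b in beta_hat])
```

At convergence the fitted means satisfy the score equations Xᵀ(y − μ̂) = 0, which is a convenient check that the loop really has reached the maximum-likelihood estimate.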

FAQ

Generalised Linear Models FAQ

What exactly makes something a GLM?

Three ingredients: a response from an exponential-family distribution (the random component), a linear predictor η = Xβ (the systematic component), and a link function g with g(μ) = η. Fix the distribution and the link and you have named a specific model. Almost every model in MAST90139 is one choice of those two ingredients.

What is the canonical link and do I have to use it?

The canonical link is the one that makes the model's natural parameter equal to the linear predictor — logit for the binomial, log for the Poisson, identity for the normal. It has nice mathematical properties and is the default, but you are not forced to use it: the probit and complementary-log-log are valid non-canonical links for binary data, for instance.
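To see how the three binary-data links differ, it can help to compute their inverses, which map a linear predictor η back to a probability. The sketch below uses only the standard library; the function names are illustrative.

```python
import math

def inv_logit(eta):
    """Inverse of the canonical logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    """Inverse probit link: the standard normal CDF, written with erf."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def inv_cloglog(eta):
    """Inverse complementary log-log link."""
    return 1.0 - math.exp(-math.exp(eta))

for eta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"eta={eta:+.1f}  logit={inv_logit(eta):.3f}  "
          f"probit={inv_probit(eta):.3f}  cloglog={inv_cloglog(eta):.3f}")
```

Logit and probit are symmetric about η = 0, where both give probability 0.5; the complementary log-log is not, so it can suit data where probabilities approach 0 and 1 at different rates.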

What is the deviance and why not just use the residual sum of squares?

The deviance generalises the residual sum of squares to any GLM. Because GLMs are fitted by likelihood, the natural measure of fit is the log-likelihood gap to the saturated (perfect-fit) model, multiplied by 2. For the normal model with identity link the deviance literally is the residual sum of squares; for other families it is the right likelihood-based analogue, and differences in deviance between nested models are compared with a chi-square distribution.
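A short numerical check makes the definition concrete. The sketch below computes the Poisson deviance two ways, from the log-likelihood gap and from the usual closed form 2Σ[y log(y/μ) − (y − μ)], and shows the normal deviance as a plain residual sum of squares. The observations and fitted means are made-up numbers for illustration only.

```python
import math

def poisson_loglik(y, mu):
    """Poisson log-likelihood, using the convention 0 * log(0) = 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        term = yi * math.log(mi) if yi > 0 else 0.0
        total += term - mi - math.lgamma(yi + 1)
    return total

def poisson_deviance(y, mu):
    """D = 2(l_sat - l_model); the saturated model sets mu_i = y_i."""
    return 2.0 * (poisson_loglik(y, y) - poisson_loglik(y, mu))

def poisson_deviance_closed_form(y, mu):
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total

def normal_deviance(y, mu):
    """For the normal family the (unscaled) deviance is the RSS."""
    return sum((yi - mi) ** 2 for yi, mi in zip(y, mu))

y = [0, 2, 3, 7]                 # made-up counts
mu = [1.0, 2.0, 4.0, 5.0]        # made-up fitted means

d_gap = poisson_deviance(y, mu)
d_formula = poisson_deviance_closed_form(y, mu)
print(f"Poisson deviance: {d_gap:.4f} (gap) vs {d_formula:.4f} (closed form)")
print(f"Normal deviance (RSS): {normal_deviance(y, mu):.4f}")
```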

What does the dispersion parameter φ do?

φ scales the variance: Var(Y) = φV(μ). For the binomial and Poisson it is fixed at 1, so the variance is completely determined by the mean. For the normal and gamma it is a free parameter (σ² for the normal). When binomial or Poisson data show more spread than φ = 1 allows, that is overdispersion, handled by estimating φ in a quasi-likelihood fit.
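One common way to estimate φ is the Pearson chi-square statistic divided by the residual degrees of freedom, φ̂ = Σ(y − μ̂)²/V(μ̂) ÷ (n − p). The sketch below applies it to a Poisson-type variance function V(μ) = μ. The data, the fitted means and the parameter count are made-up values for illustration, and the function name is hypothetical.

```python
def pearson_dispersion(y, mu, variance_fn, n_params):
    """Pearson estimate of phi: chi-square / residual degrees of freedom."""
    pearson_chi2 = sum((yi - mi) ** 2 / variance_fn(mi) for yi, mi in zip(y, mu))
    return pearson_chi2 / (len(y) - n_params)

# Made-up counts and fitted means from an imagined two-parameter Poisson fit.
y = [0, 3, 1, 8, 2, 10]
mu = [2.0, 2.0, 4.0, 4.0, 6.0, 6.0]

phi_hat = pearson_dispersion(y, mu, variance_fn=lambda m: m, n_params=2)
print(f"phi_hat = {phi_hat:.3f}")
```

A value well above 1 suggests overdispersion relative to the Poisson assumption; in a quasi-likelihood fit the coefficient estimates stay the same while the standard errors are multiplied by the square root of φ̂.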

Study strategy

Exam move

This is the chapter to over-learn — every later model is this template re-run. Memorise the three components (random, systematic, link) and be able to instantly map a response type to its distribution and canonical link: binary → binomial + logit, count → Poisson + log, continuous → normal + identity. Know the mean–variance law Var(Y) = φV(μ) for each family and which families fix φ = 1. Understand the deviance as the likelihood-based residual sum of squares, and that Δdeviance between nested models is a chi-square test. You do not need to derive IRLS, but know that it is how R fits and that it reduces to OLS in the normal case. Get this engine cold and the families chapters become pattern-matching.
