University of Melbourne · S1 2026 · FACULTY OF SCIENCE

MAST90139 · Statistical Modelling For Data Science

- one subject, every graph, every model, every mark
50% final exam · hurdle · 14 Chapters · 5-page Bible
Our own words - no uploaded lecturer files
Built to mirror S1 2026 · updated this semester
Chapter 5 of 8 · MAST90139

Poisson Regression

When the response is a count (emails per hour, claims per policy, cases per region), the linear model breaks: counts are non-negative integers whose variance grows with their mean. The fix is a GLM with the Poisson random component and the log link, log(μ) = Xβ. Two skills carry almost every mark: reading e^β as a rate ratio (the multiplicative effect on the expected count), and knowing when to add an offset log(exposure) so you model a rate (events per person-year, per 1000 policies) rather than a raw count. The chapter builds the Poisson pmf and its mean = variance signature, the multiplicative log-link model, offsets, deviance and Pearson goodness-of-fit, nested-model comparison by ΔD, and the all-important overdispersion fix: quasi-Poisson, where the coefficient estimates are unchanged but standard errors inflate by √φ̂.

In this chapter

What this chapter covers

  • 01 The Poisson distribution: mean = variance
  • 02 The Poisson regression model and the log link
  • 03 Reading e^β as a rate ratio (★ the examined skill)
  • 04 Offsets: modelling rates instead of counts (★)
  • 05 Deviance and Pearson X² goodness-of-fit
  • 06 Comparing nested models by ΔD vs χ²
  • 07 Overdispersion and the quasi-Poisson fix
Worked example · free

Worked example: rate ratio, offset and goodness-of-fit from a Poisson fit

Q [6 marks]. A Poisson regression of disease counts on a binary 'coastal' indicator, with offset = log(pop), gives β̂_coastal = 0.405 (se 0.120). R reports residual deviance 58.1 on 48 df. (a) Interpret e^0.405. (b) Why is the offset there? (c) Does the model fit?
[Figure: μ = e^(β₀+β₁x) plotted against x. The fitted mean is never negative, and effects are multiplicative.]
  • +2 (a) Rate ratio: e^0.405 ≈ 1.50. The coastal incidence rate is about 1.5 times the inland rate, per head of population because of the offset.
  • +2 (b) The offset: regions have unequal populations, so raw counts aren't comparable. offset = log(pop) pins a coefficient of 1 on log(pop), converting the model from counts to a rate per head; without it, 'coastal' would partly soak up population size.
  • +2 (c) Goodness-of-fit: D/df = 58.1/48 = 1.21, and χ²_0.95(48) ≈ 65.2. Since 58.1 < 65.2, there is no evidence of lack of fit at the 5% level, and the dispersion ratio is close to 1, so the data give no reason to doubt the Poisson assumption.
e^0.405 ≈ 1.50 means the coastal rate is ~1.5× inland (per head, thanks to the offset); the offset log(pop) makes the model a rate model rather than a count model; and with residual deviance 58.1 on 48 df below χ²_0.95(48) ≈ 65.2 the model fits well with no overdispersion.
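The arithmetic in parts (a) and (c) can be checked in a few lines. The subject reads R output, so Python is used here only as a calculator. This is a minimal sketch that uses the numbers quoted in the question; the critical value 65.2 is copied from the solution rather than computed, and the variable names are illustrative.

```python
import math

beta_coastal = 0.405    # coefficient quoted in the question
resid_deviance = 58.1   # residual deviance quoted in the question
resid_df = 48           # residual degrees of freedom
chi2_crit_95 = 65.2     # 0.95 quantile of chi-square on 48 df, as quoted in the solution

rate_ratio = math.exp(beta_coastal)           # multiplicative effect on the rate
dispersion_ratio = resid_deviance / resid_df  # near 1 suggests no overdispersion
lack_of_fit = resid_deviance > chi2_crit_95   # True would signal lack of fit at the 5% level

print(f"rate ratio = {rate_ratio:.2f}")
print(f"D/df = {dispersion_ratio:.2f}")
print(f"lack of fit at 5%: {lack_of_fit}")
```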
Sia tip — The single biggest Poisson trap is forgetting the offset when exposures differ — it silently turns a rate model into a count model and changes the interpretation completely. An exposure variable must enter as an offset (coefficient fixed at 1), never as an ordinary covariate.
Glossary

Key terms

Log link
g(μ) = log(μ), the canonical link for the Poisson. Modelling log(μ) linearly keeps the fitted mean positive and makes covariate effects multiplicative on the count scale: μ = e^(Xβ).
Rate ratio
e raised to a Poisson coefficient, e^β: the factor by which the expected count (or rate, with an offset) multiplies for a one-unit rise in the predictor. The count analogue of the logistic odds ratio.
Offset
A term log(exposure) added to the linear predictor with its coefficient fixed at 1, so the model describes a rate (events per unit exposure) rather than a raw count. Essential when rows have different exposures (population, time, area).
Poisson deviance
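D = 2Σ[y log(y/μ̂) − (y − μ̂)], with y log(y/μ̂) taken as 0 when y = 0. It is the goodness-of-fit statistic that under a correct model is approximately χ²(n−q), where n is the number of observations and q the number of fitted coefficients. D well above its df signals lack of fit or overdispersion.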
Quasi-Poisson
A fit that keeps the log-link mean model but estimates a dispersion φ, so Var = φμ. Coefficients (and rate ratios) are identical to the Poisson fit; standard errors inflate by √φ̂, and nested comparisons use an F-test. AIC is undefined for quasi-families.
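To make the deviance and dispersion definitions concrete, here is a minimal sketch that computes the Poisson deviance, the Pearson X² and a Pearson-based dispersion estimate by hand. The counts, fitted means and q below are hypothetical and chosen only for illustration; dividing X² by n − q is one common way to estimate φ and is stated here as an assumption about the estimator, not as this chapter's definition.

```python
import math

def poisson_deviance(y, mu):
    """D = 2 * sum[y*log(y/mu) - (y - mu)], with y*log(y/mu) taken as 0 when y == 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += term - (yi - mi)
    return 2.0 * total

def pearson_chi2(y, mu):
    """X^2 = sum[(y - mu)^2 / mu], because Var(Y) = mu under the Poisson."""
    return sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))

# Hypothetical observed counts and fitted means, for illustration only.
y = [0, 2, 5, 9, 4]
mu_hat = [1.0, 2.5, 4.0, 8.0, 4.5]
q = 2  # assumed number of fitted coefficients

D = poisson_deviance(y, mu_hat)
X2 = pearson_chi2(y, mu_hat)
phi_hat = X2 / (len(y) - q)  # Pearson-based dispersion estimate (assumed estimator)

print(f"D = {D:.3f}, X2 = {X2:.3f}, phi_hat = {phi_hat:.3f}")
```

A perfect fit (y equal to μ̂ in every row) gives D = 0 and X² = 0, which is a quick sanity check on both functions.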
FAQ

Poisson Regression FAQ

Why use Poisson regression instead of a linear model on the counts?

Three failures of the linear model on counts: the fitted mean can go negative (impossible for a count); the variance is not constant but equals the mean, so large counts are noisier; and the integer, right-skewed shape is far from normal. The Poisson GLM fixes all three at once by modelling log(μ) linearly and letting Var = μ.
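The mean = variance property can be verified numerically from the pmf, P(Y = k) = e^(−λ) λ^k / k!. The sketch below is a plain-Python check with an illustrative λ that does not come from the chapter; the infinite sum is truncated at k = 99, where the remaining tail is negligible for this λ.

```python
import math

def poisson_pmf(k, lam):
    """P(Y = k) = exp(-lam) * lam**k / k!"""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 3.5            # illustrative rate, not taken from the chapter
ks = range(0, 100)   # truncated support; the tail beyond 99 is negligible here

probs = [poisson_pmf(k, lam) for k in ks]
mean = sum(k * p for k, p in zip(ks, probs))
var = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))

print(f"mean = {mean:.6f}, variance = {var:.6f}")
```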

How do I interpret a Poisson coefficient?

Exponentiate it: e^β is a rate ratio, the multiplicative change in the expected count (or rate, if there is an offset) per one-unit rise in the predictor. Say it as 'the expected count multiplies by e^β.' It is the count-data twin of the logistic odds ratio.
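The reason exponentiating works follows directly from the log link. As a short derivation for a single predictor x, with β₀ and β₁ as illustrative symbols for the intercept and slope:

```latex
\log \mu(x) = \beta_0 + \beta_1 x
\quad\Longrightarrow\quad
\frac{\mu(x+1)}{\mu(x)}
  = \frac{e^{\beta_0 + \beta_1 (x+1)}}{e^{\beta_0 + \beta_1 x}}
  = e^{\beta_1}
```

The ratio does not depend on x, which is why a single number e^β₁ summarises the effect at every level of the predictor.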

When do I need an offset?

Whenever rows have different exposures (populations, time at risk, areas, numbers of trials) and you want to model a rate rather than a raw count. Add offset = log(exposure); its coefficient is fixed at 1, so e^β becomes a ratio of rates per unit exposure. Omitting it lets your covariates partly measure how big each unit is.
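A small numerical sketch shows what the offset buys you. With log(μ) = log(exposure) + β₀ + β₁x, raw expected counts are dominated by exposure, while counts divided by exposure recover the rate ratio. The intercept and the two population sizes below are hypothetical; only β₁ = 0.405 echoes the worked example.

```python
import math

def expected_count(exposure, beta0, beta1, x):
    """mu = exp(log(exposure) + beta0 + beta1*x), i.e. the offset has coefficient 1."""
    return math.exp(math.log(exposure) + beta0 + beta1 * x)

# Hypothetical intercept and populations; beta1 echoes the worked example.
beta0, beta1 = -6.0, 0.405
pop_coastal, pop_inland = 2_000, 50_000

mu_coastal = expected_count(pop_coastal, beta0, beta1, x=1)
mu_inland = expected_count(pop_inland, beta0, beta1, x=0)

# Raw counts differ mostly because of population size...
count_ratio = mu_coastal / mu_inland
# ...but dividing by exposure recovers the rate ratio exp(beta1).
rate_ratio = (mu_coastal / pop_coastal) / (mu_inland / pop_inland)

print(f"count ratio = {count_ratio:.3f}, rate ratio = {rate_ratio:.3f}")
```

Here the small coastal region has far fewer expected cases than the large inland one, yet its rate per head is higher, which is exactly the distinction the offset preserves.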

What is overdispersion in a Poisson model and how do I fix it?

Overdispersion is when the counts vary more than the mean = variance assumption allows, shown by residual deviance far above its df. Fix it with quasi-Poisson (family = quasipoisson): the coefficients and rate ratios are unchanged, but standard errors inflate by √φ̂, giving honest p-values. For severe overdispersion with a true likelihood, switch to a negative-binomial model.
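The effect of the quasi-Poisson correction on inference can be sketched with simple arithmetic. All three inputs below (coefficient, Poisson standard error, dispersion estimate) are made-up numbers for illustration, and the helper name is hypothetical.

```python
import math

def quasi_poisson_se(poisson_se, phi_hat):
    """Quasi-Poisson standard error: the Poisson SE scaled by sqrt(phi_hat)."""
    return poisson_se * math.sqrt(phi_hat)

# Hypothetical fit: every number here is made up for illustration.
beta_hat, se_poisson, phi_hat = 0.30, 0.10, 4.0

se_quasi = quasi_poisson_se(se_poisson, phi_hat)
z_poisson = beta_hat / se_poisson  # about 3 under the Poisson fit
t_quasi = beta_hat / se_quasi      # smaller by a factor of sqrt(phi_hat) = 2
rate_ratio = math.exp(beta_hat)    # unchanged, because the coefficient is unchanged

print(f"SE: {se_poisson:.2f} -> {se_quasi:.2f}")
print(f"test statistic: {z_poisson:.2f} -> {t_quasi:.2f}")
```

With φ̂ = 4 the test statistic halves, so an effect that looked clearly significant under the Poisson fit becomes much weaker evidence once the extra variation is acknowledged.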

Study strategy

Exam move

Make two moves automatic. First, the rate-ratio sentence: exponentiate any Poisson coefficient and state 'the expected count/rate multiplies by e^β per unit of x'. Second, the offset decision: whenever exposures differ, add log(exposure) as an offset so you model a rate, and never enter exposure as a free covariate. Know that the Poisson's signature is mean = variance, that residual deviance is a valid goodness-of-fit test for counts (compare D to a χ² on its df), and the overdispersion fix: quasi-Poisson leaves coefficients alone, scales standard errors by √φ̂, and switches comparisons to the F-test (with AIC undefined). The exam reads R output, so rehearse turning a printed summary() into the rate ratio, the offset justification, and the fit verdict.
