University of Melbourne · S1 2026 · FACULTY OF SCIENCE

MAST90139 · Statistical Modelling for Data Science

- one subject, every graph, every model, every mark
50% final exam · hurdle · 8 Chapters · 44-page Bible
Our own words - no uploaded lecturer files
Built to mirror S1 2026 · updated this semester
The Complete Exam Bible · S1 2026

Statistical Modelling for Data Science

— one engine, every response type, every mark

Statistical Modelling for Data Science teaches one engine — the generalised linear model (GLM) — and then watches it bend to every response type: binary, grouped-binomial, count, contingency-table and multicategory (nominal and ordinal) data, all fitted in R. The exam is a read-the-R-output gauntlet: given a glm() / multinom() / polr() printout you must name the model and link, do the deviance / ΔD-vs-χ² arithmetic by hand, and translate a coefficient into an odds / rate / cumulative-odds statement. This guide teaches each family to that standard, so you can read any printout and say exactly what it means.

Assessment

How MAST90139 is assessed

Component | Weight | Format
Assignment 1: logistic regression | confirm | Binary / grouped logistic, odds-ratio interpretation · submitted to Gradescope (due ~early April)
Assignment 2: binomial dose-response | confirm | Beetle-mortality dose-response, link comparison · Gradescope (due ~early May)
Assignment 3: ordinal / multinomial | confirm | 3-category ordinal response with polr · Gradescope (due ~end May)
Final examination | confirm | Read-the-R-output across the whole GLM arc. Weight, length and book status are not in the supplied source; confirm the exact split and format in your subject guide
Worked example · free

Reading a glm() printout — logistic coefficient to odds ratio, mark by mark

Q [6 marks]. A logistic regression of an outcome Y (1 = event) on a continuous predictor x returns the fitted model logit(π) = −2.0 + 0.80·x. (a) Interpret the slope 0.80 as an odds ratio. (b) Compute the fitted probability at x = 2.5. (c) The null deviance is 180 on 99 df and the residual deviance is 150 on 98 df — test whether x matters.
[Figure: fitted logistic curve π = 1/(1+e^(−η)) plotted against x, with logit(π) = −2.0 + 0.80x and the point x = 2.5 marked.]
  • +2 (a) Slope → odds ratio: in logistic regression the coefficient is on the log-odds scale, so e^β is the odds ratio: e^0.80 ≈ 2.23. Each one-unit rise in x multiplies the odds of the event by about 2.2.
  • +1 (b) Linear predictor at x = 2.5: η = −2.0 + 0.80 × 2.5 = 0.
  • +1 (b) Inverse-logit: π = 1 / (1 + e^(−η)) = 1 / (1 + e^0) = 0.5, so the curve crosses 0.5 exactly where η = 0.
  • +1 (c) Likelihood-ratio test: ΔD = D_null − D_resid = 180 − 150 = 30, on 99 − 98 = 1 df.
  • +1 (c) Compare to χ²: 30 ≫ 3.84, the 5% critical value of χ²(1), so reject H₀; x is a highly significant predictor.
e^0.80 ≈ 2.23 (the odds roughly double per unit of x); the fitted probability at x = 2.5 is 0.5 (where η = 0); and the drop in deviance ΔD = 30 on 1 df vastly exceeds 3.84, so x is a strongly significant predictor.
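The three parts of the worked answer can be checked numerically. The sketch below is only a calculator check in Python (the subject itself fits models in R); it uses the numbers from the question, and the variable names are illustrative. The p-value shortcut via erfc is valid for a 1-df χ² only.

```python
import math

# Numbers taken from the worked example.
intercept, slope = -2.0, 0.80
null_dev, null_df = 180.0, 99
resid_dev, resid_df = 150.0, 98

# (a) Slope on the log-odds scale -> odds ratio per one-unit rise in x.
odds_ratio = math.exp(slope)

# (b) Linear predictor and inverse-logit at x = 2.5.
x = 2.5
eta = intercept + slope * x
prob = 1.0 / (1.0 + math.exp(-eta))

# (c) Likelihood-ratio test: drop in deviance on the df difference.
delta_d = null_dev - resid_dev
delta_df = null_df - resid_df
# For 1 df only: P(chi2_1 > d) = erfc(sqrt(d / 2)), so no stats library is needed.
p_value = math.erfc(math.sqrt(delta_d / 2.0))

print(f"odds ratio = {odds_ratio:.2f}")
print(f"eta = {eta:.1f}, pi = {prob:.2f}")
print(f"delta D = {delta_d:.0f} on {delta_df} df, p = {p_value:.1e}")
```

In an exam the comparison against 3.84 is enough; the p-value is computed here only to show how far into the tail ΔD = 30 sits.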
Sia tip: The exam almost always wants the coefficient turned into a sentence: “the odds multiply by e^β” (logit), “the rate multiplies by e^β” (log/Poisson). State the multiplier, the direction, and what it acts on (odds vs rate). A coefficient with no interpretation rarely earns full marks.
Glossary

Key terms

Generalised linear model (GLM)
A model with three parts: a random component (a response from an exponential-family distribution), a linear predictor η = Xβ, and a link function g that connects them, g(μ) = η. Linear, logistic, Poisson and log-linear regression are all special cases.
Link function
The function g that maps the mean μ onto the linear-predictor scale, g(μ) = η = Xβ. The logit link gives logistic regression (binary data) and the log link gives Poisson regression (counts); each exponential-family distribution has its own “canonical” link (identity for the normal, logit for the binomial, log for the Poisson).
Deviance
The GLM analogue of the residual sum of squares: twice the log-likelihood gap between the saturated model and your fitted model. Smaller deviance means better fit; the drop in deviance between nested models is the likelihood-ratio test statistic.
Odds ratio
e raised to a logistic coefficient, e^β: the factor by which the odds of the event multiply for a one-unit rise in the predictor. Greater than 1 raises the odds, less than 1 lowers them; it is the headline output of any logistic fit.
Overdispersion
When count or grouped-binomial data vary more than the Poisson or binomial model allows (residual deviance far exceeds its degrees of freedom). It is fixed by estimating a dispersion parameter φ and refitting a quasi-likelihood model, then testing with an F test rather than χ².
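As a rough illustration of the overdispersion entry, the sketch below estimates φ for a Poisson model as the Pearson statistic divided by the residual degrees of freedom. Python is used only as a calculator. The counts, fitted means, parameter count and standard error are all hypothetical, invented for the example rather than taken from any real fit.

```python
import math

# Hypothetical observed counts and Poisson fitted means (not from a real fit).
observed = [2, 9, 1, 14, 6, 20]
fitted = [5.0, 6.0, 4.0, 10.0, 9.0, 18.0]
n_params = 2  # assumed: intercept plus one slope

# Pearson chi-square for a Poisson model, where the variance equals the mean.
pearson_x2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
resid_df = len(observed) - n_params

# Dispersion estimate; a value well above 1 suggests overdispersion.
phi_hat = pearson_x2 / resid_df

# Quasi-likelihood keeps the coefficients and scales each standard error by sqrt(phi).
se_poisson = 0.10  # hypothetical standard error from the Poisson fit
se_quasi = se_poisson * math.sqrt(phi_hat)

print(f"Pearson X2 = {pearson_x2:.2f} on {resid_df} df, phi_hat = {phi_hat:.2f}")
print(f"SE inflates from {se_poisson:.3f} to {se_quasi:.3f}")
```

With these made-up numbers φ comes out near 2, so the quasi-Poisson standard errors would be about 1.4 times the Poisson ones.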
FAQ

MAST90139 FAQ

Is MAST90139 hard?

It is procedural once the GLM framework clicks, but it is dense. The trick is realising that logistic, Poisson, log-linear and ordinal regression are the same three-part model with a different distribution and link — learn that engine cold and the rest is pattern-matching. The difficulty is reading R output fast and interpreting coefficients correctly under exam time.

How is MAST90139 assessed?

The supplied source confirms three written assignments submitted to Gradescope (logistic, binomial dose-response, and an ordinal/multinomial study) plus a final examination. The exact weights, exam length and whether the exam is open- or closed-book are not stated in the supplied material — confirm them in the official subject guide / Handbook, as they shift between cohorts.

What is on the MAST90139 final exam?

A read-the-R-output gauntlet across the whole GLM arc: given a glm(), multinom() or polr() printout, name the model and link, do the deviance / ΔD-vs-χ² arithmetic by hand, and translate a coefficient into an odds ratio (logit), a rate ratio (log) or a cumulative odds ratio (proportional odds).

What maths and software do I need for MAST90139?

You need matrix algebra, likelihood and the normal/exponential-family distributions, plus comfort reading regression output. All fitting is done in R: glm() from base R, polr() from the MASS package and multinom() from the nnet package, alongside the faraway package. You are not asked to invert matrices by hand, but you are asked to read and interpret what R prints.

Is using AskSia for MAST90139 cheating?

No. AskSia is a study reference written in our own words — we host none of your lecturer's files, and Sia teaches you the method to read a printout and earn the marks; it does not complete or sit your assessments.

Study strategy

How to study for the exam

Build everything on Chapter 2, the GLM framework (random component, linear predictor and link), because every later model is that template with a new distribution and link. Then drill the three recurring exam moves until they are automatic: Name (response type → distribution → canonical link), Test (ΔD = D₀ − D₁ against χ² with the df difference for nested models; D against χ²(n−q) for grouped goodness-of-fit), and Interpret (e^β as an odds ratio, rate ratio, cross-product ratio or cumulative OR). Practise on real R printouts so you can do the deviance arithmetic and the coefficient sentence at speed; that triple is the whole paper.
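The Name and Interpret moves can be drilled as a lookup plus a sentence template. The sketch below is one possible way to lay that out in Python; the `MODELS` table and `interpret` helper are hypothetical names invented here. It covers only the binary and count cases, because the direction of an ordinal coefficient depends on how the software parameterises the cumulative logit.

```python
import math

# Response type -> (distribution, canonical link, what exp(beta) multiplies).
MODELS = {
    "binary": ("binomial", "logit", "odds"),
    "count": ("Poisson", "log", "rate"),
}

def interpret(response_type, beta, predictor="x"):
    """Turn a coefficient into the multiplier sentence for its model."""
    _, _, target = MODELS[response_type]
    multiplier = math.exp(beta)
    if multiplier > 1:
        direction = "raises"
    elif multiplier < 1:
        direction = "lowers"
    else:
        direction = "leaves unchanged"
    return (f"A one-unit rise in {predictor} multiplies the {target} by "
            f"{multiplier:.2f}, which {direction} the {target}.")

print(interpret("binary", 0.80))
print(interpret("count", -0.50))
```

The sentence always carries the three things worth stating: the multiplier, the direction, and whether it acts on odds or a rate.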
