University of Melbourne · S1 2026 · FACULTY OF SCIENCE

MAST90139 · Statistical Modelling For Data Science

50% final exam · hurdle · 14 Chapters · 4-page Bible

Chapter 7 of 8 · MAST90139

Multicategory Models

When the response has more than two categories, you extend logistic regression in one of two ways, depending on whether the categories are ordered. For an unordered (nominal) response, such as mode of transport or brand chosen, the baseline-category logit model fits a separate logit for each non-reference category against a reference category, via nnet::multinom(). For an ordered (ordinal) response, such as mild / moderate / severe, the proportional-odds (cumulative-logit) model exploits the ordering with a single slope per predictor and a set of cumulative intercepts, via MASS::polr(), giving a cumulative odds ratio. The chapter builds both, plus the sequential (continuation-ratio) alternative, and confronts the notorious sign-convention trap in polr: it fits logit(P(Y ≤ j)) = α_j − βx, so the printed slope describes movement towards higher categories, and you must negate it to get the textbook slope on the log cumulative odds of Y ≤ j.

In this chapter

What this chapter covers

01 The multinomial response: more than two categories
02 Nominal: the baseline-category logit (multinom)
03 Ordinal: the cumulative-logit / proportional-odds model (polr)
04 The proportional-odds assumption and the single slope
05 Cumulative odds ratios
06 The polr() sign-convention trap (★ the known trap)
07 Sequential / continuation-ratio models

Worked example

Worked example: choosing the right multicategory model and reading polr()

Q [6 marks]. For each response, name the model and the R function: (a) preferred holiday type (beach / city / nature); (b) self-rated health (poor / fair / good / excellent). (c) A polr fit prints a coefficient of 0.80 for a predictor x — what is the cumulative odds ratio, and what sign trap must you watch?

+1 (a) Nominal: beach / city / nature are unordered → baseline-category logit, fitted with nnet::multinom(). It gives one logit per non-reference category.
+1 (b) Ordinal: poor < fair < good < excellent are ordered → proportional-odds (cumulative-logit) model, fitted with MASS::polr(). One slope, several cumulative intercepts.
+2 (c) Cumulative OR: e^0.80 ≈ 2.23, so the odds of being in a higher category (Y > j versus Y ≤ j) multiply by about 2.2 per one-unit increase in x (under proportional odds, the same OR at every cut-point).
+2 (c) Sign trap: polr parameterises the model as logit(P(Y ≤ j)) = α_j − βx, so the printed coefficient already carries the flipped sign. Read its direction carefully (a positive polr coefficient raises the odds of higher categories), and negate it if you convert to the textbook α_j + βx form, which here gives a slope of −0.80.

(a) baseline-category logit via multinom(); (b) proportional-odds (cumulative-logit) via polr(); (c) the cumulative odds ratio is e^0.80 ≈ 2.23, with the warning that polr uses the α − βx parameterisation, so you must mind the sign convention when interpreting or converting the slope.
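The arithmetic in part (c) can be checked in a few lines of base R. This is a minimal sketch: the variable names are illustrative, and 0.80 is the coefficient given in the question.

```r
# Coefficient as printed by MASS::polr for predictor x (from the question)
b_polr <- 0.80

# Cumulative odds ratio: odds of Y > j versus Y <= j, per one-unit rise in x
or_higher <- exp(b_polr)

# Same effect in the textbook form logit(P(Y <= j)) = alpha_j + beta * x
b_textbook <- -b_polr
or_lower <- exp(b_textbook)   # odds of Y <= j versus Y > j

round(c(or_higher = or_higher, or_lower = or_lower), 2)
```

The two odds ratios are reciprocals of each other (about 2.23 and 0.45), which is why the direction you attach to the number matters as much as the number itself.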

Sia tip — Decide first on order: unordered → baseline-category logit (multinom); ordered → proportional odds (polr). Then, for polr, remember the sign convention — it is the single most common interpretation error in this chapter.

Glossary

Key terms

Multinomial response: A categorical response with more than two unordered or ordered categories. Logistic regression generalises to it via the baseline-category logit (nominal) or the cumulative-logit / proportional-odds model (ordinal).
Baseline-category logit: The model for a nominal response: pick a reference category and fit a separate logit of each other category against it, log(π_j/π_ref) = Xβ_j. Fitted in R with nnet::multinom(). Each category gets its own coefficient vector.
Proportional-odds model: The model for an ordinal response: model the cumulative logits log(P(Y ≤ j)/P(Y > j)) with category-specific intercepts but a single shared slope, so one odds ratio applies at every cut-point. Fitted with MASS::polr().
Cumulative odds ratio: e raised to a proportional-odds slope as printed by polr: the factor by which the odds of being in a higher (vs lower) response category, P(Y > j)/P(Y ≤ j), multiply per one-unit rise in the predictor. Under proportional odds it is the same at every cut-point.
polr sign convention: MASS::polr parameterises the cumulative logit as logit(P(Y ≤ j)) = α_j − βx, so its printed coefficient has the opposite sign to the slope in the textbook α_j + βx form. Misreading this flips the direction of the effect, which makes it the chapter's signature trap.
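The glossary models can be written side by side to make the sign relationship explicit. This is a sketch in the notation used above; the superscripts "text" and "polr" are labels introduced here for illustration and are not course notation.

```latex
\begin{align*}
\text{Baseline-category logit:}\quad
  & \log\frac{\pi_j}{\pi_{\text{ref}}} = X\beta_j, \qquad j \neq \text{ref} \\[4pt]
\text{Cumulative logit (textbook):}\quad
  & \log\frac{P(Y \le j)}{P(Y > j)} = \alpha_j + \beta^{\text{text}} x \\[4pt]
\text{Cumulative logit (polr):}\quad
  & \log\frac{P(Y \le j)}{P(Y > j)} = \alpha_j - \beta^{\text{polr}} x \\[4pt]
\text{Hence:}\quad
  & \beta^{\text{polr}} = -\beta^{\text{text}}, \qquad
    \frac{P(Y > j \mid x+1)\,/\,P(Y \le j \mid x+1)}
         {P(Y > j \mid x)\,/\,P(Y \le j \mid x)} = e^{\beta^{\text{polr}}}
\end{align*}
```

The last line follows by negating both sides of the polr form, which gives log[P(Y > j)/P(Y ≤ j)] = −α_j + β^polr x, so a one-unit rise in x adds β^polr to the log odds of a higher category at every cut-point j.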

FAQ

Multicategory Models FAQ

When do I use multinom() versus polr()?

It depends entirely on whether the response categories are ordered. Unordered (nominal) responses — brand, transport mode, holiday type — use the baseline-category logit via nnet::multinom(). Ordered (ordinal) responses — severity, satisfaction, ratings — use the proportional-odds (cumulative-logit) model via MASS::polr(), which exploits the ordering with a single slope.
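As a rough illustration of the two calls, the sketch below fits both models to the housing data that ships with MASS, whose response Sat is ordered (Low < Medium < High). The dataset and the single predictor Infl are chosen here only as a stand-in example; they are not taken from the course materials.

```r
library(MASS)   # polr() and the housing data
library(nnet)   # multinom()

# housing: satisfaction (Sat) cross-classified by Infl, Type and Cont,
# with cell counts in Freq
data(housing, package = "MASS")

# Treat Sat as nominal: one coefficient vector per non-reference category
fit_nom <- multinom(Sat ~ Infl, weights = Freq, data = housing, trace = FALSE)

# Treat Sat as ordinal: one slope per predictor term, plus cumulative intercepts
fit_ord <- polr(Sat ~ Infl, weights = Freq, data = housing, Hess = TRUE)

coef(fit_nom)   # matrix: rows Medium and High (vs Low), one column per term
coef(fit_ord)   # slopes only: InflMedium, InflHigh
fit_ord$zeta    # cumulative intercepts: Low|Medium, Medium|High
```

The shapes of the outputs mirror the distinction: multinom() returns a coefficient matrix with one row per non-reference category, while polr() returns a single slope vector and a separate set of cut-point intercepts.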

What is the proportional-odds assumption?

That a single slope describes the predictor's effect at every cut-point of the ordinal scale — the odds ratio for being above any given category is the same. It makes the model parsimonious (one coefficient per predictor instead of one per cut-point) but should be checked; if it fails, a more flexible model (partial proportional odds, or a sequential model) is needed.
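One informal way to look at the assumption is to fit a separate binary logistic regression at each cut-point and compare the slopes: roughly equal slopes are consistent with proportional odds. The sketch below uses the MASS housing data purely as a stand-in example, and it is an eyeball check rather than a formal test.

```r
library(MASS)
data(housing, package = "MASS")

sat_level <- as.integer(housing$Sat)   # 1 = Low, 2 = Medium, 3 = High

# One binary logit per cut-point, each modelling P(Sat > j) for j = 1, 2
slopes <- sapply(1:2, function(j) {
  above <- as.integer(sat_level > j)
  fit_j <- glm(above ~ Infl, family = binomial, weights = Freq, data = housing)
  coef(fit_j)[-1]                      # drop the intercept, keep the slopes
})
colnames(slopes) <- c("Sat > Low", "Sat > Medium")

slopes   # similar columns are consistent with a single shared slope

# The proportional-odds fit pools the cut-points into one slope per term
coef(polr(Sat ~ Infl, weights = Freq, data = housing))
```

Because each binary model is written for P(Sat > j), its slopes have the same direction as the polr coefficients and can be compared with them directly.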

What exactly is the polr sign trap?

MASS::polr writes the model as logit(P(Y ≤ j)) = α_j − βx rather than the textbook α_j + βx. Because of the minus sign, the coefficient R prints has the opposite sign to what many textbooks use. If you forget this, you state the effect in the wrong direction. The safe rule: a positive polr coefficient increases the odds of falling in higher categories.
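The convention can be confirmed numerically. The sketch below simulates hypothetical data in which larger x pushes the response upwards, fits polr, and then rebuilds the cumulative probabilities by hand using the minus sign. All variable names and simulation settings are assumptions made for illustration.

```r
library(MASS)

set.seed(1)
n <- 2000
x <- rnorm(n)

# Simulated (hypothetical) data: larger x raises the latent score, and so Y
latent <- 1.0 * x + rlogis(n)
y <- cut(latent, breaks = c(-Inf, -1, 1, Inf),
         labels = c("low", "mid", "high"), ordered_result = TRUE)
dat <- data.frame(x = x, y = y)

fit <- polr(y ~ x, data = dat, Hess = TRUE)
b_polr <- unname(coef(fit))   # printed slope: positive when x raises Y
zeta <- fit$zeta              # cut-point intercepts alpha_j

# Rebuild P(Y <= j | x = 1) by hand: logit(P(Y <= j)) = alpha_j - beta * x
x0 <- 1
cum_by_hand <- plogis(zeta - b_polr * x0)

# Compare with polr's own predicted probabilities
probs <- predict(fit, newdata = data.frame(x = x0), type = "probs")
cum_from_predict <- cumsum(as.numeric(probs))[1:2]

max(abs(cum_by_hand - cum_from_predict))   # agrees up to floating-point error
b_textbook <- -b_polr                      # slope in the alpha_j + beta * x form
```

The hand calculation only matches predict() when the slope enters with a minus sign, which is the practical meaning of the trap: the printed coefficient is positive here even though it lowers P(Y ≤ j).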

What is a sequential (continuation-ratio) model?

An alternative for ordinal data that models the conditional odds of moving to the next category given you have reached the current one — natural when the categories represent stages reached in order (e.g. levels of education completed). It is one of the multicategory options alongside baseline-category and proportional-odds models.
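One common way to sketch a continuation-ratio model is as a sequence of binary logistic regressions, each fitted only to the cases that have reached the current stage. The example below uses simulated, hypothetical stage data and base R's glm(); it allows a separate slope at each transition and is an illustration of the idea rather than the course's prescribed fitting routine.

```r
set.seed(2)
n <- 1000
x <- rnorm(n)

# Hypothetical stages 1 < 2 < 3: each transition is passed with its own probability
pass1 <- rbinom(n, 1, plogis(0.5 + 0.8 * x))    # move from stage 1 to 2
pass2 <- rbinom(n, 1, plogis(-0.2 + 0.8 * x))   # move from stage 2 to 3
dat <- data.frame(x = x, stage = 1 + pass1 + pass1 * pass2)

# Transition 1 -> 2: everyone is at risk
dat$up1 <- as.integer(dat$stage > 1)
fit_12 <- glm(up1 ~ x, family = binomial, data = dat)

# Transition 2 -> 3: only those who reached stage 2 are at risk
dat2 <- subset(dat, stage >= 2)
dat2$up2 <- as.integer(dat2$stage > 2)
fit_23 <- glm(up2 ~ x, family = binomial, data = dat2)

rbind(stage_1_to_2 = coef(fit_12), stage_2_to_3 = coef(fit_23))
```

Each row gives the conditional log odds of moving on, given that the current stage was reached, which is the quantity the sequential model targets.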

Study strategy

Exam move

Lead with the decision: is the response ordered or not? Unordered → baseline-category logit (multinom), one coefficient vector per non-reference category; ordered → proportional-odds / cumulative-logit (polr), one slope per predictor plus cumulative intercepts, giving a single cumulative odds ratio. Be able to interpret e^β as a cumulative odds ratio and to state the proportional-odds assumption (same OR at every cut-point). Above all, drill the polr sign convention: the α_j − βx parameterisation means the printed coefficient has the opposite sign to the textbook α_j + βx slope, and getting the direction wrong is the most common error here. Know the sequential model as the third option. The exam reads multinom()/polr() output, so practise naming the model, computing the odds ratio, and stating the effect direction correctly.
