University of Melbourne · S1 2026 · FACULTY OF SCIENCE

MAST90105 · Methods Of Mathematical Statistics

50% final exam · hurdle · 14 Chapters · 2-page Bible
Our own words - no uploaded lecturer files
Built to mirror S1 2026 · updated this semester
Chapter 10 of 10 · MAST90105

Distribution-Free and Categorical Methods

The final teaching chapter handles two situations where the earlier normal-theory machinery does not directly apply. Distribution-free (non-parametric) methods make inferences without assuming a particular parametric form: the sign test and rank-based tests use only the signs or ordering of the data, trading some power for robustness when normality is doubtful. Categorical methods handle data that arrive as counts in cells, and here every test has the same shape: the chi-square statistic χ² = ∑(O − E)²/E, comparing the observed counts O with the counts E expected under a hypothesis. Two standard uses appear. The goodness-of-fit test asks whether the data match a claimed distribution. The test of independence in an r×c contingency table asks whether the row and column variables are associated, with expected counts E = (row total × column total)/grand total and degrees of freedom (r−1)(c−1). The chapter stresses the assumptions (expected counts large enough) and the correct degrees-of-freedom bookkeeping, the two places where marks are routinely lost.

In this chapter

What this chapter covers

  • 10.1 When to go distribution-free
  • 10.2 The sign test and rank-based tests
  • 10.3 The χ² statistic: ∑(O − E)²/E
  • 10.4 Goodness-of-fit to a claimed distribution
  • 10.5 Contingency tables and the test of independence
  • 10.6 Expected counts, degrees of freedom and assumptions
Worked example

Worked example: a χ² test of independence in a 2×2 table

Q [6 marks]. Of 200 people, the counts by (treatment, outcome) are: Treated & Improved = 60, Treated & Not = 40, Control & Improved = 30, Control & Not = 70. Test at the 5% level whether improvement is independent of treatment. (The upper 5% point of χ² on 1 df is 3.841.)
  • [+1] Hypotheses. H0: outcome is independent of treatment vs Ha: they are associated. Test statistic: χ² = ∑(O − E)²/E.
  • [+1] Margins. Row totals 100 and 100; column totals Improved = 90, Not = 110; grand total 200.
  • [+1] Expected counts. E = (row total × column total)/grand total: each ‘Improved’ cell expects 100·90/200 = 45; each ‘Not’ cell expects 100·110/200 = 55.
  • [+1] Chi-square. χ² = (60−45)²/45 + (40−55)²/55 + (30−45)²/45 + (70−55)²/55 = 5 + 4.09 + 5 + 4.09 = 18.18.
  • [+1] Degrees of freedom and critical value. df = (2−1)(2−1) = 1; reject if χ² exceeds the critical value 3.841.
  • [+1] Decide and interpret. 18.18 > 3.841, so reject H0: improvement and treatment are significantly associated, with the treated group improving more often than independence would predict (60 observed against 45 expected).
χ² = 18.18 on 1 df exceeds 3.841, so we reject independence at the 5% level: outcome is associated with treatment. The recipe is always the same: expected counts from the margins, sum (O−E)²/E, then compare with the χ² critical value on (r−1)(c−1) df.
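The same recipe can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than anything prescribed by the subject: the function names `expected_counts` and `chi_square_independence` are our own, and the critical value 3.841 is copied from the question, not computed.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Rows: Treated, Control. Columns: Improved, Not.
observed = [[60, 40], [30, 70]]
statistic, df = chi_square_independence(observed)
critical_value = 3.841  # quoted in the question for 1 df at the 5% level

print(expected_counts(observed))   # expected counts from the margins
print(round(statistic, 2), df)     # test statistic and degrees of freedom
print("reject H0" if statistic > critical_value else "do not reject H0")
```

Because the functions work from the row and column totals, the same code handles any r×c table; only the critical value has to be looked up for the new degrees of freedom.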
Glossary

Key terms

Distribution-free (non-parametric) test
A test that does not assume a specific parametric family for the data, typically using ranks or signs — e.g. the sign test for a median. More robust when normality is doubtful, at the cost of some power when it does hold.
Chi-square statistic
χ² = ∑(O − E)²/E, summing the squared, expectation-scaled gaps between observed counts O and counts E expected under H0. Large values signal a poor fit; it is compared to a χ² quantile with the appropriate degrees of freedom.
Goodness-of-fit test
A χ² test of whether observed category counts match a claimed distribution. Expected counts are the hypothesised probabilities times the sample size; degrees of freedom equal the number of categories minus one, less one more for each parameter estimated from the data.
Test of independence
A χ² test in an r×c contingency table of whether the row and column variables are associated. Expected counts are (row total × column total)/grand total; degrees of freedom are (r−1)(c−1).
Expected count
The count a cell would hold under the null, E = (row total × column total)/grand total for an independence test. The χ² approximation requires these to be reasonably large (a common rule of thumb is at least 5).
FAQ

Distribution-Free and Categorical Methods FAQ

When should I use a distribution-free test?

When the parametric assumptions behind a t or z test are doubtful — small samples with clear non-normality, heavy skew, or ordinal data where only ranks are meaningful. Sign and rank tests then give valid inferences using only the ordering of the observations. The trade-off is some loss of power when the data actually are normal, so prefer the parametric test when its assumptions hold.
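As an illustration of how a test can use only the signs of the observations, here is a minimal sketch of an exact two-sided sign test for a median. The function name `sign_test`, the data and the hypothesised median of 10 are all made up for the example. Under H0 each observation is equally likely to fall above or below the hypothesised median, so the number of plus signs is Binomial(n, 1/2) once ties are dropped.

```python
from math import comb


def sign_test(data, median0):
    """Exact two-sided sign test of H0: median = median0.

    Returns (number above, number below, p-value).
    """
    plus = sum(1 for x in data if x > median0)
    minus = sum(1 for x in data if x < median0)
    n = plus + minus          # observations equal to median0 are dropped
    k = min(plus, minus)
    # P(X <= k) for X ~ Binomial(n, 1/2), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return plus, minus, p_value


# Hypothetical sample; only whether each value is above or below 10 matters.
sample = [12.1, 9.8, 14.3, 11.7, 15.2, 13.4, 10.9, 16.0, 12.8, 13.9]
plus, minus, p_value = sign_test(sample, median0=10)
print(plus, minus, round(p_value, 4))
```

Here 9 of the 10 values lie above 10, giving a two-sided p-value of 22/1024, about 0.021. Note that the magnitudes are never used, which is where both the robustness and the loss of power come from.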

How do I get the degrees of freedom for a chi-square test?

For a goodness-of-fit test, df = (number of categories − 1), reduced by one further for each parameter you estimate from the data. For a test of independence in an r×c table, df = (r − 1)(c − 1). Getting the degrees of freedom wrong is the most common error here, because the critical value — and therefore the decision — depends on it.
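The goodness-of-fit bookkeeping can be made explicit in code. The sketch below is an assumption-laden illustration: `goodness_of_fit` and its `n_estimated` argument are hypothetical names of our own, the die counts are invented, and the critical value 11.070 for 5 df at the 5% level is taken from standard χ² tables.

```python
def goodness_of_fit(observed, probabilities, n_estimated=0):
    """Return (statistic, df) for a chi-square goodness-of-fit test.

    n_estimated is the number of parameters estimated from the data;
    each one removes a further degree of freedom.
    """
    n = sum(observed)
    expected = [n * p for p in probabilities]   # hypothesised probability * sample size
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - n_estimated
    return statistic, df


# Hypothetical data: 60 rolls of a die, tested against a fair die.
die_counts = [8, 12, 9, 11, 6, 14]
fair = [1 / 6] * 6
statistic, df = goodness_of_fit(die_counts, fair)
critical_value = 11.070  # table value for 5 df at the 5% level

print(round(statistic, 2), df)
print("reject H0" if statistic > critical_value else "do not reject H0")
```

With nothing estimated, six categories give 5 df and the statistic 4.2 falls well short of 11.070. If the hypothesised probabilities had instead come from a distribution with one parameter fitted to the same data, passing `n_estimated=1` would drop the degrees of freedom to 4 and change the critical value accordingly.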

What assumption does the chi-square test rely on?

That the expected cell counts are large enough for the χ² approximation to the sampling distribution of ∑(O−E)²/E to hold — a common rule of thumb is every expected count at least about 5. With sparse tables you may need to combine categories or use an exact test. Note the rule is about expected counts, not the observed ones.
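A quick way to make the assumption a deliberate step is to list the cells whose expected count falls below the rule-of-thumb threshold. The sketch below uses hypothetical names (`expected_counts`, `sparse_cells`) and two invented 2×2 tables; the threshold of 5 is the rule of thumb quoted above, not a hard limit.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def sparse_cells(table, threshold=5):
    """Return (row, column) indices of cells whose EXPECTED count is below threshold."""
    expected = expected_counts(table)
    return [
        (i, j)
        for i, row in enumerate(expected)
        for j, e in enumerate(row)
        if e < threshold
    ]


# Hypothetical table with a small OBSERVED count (2) but adequate expected counts.
fine_table = [[2, 18], [13, 17]]
# Hypothetical table whose first column has expected counts below 5.
sparse_table = [[3, 7], [2, 18]]

print(sparse_cells(fine_table))    # no cells flagged
print(sparse_cells(sparse_table))  # both first-column cells flagged
```

The first table shows why the check is on expected counts: an observed count of 2 is not a problem when the margins give expected counts of 6, 14, 9 and 21. In the second table the first-column expected counts are about 1.67 and 3.33, so combining categories or an exact test would be the safer route.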

Study strategy

Exam move

For categorical data, drill the single χ² recipe until it is automatic: compute the expected counts (from the hypothesised probabilities for goodness-of-fit, from the margins for independence), sum (O−E)²/E, get the degrees of freedom right, and compare with the table critical value. The degrees of freedom are (number of categories − 1) for goodness-of-fit, less one for each parameter estimated from the data, and (r−1)(c−1) for independence. The two reliable mark-losers are mis-counting the degrees of freedom and forgetting the expected-count assumption, so make both deliberate steps. For the distribution-free tests, know when to abandon the normal-theory tools and how the sign and rank tests use only the signs or ordering of the data, so a robustness question resolves cleanly.
