PUBH5010 · Epidemiology Methods And Uses
Measures of Association and the 2×2 Table
This is the engine of the whole paper: nearly every analytic answer starts by laying out the exposure × outcome 2×2 table (cells a, b, c, d) and then choosing the right measure to draw from it. From a cohort or RCT you can read risk in each arm, so you compute the risk ratio (RR = R₁/R₀) and the risk difference (RD = R₁ − R₀); from a case-control study, where sampling is on the outcome, you compute the odds ratio (OR = ad/bc) instead. RR and OR are ratio measures (how many times the risk or odds; 1 = no effect), while RD is an absolute measure (extra cases per head; 0 = no effect) and feeds directly into the number needed to treat or harm, NNT = 1/RD. Two further numbers extend the picture: the attributable fraction in the exposed (the share of their risk due to the exposure) and the population-attributable fraction (the share of total disease that would disappear if the exposure were removed, which also depends on how common the exposure is). The discipline of this chapter is simple but exam-critical: let the design choose the measure, and never quote an OR where the design gave you a risk.
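For reference, the measures named above can be written against the table cells. The layout assumed here is the conventional one (a = exposed with the outcome, b = exposed without, c = unexposed with, d = unexposed without). It is consistent with OR = ad/bc, but treat it as an assumption if your course labels the cells differently. NNT is written with the absolute value of RD so that it is always a positive count.

```latex
\begin{align*}
R_1 &= \frac{a}{a+b}, & R_0 &= \frac{c}{c+d} \\
\mathrm{RR} &= \frac{R_1}{R_0}, & \mathrm{RD} &= R_1 - R_0 \\
\mathrm{OR} &= \frac{a/b}{c/d} = \frac{ad}{bc}, & \mathrm{NNT} &= \frac{1}{|\mathrm{RD}|}
\end{align*}
```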
What this chapter covers
- 01. The exposure × outcome 2×2 table and cells a, b, c, d
- 02. Risk ratio (RR) and risk difference (RD) from a cohort or RCT
- 03. The odds ratio (OR) from a case-control study, and when OR ≈ RR
- 04. Ratio vs absolute measures, and what the null value is for each
- 05. Number needed to treat / harm: NNT = 1/RD
- 06. Attributable fraction in the exposed
- 07. Population-attributable fraction and the role of exposure prevalence
Worked example: when the odds ratio overstates the risk ratio
- (a) Risk ratio [+1]. R₁ = 300/1000 = 0.30, R₀ = 200/1000 = 0.20, so RR = 0.30/0.20 = 1.5.
- (b) Odds ratio [+2]. Odds in exposed = 300/700, in unexposed = 200/800, so OR = (300×800)/(700×200) = 240000/140000 ≈ 1.71.
- (c) Why they differ [+1]. Whenever there is an association (RR ≠ 1), the OR sits further from 1 than the RR, and the gap widens as the outcome gets more common (here 20–30%). The rare-disease approximation OR ≈ RR fails when the outcome is common.
- Which to report [+1]. This is a cohort, so risk is directly observable: report RR = 1.5. Reserve the OR for designs (case-control) where risk cannot be read off.
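The arithmetic in the steps above can be checked with a short sketch. It takes the counts the worked example uses (300 of 1000 exposed and 200 of 1000 unexposed with the outcome) and assumes the conventional cell layout a, b, c, d. The function name `two_by_two_measures` is illustrative, not part of any library; exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def two_by_two_measures(a, b, c, d):
    """Return RR, RD and OR from a 2x2 table.

    Assumed layout: a = exposed with outcome, b = exposed without,
    c = unexposed with outcome, d = unexposed without.
    """
    r1 = Fraction(a, a + b)  # risk in the exposed
    r0 = Fraction(c, c + d)  # risk in the unexposed
    return {
        "RR": r1 / r0,
        "RD": r1 - r0,
        "OR": Fraction(a * d, b * c),
    }


m = two_by_two_measures(300, 700, 200, 800)
print(float(m["RR"]))            # 1.5
print(float(m["RD"]))            # 0.1
print(round(float(m["OR"]), 2))  # 1.71
```

The OR (about 1.71) is further from 1 than the RR (1.5), as part (c) describes.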
Key terms
- Risk ratio (RR)
- The ratio of the outcome risk in the exposed to that in the unexposed, R₁/R₀. A ratio measure with null value 1: RR = 2 means twice the risk, RR < 1 means protection. It is computed from cohort or RCT data, where risk is directly observable.
- Risk difference (RD)
- The absolute difference in risk between exposed and unexposed, R₁ − R₀. Null value 0; it expresses the extra (or fewer) cases per head attributable to the exposure and is the basis of NNT = 1/RD. It conveys public-health impact in a way a ratio cannot.
- Odds ratio (OR)
- The ratio of the odds of the outcome (or exposure) between groups, ad/bc on the 2×2. It is the measure a case-control study licenses; it approximates the RR only when the outcome is rare, and otherwise sits further from 1 than the RR.
- Number needed to treat / harm (NNT/NNH)
- The reciprocal of the risk difference taken as a positive number, 1/|RD|, rounded up. It is the number of people you must treat (or expose) to produce one extra good (or bad) outcome, a clinically intuitive way to express absolute effect. A small NNT means a large absolute effect.
- Population-attributable fraction (PAF)
- The proportion of disease in the whole population that would be prevented if the exposure were eliminated. It depends on both the strength of the association and how common the exposure is, so a weak risk factor that is very widespread can have a large PAF.
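The two attributable fractions can be sketched from the RR alone, assuming the standard textbook formulas: AF in the exposed = (RR − 1)/RR, and Levin's formula for the PAF, p(RR − 1)/(1 + p(RR − 1)), where p is the prevalence of exposure in the population. Both assume the RR is unconfounded. The function names and the input values are hypothetical, chosen only to illustrate the point about a weak but widespread risk factor.

```python
def attributable_fraction_exposed(rr):
    """Share of risk among the exposed that is due to the exposure."""
    return (rr - 1) / rr


def population_attributable_fraction(rr, exposure_prevalence):
    """Levin's formula: share of all disease in the population due to the exposure."""
    excess = exposure_prevalence * (rr - 1)
    return excess / (1 + excess)


# Hypothetical inputs: a weak but common exposure vs a strong but rare one.
weak_common = population_attributable_fraction(rr=1.5, exposure_prevalence=0.60)
strong_rare = population_attributable_fraction(rr=5.0, exposure_prevalence=0.02)

print(f"AF in exposed (RR 1.5): {attributable_fraction_exposed(1.5):.2f}")
print(f"PAF, weak but common:   {weak_common:.3f}")
print(f"PAF, strong but rare:   {strong_rare:.3f}")
```

With these illustrative inputs the weak, widespread exposure accounts for a larger share of population disease than the strong, rare one.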
Measures of Association and the 2×2 Table FAQ
How do I know whether to use RR or OR?
Let the design decide. A cohort or RCT lets you observe the risk of the outcome in each exposure group, so you can and should compute the risk ratio (and risk difference). A case-control study samples on the outcome, so risk is not observable and you must use the odds ratio. Reporting an OR from cohort data is a common slip the exam tests for.
Why does the odds ratio overstate the risk ratio?
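Odds are risk/(1 − risk), so odds are always larger than the risk, and more so as the risk grows. As a result, whenever there is an association the OR sits further from the null (1) than the RR, and the gap grows as the outcome becomes more common. When the outcome is rare (say under ~10%), 1 − risk is close to 1, so the odds and the risk are close and OR ≈ RR: this is the rare-disease assumption. For common outcomes that approximation breaks down and the OR can badly exaggerate the effect.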
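To see the gap open up, the sketch below holds the RR fixed at 1.5 and raises the baseline risk R₀. The baseline risks are illustrative values, not data from any study, and `odds_ratio_from_risks` is a hypothetical helper name.

```python
def odds_ratio_from_risks(r1, r0):
    """Odds ratio implied by two risks, using odds = risk / (1 - risk)."""
    return (r1 / (1 - r1)) / (r0 / (1 - r0))


RR = 1.5
rows = []
for r0 in (0.01, 0.05, 0.10, 0.20, 0.40):  # illustrative baseline risks
    r1 = RR * r0
    rows.append((r0, odds_ratio_from_risks(r1, r0)))

for r0, odds_ratio in rows:
    print(f"R0 = {r0:.2f}   RR = {RR:.2f}   OR = {odds_ratio:.2f}")
```

At a baseline risk of 1% the OR is almost indistinguishable from the RR; by 40% it is well above it.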
What's the difference between a ratio measure and an absolute measure?
A ratio measure (RR, OR) says how many times more likely the outcome is — it is about relative strength and has a null of 1. An absolute measure (RD) says how many extra cases per head — it is about public-health impact and has a null of 0. A large RR on a rare outcome can still mean a tiny RD; the exam wants both the relative and the absolute reading.
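A quick illustration, using hypothetical risks not drawn from any study, of how a large ratio can sit alongside a tiny absolute effect:

```latex
\begin{align*}
R_1 &= \frac{3}{10\,000}, \qquad R_0 = \frac{1}{10\,000} \\
\mathrm{RR} &= \frac{R_1}{R_0} = 3 \\
\mathrm{RD} &= R_1 - R_0 = \frac{2}{10\,000} = 0.0002
\end{align*}
```

The risk is tripled, yet only 2 extra cases arise per 10,000 people exposed.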
What is the number needed to treat?
NNT = 1/RD (rounded up), using the size of RD and ignoring its sign. If a treatment cuts risk by an absolute 5 percentage points (RD = −0.05), you must treat 1/0.05 = 20 people to prevent one bad outcome. For a harmful exposure the same arithmetic gives the number needed to harm. A small NNT signals a large absolute benefit; a large NNT, a marginal one.
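A minimal sketch of that arithmetic, working from counts with exact fractions so that rounding up is not thrown off by floating-point error. The function name `number_needed` and the trial counts are hypothetical. It assumes the outcome is a bad one, so a negative RD is labelled NNT and a positive RD is labelled NNH.

```python
import math
from fractions import Fraction


def number_needed(events_treated, n_treated, events_control, n_control):
    """Return ('NNT' or 'NNH', count) from the two arms of a cohort or RCT."""
    rd = Fraction(events_treated, n_treated) - Fraction(events_control, n_control)
    if rd == 0:
        raise ValueError("RD is 0, so NNT/NNH is undefined (no absolute effect)")
    label = "NNT" if rd < 0 else "NNH"  # assumes the outcome is undesirable
    return label, math.ceil(1 / abs(rd))


# Hypothetical trial: 20% risk on treatment vs 25% on control (RD = -0.05).
print(number_needed(200, 1000, 250, 1000))  # ('NNT', 20)
```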
Exam move
Make the 2×2 your first stroke on every association question: lay out a, b, c, d, label the design, then pick the measure it licenses (cohort/RCT → RR & RD, case-control → OR). Compute both a ratio and an absolute measure when you can, because examiners reward the contrast. Know the null values cold (RR/OR = 1, RD = 0), remember the OR exaggerates when the outcome is common, and practise NNT = 1/RD and the attributable fractions. Above all, never report an odds ratio where the design handed you a risk.