University of Sydney · S1 2026 · FACULTY OF HEALTH & MEDICINE

PUBH5010 · Epidemiology Methods And Uses

50% final exam · hurdle · 14 Chapters · 2-page Bible
Chapter 4 of 10 · PUBH5010

Selection Bias

Selection bias arises when the people you actually analysed are not a fair window on the study base — the source population the question is really about — so the association you measure is distorted before any calculation. It enters through how people get into, or stay in, the study: choosing controls who differ systematically from the population that produced the cases; the healthy-worker effect, where employed people are healthier than the general population used as a comparison; loss to follow-up that is related to both exposure and outcome; volunteer and referral effects; and survivor effects in cross-sectional or prevalence data. The damage is specific, not vague: depending on the pattern, selection bias can push the measure of association toward or away from the null, or even reverse it, and — unlike confounding — you usually cannot fix it in the analysis, because the information you would need was never sampled. The exam test is to name where in the selection process the distortion entered and argue which way it likely moved the estimate, then say what design or sampling choice would have prevented it.

In this chapter

What this chapter covers

  • 01 What selection bias is: the analysed sample vs the study base
  • 02 Selection of inappropriate controls in case-control studies
  • 03 The healthy-worker effect
  • 04 Loss to follow-up related to exposure and outcome
  • 05 Volunteer, referral and survivor effects
  • 06 Predicting the direction of the distortion
  • 07 Why selection bias usually can't be fixed in the analysis
Worked example

Worked example: differential loss to follow-up in a cohort

Q [4 marks]. A cohort study of a workplace exposure finds RR = 1.0 (no association). On audit, exposed workers who became unwell were far more likely to leave the job and drop out of follow-up than unwell unexposed workers. (a) Name the bias. (b) State the likely direction of distortion. (c) Say whether it can be fixed in the analysis.
[Diagram: exposed + unwell → drop out (lost ✗); unexposed + unwell → stay (counted ✓). Cases are lost from the exposed arm only, so the observed RR is pulled toward (or below) 1.]
  • +1 (a) Name it. Outcome-related dropout that differs by exposure is selection bias from differential loss to follow-up.
  • +2 (b) Direction. Cases are selectively removed from the exposed arm, so the exposed group looks healthier than it is. The observed RR understates the true RR: the measure is pulled toward the null, and here a real effect is masked as RR = 1.0.
  • +1 (c) Fixable? No. The outcomes of the workers who left were never observed, so no adjustment can recover them. The remedy is prevention by design: minimise and track dropout, and compare characteristics of those lost vs retained.
Differential loss to follow-up (a selection bias); it biases the RR toward the null by removing cases from the exposed arm; it cannot be fixed analytically because the missing outcomes were never recorded.
Sia tip — For any selection bias, answer in three beats: where in selection it entered, which way it moved the estimate, and why analysis usually can't undo it. The 'can't fix it' point is what separates selection bias from confounding.
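
The worked example above can be put into numbers. The sketch below is illustrative only: the cohort sizes, case counts and dropout counts are hypothetical values chosen so that the arithmetic lands on RR = 1.0, and the function names are invented for this example. It contrasts dropout of unwell exposed workers only with dropout that hits every group equally.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk in the exposed arm divided by risk in the unexposed arm."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


def observed_rr_after_loss(cases_exp, n_exp, cases_unexp, n_unexp,
                           lost_cases_exp=0, lost_noncases_exp=0,
                           lost_cases_unexp=0, lost_noncases_unexp=0):
    """Risk ratio among people still under follow-up.

    Anyone lost leaves the denominator of their arm; lost cases also
    leave the numerator.
    """
    obs_cases_exp = cases_exp - lost_cases_exp
    obs_n_exp = n_exp - lost_cases_exp - lost_noncases_exp
    obs_cases_unexp = cases_unexp - lost_cases_unexp
    obs_n_unexp = n_unexp - lost_cases_unexp - lost_noncases_unexp
    return risk_ratio(obs_cases_exp, obs_n_exp, obs_cases_unexp, obs_n_unexp)


# Hypothetical full cohort: 1000 workers per arm,
# 190 cases among the exposed and 100 among the unexposed.
true_rr = risk_ratio(190, 1000, 100, 1000)

# Differential loss: 100 unwell exposed workers leave, nobody else is lost.
differential_rr = observed_rr_after_loss(190, 1000, 100, 1000,
                                         lost_cases_exp=100)

# Non-differential loss: 20% of every group is lost
# (cases and non-cases, exposed and unexposed alike).
random_rr = observed_rr_after_loss(190, 1000, 100, 1000,
                                   lost_cases_exp=38, lost_noncases_exp=162,
                                   lost_cases_unexp=20, lost_noncases_unexp=180)

print(f"true RR:               {true_rr:.2f}")
print(f"differential loss RR:  {differential_rr:.2f}")
print(f"random 20% loss RR:    {random_rr:.2f}")
```

With these made-up counts the true RR is 1.9, the differential-loss RR is 1.0, and the random-loss RR stays at 1.9. The lost cases have to be typed in by hand here, which is exactly what a real study cannot do: those outcomes were never recorded.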
Glossary

Key terms

Selection bias
A systematic error arising when the people included in (or retained by) a study differ from the study base in a way related to both exposure and outcome, distorting the measure of association. Unlike confounding, it usually cannot be corrected in the analysis.
Study base
The source population and time window that generates the cases. Selection bias is fundamentally a mismatch between the analysed sample and this base — most often a control group that does not represent the population the cases came from.
Healthy-worker effect
A selection bias in occupational studies: employed people are on average healthier than the general population, so using the general population as the comparison understates an occupational hazard. The fix is an internal comparison group within the workforce.
Loss to follow-up
Participants leaving a cohort before the outcome is observed. It biases the result only when the loss is related to both exposure and outcome (differential loss); random loss reduces precision but not validity.
Survivor / prevalence bias
In cross-sectional or prevalence-based studies, only those who survived with the disease long enough to be sampled are captured, so determinants of survival get confused with determinants of disease.
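
Survivor bias can be sketched with the standard steady-state approximation that prevalence odds equal incidence rate times mean disease duration. This is a simplified illustration: the rates and durations are hypothetical, and a stable population is assumed. In it, the exposure does nothing to incidence but shortens survival with the disease.

```python
def prevalence_odds(incidence_rate, mean_duration_years):
    """Steady-state approximation: prevalence odds = incidence x duration."""
    return incidence_rate * mean_duration_years


# Hypothetical values: the same incidence in both groups, but exposed
# cases survive with the disease for half as long as unexposed cases.
incidence_per_person_year = 0.01
duration_exposed_years = 2.0
duration_unexposed_years = 4.0

# What a cohort study of new cases would estimate.
incidence_rate_ratio = incidence_per_person_year / incidence_per_person_year

# What a cross-sectional survey of existing cases would estimate.
prevalence_odds_ratio = (
    prevalence_odds(incidence_per_person_year, duration_exposed_years)
    / prevalence_odds(incidence_per_person_year, duration_unexposed_years)
)

print(f"incidence rate ratio:  {incidence_rate_ratio:.2f}")
print(f"prevalence odds ratio: {prevalence_odds_ratio:.2f}")
```

Under these assumptions the incidence rate ratio is 1.0 but the prevalence odds ratio is 0.5. The exposure looks protective in prevalence data only because exposed cases leave the prevalent pool sooner.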
FAQ

Selection Bias FAQ

How is selection bias different from confounding?

Selection bias comes from how people entered or stayed in the study — the analysed sample is not a fair window on the study base. Confounding comes from a third variable mixed into a correctly-sampled comparison. The practical difference is fixability: confounding can often be adjusted for in the analysis if you measured the confounder; selection bias usually cannot, because the needed information was never sampled.

Which way does selection bias push the estimate?

It depends on the pattern — that is the whole exam skill. It can bias toward the null (e.g. cases selectively lost from the exposed arm masking a real effect), away from the null (e.g. controls chosen to be unusually unexposed inflating an OR), or even reverse the direction. You argue the direction from who was selectively included or excluded and how that group sits on exposure and outcome.
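
As a rough numerical illustration of arguing direction from who was selected, the sketch below holds one set of cases fixed and swaps in three control groups. All counts are hypothetical, including the assumption that 30% of the study base is exposed.

```python
def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls):
    """Exposure odds ratio from a 2x2 case-control table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical cases: 100 exposed, 100 unexposed.
exposed_cases, unexposed_cases = 100, 100

# Three hypothetical control groups of 200 people each.
control_groups = {
    "representative (30% exposed, as in the study base)": (60, 140),
    "unusually unexposed (15% exposed)": (30, 170),
    "unusually exposed (50% exposed)": (100, 100),
}

results = {}
for label, (exposed_controls, unexposed_controls) in control_groups.items():
    results[label] = odds_ratio(exposed_cases, unexposed_cases,
                                exposed_controls, unexposed_controls)
    print(f"{label}: OR = {results[label]:.2f}")
```

With these counts the representative controls give an OR of about 2.33, the unusually unexposed controls inflate it to about 5.67 (away from the null), and the unusually exposed controls shrink it to 1.00 (toward the null). The cases never change; only the control selection does.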

What is the healthy-worker effect?

Working people are healthier than the general population because illness keeps people out of work, so comparing a workforce to the general population understates occupational risk — the exposed group is artificially healthy. The standard remedy is an internal comparison: contrast more-exposed with less-exposed workers, not workers with the public.
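
A small arithmetic sketch shows why the internal comparison matters. The mortality rates below are hypothetical, picked only to show the pattern: unexposed workers are healthier than the general population, and exposed workers sit between the two.

```python
def rate_ratio(rate_index, rate_reference):
    """Ratio of the index group's rate to the reference group's rate."""
    return rate_index / rate_reference


# Hypothetical mortality rates per 1000 person-years.
rate_general_population = 10.0   # includes people too ill to work
rate_unexposed_workers = 6.0     # healthy enough to be employed
rate_exposed_workers = 9.0       # same workforce, with the hazard

# External comparison: exposed workers vs the general population.
external_ratio = rate_ratio(rate_exposed_workers, rate_general_population)

# Internal comparison: exposed vs unexposed workers.
internal_ratio = rate_ratio(rate_exposed_workers, rate_unexposed_workers)

print(f"external comparison: {external_ratio:.2f}")
print(f"internal comparison: {internal_ratio:.2f}")
```

With these figures the external comparison gives 0.90, so the hazard looks absent or even protective, while the internal comparison gives 1.50. Both groups in the internal comparison passed the same selection into employment, so that selection no longer distorts the contrast.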

Can you adjust for selection bias afterwards?

Generally no. The information you would need — the outcomes of people who were never sampled or who dropped out — simply was not collected, so there is nothing to adjust. The defence is in the design: representative control selection, internal comparison groups, and minimising and characterising loss to follow-up. This non-fixability is exactly why examiners stress getting selection right up front.

Study strategy

Exam move

Answer every selection-bias prompt in three beats: (1) where in the selection or retention process the distortion entered, (2) which direction it likely moved the estimate (toward, away from, or across the null), and (3) why the analysis usually cannot undo it. Memorise the named patterns — inappropriate controls, healthy-worker effect, differential loss to follow-up, volunteer/referral/survivor effects — and for each, the design choice that prevents it. The contrast with confounding (fixable vs not) is a favourite discriminating question.
