PUBH5010 · Epidemiology Methods And Uses
Screening and Diagnostic Tests
Evaluating a test reduces to a second 2×2 — the diagnostic table — with a strict reading rule: columns are accuracy, rows are predictive value. Down the columns (true disease status) live sensitivity (the share of true cases the test catches) and specificity (the share of true non-cases it correctly clears); these are intrinsic properties of the test. Across the rows (test result) live the positive and negative predictive values — given a positive (or negative) result, the chance the person truly is (or isn’t) diseased — and these depend on prevalence: the same test gives a far lower PPV in a low-prevalence population, the single most-tested idea here. Screening adds a public-health layer: the Wilson & Jungner principles set out when population screening is justified (an important condition with a recognisable early stage, a suitable test, an available treatment, and benefit outweighing harm), and two biases inflate apparent benefit — lead-time bias (earlier diagnosis lengthens measured survival without postponing death) and length-time bias (screening preferentially catches slow, indolent cases). The exam asks you to compute the four quantities from a table, explain why PPV moves with prevalence, and judge a screening programme against Wilson & Jungner while spotting these biases.
What this chapter covers
- 01 The diagnostic 2×2: columns = accuracy, rows = predictive value
- 02 Sensitivity and specificity as intrinsic test properties
- 03 Positive and negative predictive value, read across the rows
- 04 Why PPV and NPV depend on prevalence
- 05 The sensitivity–specificity trade-off and the cut-off
- 06 Wilson & Jungner principles for justified screening
- 07 Lead-time and length-time bias
Worked example: why a good test has poor PPV when disease is rare
- +1 (a) Diseased column. With 1% prevalence in a population of 10,000, 100 people have the disease. Sensitivity 95% → TP = 95, FN = 5.
- +2 Non-diseased column. The other 9,900 do not. Specificity 90% → TN = 8,910, and false positives FP = 10% × 9,900 = 990.
- +2 (b) PPV. PPV = TP/(TP+FP) = 95/(95+990) = 95/1,085 ≈ 8.8%.
- +1 (c) Explain. Even a fairly specific test produces many false positives when the non-diseased group is huge, so they swamp the few true positives. At low prevalence a positive result is usually a false alarm: here about 91% of positives (990 of 1,085) are false, so PPV is low despite 95% sensitivity and 90% specificity.
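The arithmetic in steps (a) to (c) can be reproduced with a short Python sketch. The function name `diagnostic_table` and its parameter names are illustrative choices, not part of the course material; the inputs are the figures used in the worked example.

```python
def diagnostic_table(population, prevalence, sensitivity, specificity):
    """Expected 2x2 cell counts when a test is applied to a population."""
    diseased = population * prevalence        # diseased column total
    non_diseased = population - diseased      # non-diseased column total
    tp = diseased * sensitivity               # true positives
    fn = diseased - tp                        # false negatives
    tn = non_diseased * specificity           # true negatives
    fp = non_diseased - tn                    # false positives
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Figures from the worked example: 10,000 people, 1% prevalence,
# 95% sensitivity, 90% specificity.
table = diagnostic_table(10_000, 0.01, 0.95, 0.90)
ppv = table["TP"] / (table["TP"] + table["FP"])

print({cell: round(count) for cell, count in table.items()})
print(f"PPV = {ppv:.1%}")
```

Changing only the `prevalence` argument and re-running shows how the false-positive count, and with it the PPV, shifts while sensitivity and specificity stay fixed.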
Key terms
- Sensitivity
- The proportion of people who truly have the disease that the test correctly identifies as positive, TP/(TP+FN). It is read down the diseased column of the diagnostic 2×2 and is an intrinsic property of the test that does not change with prevalence. A highly sensitive test is good for ruling disease out when negative, because it misses few true cases.
- Specificity
- The proportion of people who truly do not have the disease that the test correctly identifies as negative, TN/(TN+FP). It is read down the non-diseased column and is likewise intrinsic to the test. A highly specific test is good for ruling disease in when positive, because it produces few false positives.
- Positive predictive value (PPV)
- Given a positive test, the probability the person truly has the disease, TP/(TP+FP). Read across the test-positive row, it depends on prevalence: in a low-prevalence population even a good test yields a low PPV because false positives outnumber true ones.
- Wilson & Jungner criteria
- The classic principles for when population screening is justified: the condition is an important health problem with a recognisable latent/early stage, a suitable and acceptable test exists, an accepted treatment is available, facilities for diagnosis and treatment exist, and the benefits outweigh the costs and harms.
- Lead-time bias
- The apparent lengthening of survival caused purely by diagnosing disease earlier through screening, without actually delaying death. Survival measured from diagnosis looks longer even if the person dies at the same time, falsely crediting screening with benefit.
Screening and Diagnostic Tests FAQ
What's the difference between sensitivity/specificity and predictive values?
Sensitivity and specificity are read down the columns (true disease status) and are intrinsic properties of the test — they do not change with how common the disease is. Predictive values (PPV, NPV) are read across the rows (test result) and answer the clinically useful question 'given this result, how likely is the disease?' — and they depend heavily on prevalence. The columns-vs-rows rule keeps the four quantities straight.
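As a minimal sketch of the columns-versus-rows rule, the snippet below computes all four quantities from the cell counts of a diagnostic 2×2. The function name `diagnostic_measures` is an illustrative choice; the counts are those from the worked example earlier in this chapter.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Return the four standard quantities from 2x2 cell counts."""
    return {
        "sensitivity": tp / (tp + fn),  # down the diseased column
        "specificity": tn / (tn + fp),  # down the non-diseased column
        "ppv": tp / (tp + fp),          # across the test-positive row
        "npv": tn / (tn + fn),          # across the test-negative row
    }


# Cell counts from the worked example (1% prevalence in 10,000 people).
measures = diagnostic_measures(tp=95, fp=990, fn=5, tn=8910)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Each column-based measure uses only the cells of one disease-status column, while each row-based measure mixes cells from both columns, which is why only the latter are affected by the ratio of diseased to non-diseased people.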
Why does PPV fall when disease is rare?
Because the pool of non-diseased people is large, even a small false-positive rate generates many false positives, which can swamp the relatively few true positives. So among everyone who tests positive, a large share are actually disease-free, dragging the PPV down. Sensitivity and specificity stay the same — only the predictive values move with prevalence. This is why screening low-risk populations produces so many false alarms.
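The same argument can be written as a formula. The notation here is an assumption for illustration: $p$ is prevalence, $Se$ is sensitivity and $Sp$ is specificity. Dividing each cell of the 2×2 by the population size gives:

```latex
\mathrm{PPV} = \frac{Se \cdot p}{Se \cdot p + (1 - Sp)(1 - p)}
\qquad
\mathrm{NPV} = \frac{Sp\,(1 - p)}{Sp\,(1 - p) + (1 - Se)\,p}

% Se = 0.95, Sp = 0.90, p = 0.01 (the worked example):
\mathrm{PPV} = \frac{0.95 \times 0.01}{0.95 \times 0.01 + 0.10 \times 0.99}
             = \frac{0.0095}{0.1085} \approx 0.088

% Same test, hypothetical higher-risk group with p = 0.20:
\mathrm{PPV} = \frac{0.95 \times 0.20}{0.95 \times 0.20 + 0.10 \times 0.80}
             = \frac{0.19}{0.27} \approx 0.70
```

With $Se$ and $Sp$ held fixed, the false-positive term $(1 - Sp)(1 - p)$ dominates the denominator as $p$ shrinks, which is the "swamping" described above. The 20% prevalence is an invented comparison figure, not one taken from the chapter.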
When is population screening justified?
Use the Wilson & Jungner principles: the condition should be an important problem with a recognisable early or latent stage; a suitable, acceptable, sufficiently accurate test must exist; an effective accepted treatment and the facilities to deliver it must be available; early treatment must improve outcomes; and the overall benefits must outweigh the financial and human costs (including false positives and overdiagnosis). Failing several of these argues against screening.
What are lead-time and length-time bias?
Both make screening look better than it is. Lead-time bias is the artificial lengthening of survival measured from diagnosis simply because screening finds disease earlier, without postponing death. Length-time bias is screening's tendency to preferentially catch slow-growing, indolent cases (which spend longer in a detectable pre-clinical phase) over aggressive fast ones, so screen-detected cases look to have better outcomes regardless of treatment. Mortality-based, randomised evaluation guards against both.
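Both biases can be illustrated with invented numbers. The sketch below is not an analysis of any real programme: every age, duration and proportion is hypothetical, and the length-time part assumes a one-off screen that detects every case currently in its pre-clinical phase, with new cases arising at a steady rate.

```python
# Lead-time bias: death occurs at the same age either way; only the
# date of diagnosis moves (all ages are hypothetical).
age_at_death = 70
age_clinical_diagnosis = 65   # diagnosed when symptoms appear
age_screen_diagnosis = 62     # diagnosed earlier by screening

survival_clinical = age_at_death - age_clinical_diagnosis
survival_screened = age_at_death - age_screen_diagnosis
lead_time = age_clinical_diagnosis - age_screen_diagnosis
print(survival_clinical, survival_screened, lead_time)  # 5 8 3

# Length-time bias: the chance that a case is in its detectable
# pre-clinical phase on screening day is proportional to how long
# that phase lasts (hypothetical shares and durations).
incidence_share = {"fast": 0.5, "slow": 0.5}   # share of new cases
preclinical_years = {"fast": 1, "slow": 4}     # detectable window

weights = {kind: incidence_share[kind] * preclinical_years[kind]
           for kind in incidence_share}
total = sum(weights.values())
screen_detected_share = {kind: w / total for kind, w in weights.items()}
print(screen_detected_share)  # {'fast': 0.2, 'slow': 0.8}
```

In the first part, measured survival rises from 5 to 8 years with no change in age at death. In the second, slow cases make up half of all disease but 80% of screen-detected disease, so screen-detected cases would look to have better outcomes even if treatment did nothing.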
Exam move
Anchor everything on the reading rule: columns are accuracy (sensitivity, specificity — intrinsic, prevalence-independent); rows are predictive value (PPV, NPV — prevalence-dependent). Practise building the diagnostic 2×2 from a prevalence, sensitivity and specificity and computing all four numbers, and be ready to explain in words why PPV drops at low prevalence (false positives swamp true ones) — the single most-tested idea. For the public-health side, memorise the Wilson & Jungner checklist and the two screening biases (lead-time, length-time) so you can judge and critique a screening programme.