Core Curriculum
Module 6: Screening & Diagnostic Testing
Core Topics
Validity, reliability, sensitivity, specificity, predictive values, ROC curves, and screening biases — the statistical foundation for selecting and interpreting diagnostic tests.
📋 6.1 Properties of Screening Tests
Validity (Accuracy)
Degree to which a test measures what it is intended to measure. Assessed by sensitivity and specificity against a gold standard.
Reliability (Precision)
Consistency on repeated measurements. Kappa (κ) for categorical: <0.4 poor, 0.4‑0.6 moderate, 0.6‑0.8 substantial, >0.8 excellent. ICC for continuous data.
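As an illustration, Cohen's κ can be computed by hand from a 2×2 agreement table for two raters; the counts below are invented for the example:

```python
# Cohen's kappa for two raters' binary ratings (hypothetical counts).
# Agreement table: rows = rater 1, columns = rater 2.
#              Rater 2 +   Rater 2 -
# Rater 1 +       40           5
# Rater 1 -       10          45
a, b, c, d = 40, 5, 10, 45
n = a + b + c + d
po = (a + d) / n                        # observed agreement
p_pos = ((a + b) / n) * ((a + c) / n)   # chance agreement on "+"
p_neg = ((c + d) / n) * ((b + d) / n)   # chance agreement on "-"
pe = p_pos + p_neg                      # total expected (chance) agreement
kappa = (po - pe) / (1 - pe)
print(round(kappa, 2))  # 0.7 -> "substantial" on the scale above
```

Note that κ corrects the observed agreement (85% here) for the agreement expected by chance alone, which is why it lands well below the raw percent agreement.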
Screening vs. Diagnostic Tests: Screening → asymptomatic populations, high sensitivity (rule out). Diagnostic → symptomatic, balance sensitivity/specificity to confirm disease.
📌 Wilson‑Jungner Criteria for Screening Programs: Important health problem; recognizable early stage; suitable, acceptable test; effective treatment; cost‑effective.
✅ Appropriate: cervical (Pap), colorectal (colonoscopy/FIT), breast (mammography). ⚠️ Controversial: PSA (prostate) due to overdiagnosis.
🎯 6.2 Sensitivity and Specificity
Sensitivity = TP/(TP+FN): probability of a positive test given disease. SnNOut: a highly Sensitive test with a Negative result rules Out disease.
Specificity = TN/(TN+FP): probability of a negative test given no disease. SpPIn: a highly SPecific test with a Positive result rules In disease.
Cutoff threshold: Lower threshold ↑ sensitivity ↓ specificity; higher threshold ↓ sensitivity ↑ specificity.
📊 Example: Troponin for MI — low cutoff for screening (high sensitivity), high cutoff for confirmation (high specificity).
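The cutoff trade-off can be shown numerically. The sketch below uses made-up values for a hypothetical continuous biomarker (not real troponin data) and computes sensitivity and specificity at a low and a high cutoff:

```python
# Hypothetical biomarker values for diseased and healthy subjects (invented).
diseased = [0.8, 1.2, 2.5, 3.1, 4.0, 5.6, 0.3, 2.2]
healthy  = [0.1, 0.2, 0.4, 0.9, 0.3, 0.5, 1.1, 0.2]

def sens_spec(cutoff):
    """Classify value >= cutoff as test-positive; return (sens, spec)."""
    tp = sum(x >= cutoff for x in diseased)   # diseased, test positive
    fn = len(diseased) - tp                   # diseased, test negative
    tn = sum(x < cutoff for x in healthy)     # healthy, test negative
    fp = len(healthy) - tn                    # healthy, test positive
    return tp / (tp + fn), tn / (tn + fp)

for cutoff in (0.5, 1.0):
    se, sp = sens_spec(cutoff)
    print(f"cutoff {cutoff}: sensitivity {se:.2f}, specificity {sp:.2f}")
```

With these numbers the lower cutoff gives higher sensitivity but lower specificity, and raising the cutoff reverses the trade, exactly as described above.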
📊 6.3 Predictive Values
In the standard 2×2 table (a = true positives, b = false positives, c = false negatives, d = true negatives):
Positive Predictive Value (PPV) = a/(a+b) — probability of disease given a positive test.
Negative Predictive Value (NPV) = d/(c+d) — probability of no disease given a negative test.
Predictive values depend on prevalence (pretest probability): high prevalence → ↑ PPV, ↓ NPV; low prevalence → ↓ PPV, ↑ NPV.
📌 Example: Test with 95% sensitivity, 95% specificity.
Prevalence 50% → PPV = 95%, NPV = 95%.
Prevalence 1% → PPV ≈ 16%, NPV ≈ 99.9%.
Screening rare disease yields many false positives (low PPV).
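The worked example above follows directly from Bayes' theorem, which this short sketch reproduces for the same test (95% sensitivity, 95% specificity) at both prevalences:

```python
def predictive_values(sens, spec, prev):
    """PPV and NPV from sensitivity, specificity, and prevalence (Bayes' theorem)."""
    ppv = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    npv = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
    return ppv, npv

for prev in (0.50, 0.01):
    ppv, npv = predictive_values(0.95, 0.95, prev)
    print(f"prevalence {prev:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
# prevalence 50%: PPV 95.0%, NPV 95.0%
# prevalence 1%: PPV 16.1%, NPV 99.9%
```

At 1% prevalence, the false positives from the 99% of subjects without disease swamp the true positives, which is why PPV collapses to about 16% even though the test itself is unchanged.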
📈 6.4 ROC Curves
ROC curve: plots sensitivity (y‑axis) vs. 1‑specificity (x‑axis) across all thresholds. Diagonal line = chance (AUC = 0.5).
Area Under Curve (AUC)
0.5–0.7: poor; 0.7–0.8: acceptable; 0.8–0.9: excellent; 0.9–1.0: outstanding.
Optimal cutoff: point closest to upper‑left corner, or chosen based on clinical consequences (screening vs. confirmation). Higher AUC indicates better overall test discrimination.
📌 Example: Test A AUC 0.85 vs. Test B AUC 0.75 → Test A discriminates better.
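AUC has a convenient probabilistic interpretation: it equals the probability that a randomly chosen diseased subject scores higher on the test than a randomly chosen healthy one. A minimal sketch using invented biomarker values:

```python
# AUC via the Mann-Whitney interpretation: fraction of (diseased, healthy)
# pairs in which the diseased subject has the higher value.
# Values are hypothetical, chosen only to illustrate the calculation.
diseased = [0.8, 1.2, 2.5, 3.1, 4.0]
healthy  = [0.1, 0.2, 0.4, 0.9, 0.3]

pairs = [(d, h) for d in diseased for h in healthy]
# Ties count as half a "win" for the diseased subject.
auc = sum((d > h) + 0.5 * (d == h) for d, h in pairs) / len(pairs)
print(auc)  # 0.96 -> "outstanding" discrimination on the scale above
```

This pairwise definition is equivalent to the area under the ROC curve traced out by sweeping the cutoff across all thresholds.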
⚠️ 6.5 Bias in Screening
Lead‑Time Bias
Earlier detection increases survival time from diagnosis without changing age at death. Survival appears longer, but no true mortality benefit.
Length‑Time Bias
Screening preferentially detects slow‑growing, less aggressive disease, making the screened group appear to have better outcomes.
Overdiagnosis
Detection of disease that would never cause symptoms or death. Leads to unnecessary treatment, anxiety, and costs. Examples: ductal carcinoma in situ (DCIS), small thyroid cancers.
Volunteer Bias
Volunteers for screening tend to be healthier and more health-conscious than non-participants, inflating apparent screening benefits.
📊 Mitigation: Use randomized trials with mortality as endpoint; avoid reliance on survival time; be aware of overdiagnosis when considering screening programs.