Correlation coefficients, linear and logistic regression, probability theory, decision trees, likelihood ratios, and Bayes’ theorem — the statistical tools for modeling relationships and updating diagnostic certainty.
Pearson correlation coefficient (r): measures strength and direction of a linear relationship between two continuous variables. Range –1 to +1. |r| > 0.7 = strong; 0.3–0.7 = moderate; <0.3 = weak. r² (coefficient of determination) = proportion of variance in Y explained by X.
Spearman rank correlation (ρ): non‑parametric alternative for ordinal data or non‑normal distributions. Correlation ≠ causation – associations may be due to confounding, reverse causation, or chance.
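The definition of r above can be sketched from first principles. A minimal example, using made-up paired values (the data are illustrative, not from the text):

```python
# Sketch: Pearson r computed from its definition on invented paired data.
from math import sqrt

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
sx = sqrt(sum((xi - mx) ** 2 for xi in x))
sy = sqrt(sum((yi - my) ** 2 for yi in y))

r = cov / (sx * sy)      # strength and direction of the linear association
r_squared = r ** 2       # proportion of variance in y explained by x
print(round(r, 3), round(r_squared, 3))
```

Here |r| > 0.7, so by the rule of thumb above this would count as a strong (and here near-perfect) linear relationship.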
Simple linear regression (Y = a + bX). Slope (b): change in Y per one‑unit increase in X. Intercept (a): value of Y when X = 0. R²: proportion of variance explained. Residuals (observed – predicted) should be randomly scattered (homoscedasticity) and approximately normal.
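The slope and intercept come from least squares; a short sketch with invented data (not from the text) showing that the residuals behave as described:

```python
# Sketch: least-squares slope/intercept for simple linear regression,
# plus residuals (observed - predicted). Data are illustrative.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
    sum((xi - mx) ** 2 for xi in x)       # slope: change in y per unit x
a = my - b * mx                           # intercept: predicted y at x = 0

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
# OLS residuals always sum to (numerically) zero; plot them against x
# to check for homoscedasticity (random scatter, no funnel shape).
print(round(b, 2), round(a, 2), [round(e, 2) for e in residuals])
```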
Multiple linear regression allows adjustment for confounders. Each coefficient (b₁) represents the change in Y per unit change in X₁, holding all other variables constant. Adjusted R² penalizes addition of irrelevant predictors. Multicollinearity (high correlation between predictors) inflates standard errors; detected via VIF (variance inflation factor).
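For the two-predictor case, the VIF has a closed form, VIF = 1/(1 − r²), where r is the correlation between the predictors. A minimal sketch with invented, nearly collinear predictors:

```python
# Sketch: VIF for two predictors via their Pearson correlation.
# x1 and x2 are invented, deliberately near-collinear values.
from math import sqrt

x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 2.3, 2.9, 4.2, 4.9]

def pearson_r(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

r = pearson_r(x1, x2)
vif = 1 / (1 - r ** 2)    # a VIF above ~5-10 commonly flags collinearity
print(round(r, 3), round(vif, 1))
```

Here the two predictors are almost perfectly correlated, so the VIF is huge and the coefficient standard errors would be badly inflated.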
Used for binary outcomes (disease/no disease). Models log‑odds: log(p/(1‑p)) = a + b₁X₁ + …, where p is the probability of the outcome. Exponentiating a coefficient (e^b₁) gives the odds ratio per unit change in X₁.
Model discrimination: C‑statistic (AUC) measures ability to distinguish cases from controls. AUC 0.5 = no better than chance; 0.7‑0.8 = acceptable; 0.8‑0.9 = excellent; >0.9 = outstanding.
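Converting a fitted log-odds back to a predicted probability inverts the logit. A sketch with hypothetical coefficients (a = −4.0, b₁ = 0.8 are invented for illustration, not from the text):

```python
# Sketch: logistic model prediction from hypothetical coefficients.
from math import exp

a, b1 = -4.0, 0.8   # invented intercept and coefficient

def predicted_prob(x1):
    log_odds = a + b1 * x1
    return 1 / (1 + exp(-log_odds))   # inverse of log(p/(1-p))

p = predicted_prob(5.0)               # log-odds = -4 + 0.8*5 = 0 -> p = 0.5
odds_ratio = exp(b1)                  # e^b1: odds ratio per unit of X1
print(p, round(odds_ratio, 2))
```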
LR+ = sensitivity/(1 – specificity); LR‑ = (1 – sensitivity)/specificity. LR+ > 10 → large increase in posttest probability; LR‑ < 0.1 → large decrease. LRs are independent of prevalence and can be multiplied for sequential independent tests.
The Fagan nomogram provides a graphical shortcut: a straight line from pretest probability through the likelihood ratio gives posttest probability. For sequential testing, use the posttest probability from one test as the pretest probability for the next.
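The sequential updating described above is just the odds form of Bayes' theorem: pretest odds × LR = posttest odds. A sketch, with invented pretest probability and LR values:

```python
# Sketch: sequential updating with likelihood ratios (odds form of
# Bayes' theorem). Pretest probability and LRs are example numbers.

def posttest_probability(pretest_p, lr):
    pretest_odds = pretest_p / (1 - pretest_p)
    posttest_odds = pretest_odds * lr          # pretest odds x LR
    return posttest_odds / (1 + posttest_odds)

p = 0.20                                       # pretest probability 20%
p = posttest_probability(p, 10.0)              # positive test, LR+ = 10
p = posttest_probability(p, 0.5)               # second independent test
print(round(p, 3))
```

Note that multiplying both LRs first (10 × 0.5 = 5) and updating once gives the same answer, which is why independent tests' LRs can be combined.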