1. Sensitivity, Specificity & Predictive Values
(Screening & Diagnostic Accuracy Measures)
When are they used?
Used when evaluating how good a screening or diagnostic test is.
Every patient has two statuses:
- Disease present or disease absent (the truth)
- Test positive or test negative (the result)
We arrange these in a two-way table:
| | Disease Present | Disease Absent |
| --- | --- | --- |
| Test Positive | A (true positive) | B (false positive) |
| Test Negative | C (false negative) | D (true negative) |
Sensitivity
“Positive in disease”
If someone has the disease, how often is the test positive?
Sensitivity = A / (A + C)
High sensitivity = low false negatives.
Specificity
“Negative in health”
If someone is healthy, how often is the test negative?
Specificity = D / (B + D)
High specificity = low false positives.
Positive Predictive Value (PPV)
If the test is positive, what is the chance the patient actually has the disease?
PPV = A / (A + B)
PPV depends heavily on prevalence.
Negative Predictive Value (NPV)
If the test is negative, what is the chance the patient is actually free of disease (healthy)?
NPV = D / (C + D)
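The four measures above follow directly from the 2×2 cell counts. A minimal Python sketch (the function name is my own; the cell labels A–D match the table above):

```python
def accuracy_measures(a, b, c, d):
    """Screening-test measures from 2x2 cell counts.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": a / (a + c),  # positive in disease
        "specificity": d / (b + d),  # negative in health
        "ppv": a / (a + b),          # P(disease | positive test)
        "npv": d / (c + d),          # P(healthy | negative test)
    }
```

For instance, `accuracy_measures(20, 30, 5, 45)` reproduces the book's gastric-cancer figures (sensitivity 0.8, specificity 0.6, PPV 0.4, NPV 0.9).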
Likelihood Ratio (LR)
How much a test result shifts the probability of disease.
LR+ (Positive Likelihood Ratio)
If the test is positive, how many times more likely is that result in a diseased patient than in a healthy one?
LR+ = Sensitivity / (1 − Specificity)
Values >10 = strong rule-in
LR− (Negative Likelihood Ratio)
If the test is negative, how much does it reduce the chance of disease?
LR− = (1 − Sensitivity) / Specificity
Interpretation:
- <0.1 → strong rule-out
- 0.1–0.2 → moderate rule-out
- 1 → no diagnostic value
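Both ratios follow mechanically from sensitivity and specificity. A small sketch (the function name is my own):

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given characteristics."""
    lr_pos = sensitivity / (1 - specificity)  # > 10 -> strong rule-in
    lr_neg = (1 - sensitivity) / specificity  # < 0.1 -> strong rule-out
    return lr_pos, lr_neg
```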
Example from the book
Blood test for gastric cancer (n = 100):
| | Cancer Present | Cancer Absent |
| --- | --- | --- |
| Positive Test | 20 | 30 |
| Negative Test | 5 | 45 |
Calculations:
- Sensitivity = 20/25 = 0.8 (80%)
- Specificity = 45/75 = 0.6 (60%)
- PPV = 20/50 = 0.4 (40%)
- NPV = 45/50 = 0.9 (90%)
- LR+ = 0.8 / (1 − 0.6) = 2
- LR− = (1 − 0.8) / 0.6 ≈ 0.33
Interpretation:
- Test is good at ruling out (high NPV)
- Poor at ruling in (low PPV)
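The prevalence dependence of PPV noted earlier can be seen by recomputing PPV for this same test at different prevalences via Bayes' theorem (a sketch; the 1% prevalence scenario is my own illustration, not from the book):

```python
def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """PPV via Bayes' theorem for a given disease prevalence."""
    true_pos = sensitivity * prevalence               # P(test+ and diseased)
    false_pos = (1 - specificity) * (1 - prevalence)  # P(test+ and healthy)
    return true_pos / (true_pos + false_pos)

# With sensitivity 0.8 and specificity 0.6:
# at the sample's 25% prevalence, PPV = 0.40, matching the table above;
# at a 1% population prevalence, PPV drops to about 0.02.
```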
2. Level of Agreement & Kappa (κ)
(Inter-observer or Test–retest Agreement)
When is it used?
Used when:
- two observers classify something
- or one test is repeated
- and outcomes fall into categories (especially ordered categories)
Example: CIN1, CIN2, CIN3 classification in cervical smears.
Why not use simple percent agreement?
Because some agreement occurs by chance.
Kappa corrects for this.
Kappa (κ) — Interpretation
Kappa ranges from −1 to 1 (negative values mean worse-than-chance agreement); in practice it is read on a 0-to-1 scale:

| κ value | Interpretation |
| --- | --- |
| 0 | No agreement beyond chance |
| 0.3 | Poor agreement |
| 0.5 | Moderate agreement |
| 0.7 | Very good agreement |
| 1.0 | Perfect agreement |
Example from the book
Two hospitals read the same cervical smear slides:
- κ = 0.3 → poor agreement
Meaning: patients could get different results depending on the lab → unreliable classification.
When NOT to use Kappa
If measurements are continuous (e.g., blood glucose), use:
- Intraclass correlation coefficient (ICC) instead.
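For categorical data, kappa can be computed by hand from the agreement table. A self-contained sketch of unweighted Cohen's kappa (note: for ordered categories like CIN grades, a weighted kappa is usually preferred; this simple version treats categories as nominal):

```python
def cohens_kappa(table):
    """Unweighted Cohen's kappa from a square agreement table.

    table[i][j] = count of items that rater 1 placed in category i
    and rater 2 placed in category j.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    observed = sum(table[i][i] for i in range(k)) / n  # observed agreement
    row_totals = [sum(row) for row in table]                       # rater 1 marginals
    col_totals = [sum(row[j] for row in table) for j in range(k)]  # rater 2 marginals
    # Agreement expected by chance from the marginal distributions
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)
```

For example, two raters agreeing on 35 of 50 slides (a hypothetical table `[[20, 5], [10, 15]]`) give κ = 0.4, even though raw agreement is 70%.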
Summary for Exams
Sensitivity
Positive in disease → A / (A+C)
Specificity
Negative in health → D / (B+D)
PPV
If test positive → chance disease is present
NPV
If test negative → chance disease is absent
LR+
Sensitivity / (1 − Specificity)
LR−
(1 − Sensitivity) / Specificity
Kappa (κ)
Agreement beyond chance
- ≥0.7 = very good
- <0.4 = poor