L Daniel Maxim, Ron Niebo, Mark J Utell.
Abstract
Screening tests are widely used in medicine to assess the likelihood that members of a defined population have a particular disease. This article presents an overview of such tests including the definitions of key technical (sensitivity and specificity) and population characteristics necessary to assess the benefits and limitations of such tests. Several examples are used to illustrate calculations, including the characteristics of low dose computed tomography as a lung cancer screen, choice of an optimal PSA cutoff and selection of the population to undergo mammography. The importance of careful consideration of the consequences of both false positives and negatives is highlighted. Receiver operating characteristic curves are explained as is the need to carefully select the population group to be tested.
Keywords: Benefits and limitations; positive and negative predicted value; prevalence; screening tests; sensitivity; specificity
Year: 2014 PMID: 25264934 PMCID: PMC4389712 DOI: 10.3109/08958378.2014.955932
Source DB: PubMed Journal: Inhal Toxicol ISSN: 0895-8378 Impact factor: 2.724
Logical possibilities for true disease state and screening test outcome.
| Test result | Subject has disease | Subject disease free | Subtotal |
|---|---|---|---|
| Positive | Correct result | False positive | Total positive test results |
| Negative | False negative | Correct result | Total negative test results |
| Subtotal | Total subjects with disease | Total subjects disease free | Total subjects |
Examples of screening and diagnostic tests and possible Gold Standards.
| Disease or condition | Screening tests | Gold Standard | References |
|---|---|---|---|
| Urinary tract infection | Urine microscopy | Urine culture | |
| Congenital heart disease | Exercise ECG | Coronary angiography | |
| | Echocardiography | | |
| Hypertension | Blood pressure (Korotkoff sounds) | Intra-arterial measurement of pressures | |
| Myocardial infarction | ECG or cardiac enzymes | Cardiac biopsy (at autopsy) | |
| Breast cancer | Mammography | Biopsy result | |
| Bowel cancer | Fecal immunochemical test (FIT) and the fecal occult blood test (FOBT) | Colonoscopy ± biopsy | |
| TB | Tuberculin Skin Test; Interferon Gamma Release Assays | Chest X-ray and a sample of sputum, detection of | |
| Chlamydia | Tissue culture from single cervical swabs | Direct immunofluorescence, enzyme immunoassay, PCR and serology, others | |
| Cervical cancer | Pap smear | Colposcopy with appropriate biopsy or sentinel lymph node biopsy | |
| Celiac disease | IgG- and IgA-antigliadin antibodies, IgA-endomysial antibodies, and intestinal permeability | Small bowel biopsy | |
Figure 1. Sensitivity and specificity of a sample of screening tests reported in the literature. Circles are studies summarized here (Table A1), while triangles represent studies reported in Alberg et al. (2004).
Table of test specificity and sensitivity results in the literature.
| References | Test information | Variant or subgroup | Specificity | Sensitivity |
|---|---|---|---|---|
| | Multi-modal and ultrasound for ovarian cancer; primary ovarian and tubal | | 0.998 | 0.894 |
| | Primary invasive epithelial ovarian and tubal | | 0.998 | 0.895 |
| | USS | | 0.982 | 0.75 |
| | | | 0.987 | 0.763 |
| | Renal vascular hypertension screening test | | 0.92 | 0.93 |
| | HTLV-III (AIDS Agent) screening test | | 0.986 | 0.973 |
| | PTSD screening test | | 0.975 | 0.77 |
| | HPV testing thin-layer pap | | 0.824 | 0.613 |
| | PCR | | 0.788 | 0.882 |
| | Signal amplification | | 0.726 | 0.908 |
| | Peripheral neuropathy in Diabetes clinic | Vibration (on off) | 0.99 | 0.53 |
| | | Monofilament | 0.96 | 0.77 |
| | | Superficial pain | 0.97 | 0.59 |
| | | Vibration (timed) | 0.98 | 0.8 |
| | Obstructive airway disease and >40 pack-years smoking | | 0.986 | 0.284 |
| | ABI and stroke | CHD | 0.908 | 0.163 |
| | | | 0.944 | 0.167 |
| | | Stroke | 0.908 | 0.17 |
| | | | 0.887 | 0.22 |
| | | | 0.972 | 0.092 |
| | HPV DNA testing for cervical cancer | | 0.942 | 0.771 |
| | | | 0.934 | 0.748 |
| | HPV for cervical cancer (conservative case) | | 0.941 | 0.946 |
| | Pap for cervical cancer (conservative case) | | 0.968 | 0.554 |
| | Autologous serum skin tests to screen for chronic idiopathic urticaria | | 0.81 | 0.65 |
| | | | 0.78 | 0.71 |
| | B-natriuretic peptide for left ventricular dysfunction, 75 pg/mL BNP level | | 0.98 | 0.86 |
| | Probability of falls by timed up and go test | | 0.87 | 0.87 |
| | Endomysial antibody screening for coeliac disease, four tests | | 0.99 | 1 |
| | | | 0.99 | 0.91 |
| | | | 0.85 | 0.91 |
| | | | 0.88 | 0.76 |
| | Various tests for Chlamydia | PCR cervix | 1 | 0.965 |
| | | LCR urine | 1 | 0.875 |
| | | EIA urine | 1 | 0.188 |
| | | EIA cervix | 1 | 0.52 |
| | | EIA cervix | 0.99 | 0.8 |
| | | DNA probe | 0.96 | 0.72 |
| | | LET urine | 0.808 | 0.778 |
| | | EIA urine | 0.99 | 0.75 |
| | | EIA cervix | 1 | 0.844 |
| | | PCR cervix | 1 | 1 |
| | | LCR urine | 1 | 0.96 |
| | | EIA urine | 1 | 0.37 |
| | | EIA cervix | 1 | 0.783 |
| | | PCR cervix | 1 | 1 |
| | | PCR cervix | 1 | 0.85 |
| | | PCR cervix | 1 | 0.953 |
| | | PCR cervix | 0.986 | 1 |
| | | PCR urine | 0.986 | 0.923 |
| | | LCR cervix | 0.997 | 0.886 |
| | | PCR, EIA cervix | 0.997 | 0.97 |
| | | LET | 0.91 | 0.41 |
| | | LCR and LET urine | 0.949 | 0.589 |
| | | PCR urine | 0.997 | 0.82 |
| | | PCR cervix | 0.998 | 0.82 |
| | | PACE2 cervix | 1 | 0.795 |
| | | PCR urine | 0.99 | 0.85 |
| | | DFA cervix | 0.96 | 0.85 |
| | | LCR urine | 1 | 0.882 |
| | | EIA | 1 | 0.84 |
| | | PCR cervix | 0.998 | 0.992 |
| | | LCR, PCR | 1 | 0.93 |
| | | LCR, PCR | 0.996 | 0.62 |
| | | DFA cervix | 0.995 | 0.778 |
| | | PCR cervix | 1 | 0.714 |
| | | EIA cervix | 1 | 0.647 |
| | | PCR urine | 0.993 | 0.895 |
| | Five cervical cancer screening tests | VIA | 0.836 | 0.887 |
| | | VILI | 0.832 | 0.957 |
| | | VIAM | 0.855 | 0.826 |
| | | Pap Smear | 0.985 | 0.651 |
| | | HC2 | 0.93 | 0.721 |
| | Fasting glucose to insulin ratio to measure insulin sensitivity | | 0.84 | 0.95 |
| | Noninvasive determination of endothelium-mediated vasodilation | Coronary artery disease | 0.81 | 0.71 |
| | | Angina pectoris | 0.571 | 0.824 |
| | Four tests for colorectal-screening | Hemoccult II | 0.981 | 0.324 |
| | | Hemoccult II Sensa | 0.875 | 0.712 |
| | | Hemoselect | 0.952 | 0.672 |
| | | Combined | 0.979 | 0.537 |
| | Pulse oximetry screening for congenital heart defects | Critical cases | 0.9912 | 0.75 |
| | | All major cases | 0.9916 | 0.4906 |
| | Saliva polymerase chain reaction assay for cytomegalovirus | Liquid Saliva | 0.999 | 1 |
| | | Dried Saliva | 0.999 | 0.974 |
| | Several colorectal cancer screening tests | | 0.94 | 0.85 |
| | | | 0.944 | 0.688 |
| | | | 0.91 | 0.875 |
| | | | 0.949 | 0.865 |
| | | | 0.831 | 0.667 |
| | | | 0.969 | 0.818 |
| | | | 0.971 | 0.556 |
| | | | 0.956 | 0.909 |
| | Six human papillomavirus tests | BD HPV | 0.843 | 0.975 |
| | | Roche Cobas | 0.845 | 0.975 |
| | | Qiagen Hybrid | 0.854 | 0.975 |
| | | Abbott real time | 0.872 | 0.95 |
| | | Gen-probe | 0.902 | 0.975 |
| | | NorChip | 0.952 | 0.714 |
| | Various tests for gestational diabetes | 50-G OGCT | 0.86 | 0.85 |
| | | 50-G OGCT | 0.84 | 0.88 |
| | | 50-G OGCT | 0.83 | 0.85 |
| | | 50-G OGCT | 0.69 | 0.81 |
| | | 50-G OGCT | 0.89 | 0.7 |
| | | 50-G OGCT | 0.77 | 0.99 |
| | | 50-G OGCT | 0.66 | 0.88 |
| | | 50-G OGCT | 1 | 0.17 |
| | | Fasting plasma glucose | 0.52 | 0.87 |
| | | Fasting plasma glucose | 0.76 | 0.77 |
| | | Fasting plasma glucose | 0.92 | 0.76 |
| | | Fasting plasma glucose | 0.93 | 0.54 |
| | | HbA1c | 0.28 | 0.92 |
| | | HbA1c | 0.97 | 0.12 |
| | | HbA1c | 0.61 | 0.86 |
| | | HbA1c | 0.21 | 0.82 |
| | MRI and mammographic screening in survivors of Hodgkin Lymphoma | Mammogram | 0.93 | 0.68 |
| | | MRI | 0.94 | 0.67 |
| | | Both | 0.9 | 0.94 |
| | Various tests for syphilis (imperfect reference) | | | |
| | Determine | Serum | 0.9415 | 0.9004 |
| | | Whole Blood | 0.9585 | 0.8632 |
| | SD Bioline | Serum | 0.9585 | 0.8706 |
| | | Whole Blood | 0.9795 | 0.845 |
| | Syphicheck | Serum | 0.9914 | 0.7448 |
| | | Whole Blood | 0.9958 | 0.7447 |
| | Visitect | Serum | 0.9645 | 0.8513 |
| | | Whole Blood | 0.9943 | 0.7426 |
| | Various tests for prostate cancer | Optimized | 0.9 | 0.8 |
| | Various tests for blood-based breast cancer screening | RASSF1A ITIH5 | 0.73 | 0.54 |
| | | RASSF1A DKK3 | 0.75 | 0.59 |
| | | DKK3 ITIH5 | 0.94 | 0.4 |
| | | RASSF1A DKK3 ITIH5 | 0.72 | 0.67 |
| | Cervical cancer screening methods in HIV positive women CIN 2+ | Cytology (MD intern) | 0.681 | 0.755 |
| | | HPV | 0.514 | 0.919 |
| | | Cytology (RN intern) | 0.685 | 0.654 |
| | Breast tomosynthesis compared to mammography for detection of cancer | Mammography | 0.883 | 0.963 |
| | | Tomosynthesis | 0.867 | 0.963 |
| | Breast tomosynthesis compared to mammography for detection of cancer | Mammography | 0.841 | 0.655 |
| | | Mammography plus | 0.892 | 0.762 |
| | | Tomosynthesis | 0.862 | 0.627 |
| | | Mammography | 0.845 | 0.787 |
| | | Mammography plus Tomosynthesis | | |
| | Prostate-specific antigen in serum screening test | Rectal examination | 0.44 | 0.86 |
| | | Ultrasonography | 0.27 | 0.92 |
| | | Serum PSA | 0.59 | 0.79 |
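One way to compare entries in this table on a single scale is the likelihood ratio that the article defines in its hypothetical-data table, S/(1 − Sp). The sketch below applies that formula to a few rows copied from the table above. The function name and the choice of rows are illustrative assumptions, and returning infinity when the reported specificity is exactly 1 is a convention adopted here, not something the article specifies.

```python
import math


def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """Return S / (1 - Sp); infinite when specificity is reported as exactly 1."""
    false_positive_rate = 1.0 - specificity
    if false_positive_rate == 0.0:
        # Assumed convention: a test with no false positives has an unbounded ratio.
        return math.inf
    return sensitivity / false_positive_rate


# (label, specificity, sensitivity) copied from rows of the literature table.
selected_rows = [
    ("Hemoccult II", 0.981, 0.324),
    ("Hemoccult II Sensa", 0.875, 0.712),
    ("Pap Smear (five cervical tests)", 0.985, 0.651),
    ("PCR cervix (Chlamydia, Sp = 1)", 1.0, 0.965),
]

for label, spec, sens in selected_rows:
    ratio = positive_likelihood_ratio(sens, spec)
    print(f"{label:<34} Sp={spec:<6} S={sens:<6} LR+={ratio:.2f}")
```

A high ratio alone does not make a good screen: Hemoccult II scores well on this ratio because of its high specificity, yet it misses about two thirds of cases.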
Common sources of bias in study design.
| Type of bias | Description |
|---|---|
| Verification bias | Non-random selection for definitive assessment for disease with the gold standard reference test |
| Errors in the reference | True disease status is subject to misclassification because the gold standard is imperfect |
| Spectrum bias | Types of cases and controls included are not representative of the population |
| Test interpretation bias | Information is available that can distort the diagnostic test |
| Unsatisfactory tests | Tests that are uninterpretable or incomplete do not yield a test result |
| Extrapolation bias | The conditions or characteristics of populations in the study are different from those in which the test will be applied |
| Lead time bias | Earlier detection by screening may erroneously appear to indicate beneficial effects on the outcome of a progressive disease |
| Length bias | Slowly progressing disease is over-represented in screened subjects relative to all cases of disease that arise in the population |
| Overdiagnosis bias | Subclinical disease may regress and never become a clinical problem in the absence of screening, but is detected by screening |
Source: Pepe (2003).
Hypothetical data from screening experiment.
Raw data, in symbols and as a numerical illustration (columns give the actual disease state).
| Test result | Yes (symbol) | No (symbol) | Subtotal (symbol) | Yes (count) | No (count) | Subtotal (count) |
|---|---|---|---|---|---|---|
| Positive | a | b | a + b | 4500 | 3500 | 8000 |
| Negative | c | d | c + d | 500 | 1500 | 2000 |
| Subtotal | a + c | b + d | N | 5000 | 5000 | 10 000 |

Definitions.
| Term | Definition | Formula | Numerical result | Alternative formula or term |
|---|---|---|---|---|
| Prevalence, Π | Fraction of test subjects with disease | (a + c)/N | 0.5000 | Assumed |
| Sensitivity, S | Fraction of subjects with positive test given that test subject has disease; “true positive/disease” | a/(a + c) | 0.9000 | Hypothetical data show relatively high sensitivity |
| False negative rate | Fraction of subjects with disease, but with negative test result | c/(a + c) | 0.1000 | (1 − S) |
| Specificity, Sp | Fraction of test subjects with negative test given that the test subject does not have disease | d/(b + d) | 0.3000 | Hypothetical data show relatively low specificity |
| False positive rate | Fraction of test subjects with no disease, but positive test result | b/(b + d) | 0.7000 | (1 − Sp) |
| Probability of positive test | True positives + false positives divided by total tests | (a + b)/N | 0.8000 | P(T+) = ΠS + (1 − Π)(1 − Sp) |
| Probability of negative test | True negatives + false negatives divided by total tests | (c + d)/N | 0.2000 | P(T−) = Π(1 − S) + (1 − Π)Sp |
| Positive predictive value PPV | Post-test probability of disease given a positive result | a/(a + b) | 0.5625 | |
| Negative predictive value NPV | Post-test probability of no disease given a negative test result | d/(c + d) | 0.7500 | |
| Accuracy | Proportion of correct test results | (a + d)/N | 0.6000 | ΠS + (1 − Π)Sp |
| Likelihood ratio | The probability of a subject who has the disease testing positive divided by the probability of a subject who does not have the disease testing positive | S/(1 − Sp) | 1.2857 | |
| Regret given positive test | Probability that disease free subject has positive test | b/(a + b) | 0.4375 | (1 − Π)(1 − Sp)/(ΠS + (1 − Π)(1 − Sp)) |

Bayes Theorem, positive test.
| True state | P(Hi) | P(T+/Hi) Probability of positive test in this state | P(Hi)P(T+/Hi) Joint probability | P(Hi/T+) |
|---|---|---|---|---|
| Disease | 0.5000 | 0.90 | 0.4500 | 0.5625 |
| No disease | 0.5000 | 0.70 | 0.3500 | 0.4375 |
| Probability of positive test = P(T+) | | | 0.8000 | |

Bayes Theorem, negative test.
| True state | P(Hi) | P(T−/Hi) Probability of negative test in this state | P(Hi)P(T−/Hi) Joint probability | P(Hi/T−) |
|---|---|---|---|---|
| Disease | 0.5000 | 0.10 | 0.0500 | 0.2500 |
| No disease | 0.5000 | 0.30 | 0.1500 | 0.7500 |
| Probability of negative test = P(T−) | | | 0.2000 | |
Additional references providing useful background: Alberg et al. (2004), Eddy (1982), Goetzinger & Odibo (2011), Lalkhen & McCluskey (2008).
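The definitions in the hypothetical-data table can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' code: the class and function names are illustrative assumptions, while the cell labels a, b, c, d and the counts 4500, 3500, 500 and 1500 are taken from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a screening experiment, using the table's cell labels."""
    a: int  # positive test, disease present (true positives)
    b: int  # positive test, disease absent (false positives)
    c: int  # negative test, disease present (false negatives)
    d: int  # negative test, disease absent (true negatives)


def screening_metrics(t: TwoByTwo) -> dict[str, float]:
    """Compute the table's quantities; assumes no row or column total is zero."""
    n = t.a + t.b + t.c + t.d
    sensitivity = t.a / (t.a + t.c)
    specificity = t.d / (t.b + t.d)
    return {
        "prevalence": (t.a + t.c) / n,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "p_positive": (t.a + t.b) / n,
        "p_negative": (t.c + t.d) / n,
        "ppv": t.a / (t.a + t.b),
        "npv": t.d / (t.c + t.d),
        "accuracy": (t.a + t.d) / n,
        "likelihood_ratio": sensitivity / (1.0 - specificity),
        "regret": t.b / (t.a + t.b),
    }


# Numerical illustration from the table.
example = TwoByTwo(a=4500, b=3500, c=500, d=1500)
for name, value in screening_metrics(example).items():
    print(f"{name:>17}: {value:.4f}")
```

Working from raw counts rather than from S, Sp and Π keeps the sketch close to how a screening study reports its data; the Bayes-theorem columns of the table give the same PPV and regret starting from the rates instead.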
Figure 2. Positive predictive value (PPV), negative predictive value (NPV) and accuracy as a function of assumed prevalence for the first numerical example.
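The dependence on prevalence can be reproduced numerically from the Bayes-theorem formulas in the hypothetical-data table. This sketch assumes that the "first numerical example" uses the table's sensitivity of 0.90 and specificity of 0.30; the function names and the prevalence grid are illustrative choices, not values read from the figure.

```python
def ppv(prevalence: float, sensitivity: float, specificity: float) -> float:
    """P(disease | positive test) by Bayes' theorem."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


def npv(prevalence: float, sensitivity: float, specificity: float) -> float:
    """P(no disease | negative test) by Bayes' theorem."""
    true_neg = (1.0 - prevalence) * specificity
    false_neg = prevalence * (1.0 - sensitivity)
    return true_neg / (true_neg + false_neg)


def accuracy(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Proportion of correct results: prevalence*S + (1 - prevalence)*Sp."""
    return prevalence * sensitivity + (1.0 - prevalence) * specificity


S, SP = 0.90, 0.30  # assumed to match the hypothetical data table
print("prevalence    PPV     NPV   accuracy")
for prev in (0.01, 0.05, 0.10, 0.25, 0.50, 0.75, 0.90):
    print(f"{prev:10.2f} {ppv(prev, S, SP):6.3f} {npv(prev, S, SP):7.3f} "
          f"{accuracy(prev, S, SP):8.3f}")
```

With these inputs PPV rises and NPV falls as prevalence increases, which is why the choice of population to be screened matters as much as the test's technical characteristics.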
Figure 3. Regret (1 − PPV) as a function of prevalence Π and specificity for example in Table 4 assuming sensitivity held constant at 0.90.
Figure 4. Combination of values of prevalence, specificity and sensitivity associated with a regret probability of 0.80.
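The regret formula in the hypothetical-data table, (1 − Π)(1 − Sp)/(ΠS + (1 − Π)(1 − Sp)), can be rearranged to give the prevalence at which a test reaches a chosen regret R: Π = (1 − R)(1 − Sp)/(RS + (1 − R)(1 − Sp)). The sketch below implements both directions. The function names and the sensitivity and specificity pairs in the loop are illustrative assumptions and are not read from the figures.

```python
def regret(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Probability that a positive result belongs to a disease-free subject (1 - PPV)."""
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    true_pos = prevalence * sensitivity
    return false_pos / (true_pos + false_pos)


def prevalence_for_regret(target: float, sensitivity: float, specificity: float) -> float:
    """Invert the regret formula: prevalence at which regret equals `target`.

    From R*(Pi*S + (1-Pi)*(1-Sp)) = (1-Pi)*(1-Sp) it follows that
    Pi = (1-R)*(1-Sp) / (R*S + (1-R)*(1-Sp)).
    """
    numerator = (1.0 - target) * (1.0 - specificity)
    return numerator / (target * sensitivity + numerator)


# Hypothetical (sensitivity, specificity) pairs, chosen only for illustration.
for sens, spec in [(0.90, 0.30), (0.90, 0.90), (0.90, 0.99), (0.70, 0.95)]:
    prev = prevalence_for_regret(0.80, sens, spec)
    print(f"S={sens:.2f} Sp={spec:.2f}: regret reaches 0.80 at prevalence {prev:.4f}")
```

Below the printed prevalence the regret exceeds 0.80, so even a highly specific test produces mostly false positives when the condition is rare enough.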
Reported false positive rates for CT scans for lung cancer.
| Reported false positives as % | Remarks | Source |
|---|---|---|
| 96.4 | National Lung Screening Trial Research Team, p. 399 | |
| 96.1 | Study also reports 90% sensitivity | |
| 95.5 | 106 false positives among 111 with nodules >0.5 cm | |
| 92.9–96.0 | Rates depended on nodule size, p. 260. | |
| 86.6–96.4 | Rates depend upon assumed nodule size from 5.0 to 9.0 mm | |
| 94.6 | Based on 14 detected cancers among 259 patients with abnormal CT scans | |
| 94.1 | From | |
| 93 | Based on 8 lung cancers among 114 subjects with nodules >5 mm | |
| 92.6 | Based on 22 lung cancers among 298 patients with nodules | |
| 92.1 | Based on 22 cancers in 279 with suspicious nodules | |
| 88.5–97 | From | |
| 87.6 | Based on 29 malignancies among 233 positive results | |
| 75 | Percent of patients with non-calcified nodules on CT | |
| 73.4 | Based on 163 benign nodules among 222 evaluated by thin section CT | |
| >70 | Reported value derived from Mayo clinic and ELCAP trials | |
| 62.1 | Based on 18 false positives among 29 subjects; for nodules >10 mm | |
| 43.75 | Based on 36 confirmed lung cancer cases among 64 patients | |
| 21–33 | Rates depend upon number of tests, p. 509. Of participants with a false-positive CT scan, 7% had an unnecessary invasive procedure and 2% had major surgery for benign disease. | |
| 19 | p. 119 | |
| 7.9 | p. 612. Includes multi-stage process with classification of nodules by size and calcification with follow-up. | |
| 7.9 M/5.6 F | Sensitivity reported to range from 84.6% (F) to 90.6% (M) | |
| 1.7 | Sensitivity reported at 94.6%, based on Volume CT scanning | |
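Several percentages in this table can be recomputed from the counts quoted in the Remarks column. The sketch below assumes that, for those rows, "false positives as %" means the share of positive scans that did not turn out to be cancer (the regret, 1 − PPV) rather than b/(b + d); that reading is an interpretation, but it matches the arithmetic of every row included here. The function name and row labels are illustrative.

```python
def false_positive_share(false_positives: int, positives: int) -> float:
    """Percentage of positive screening results that were not cancer."""
    return 100.0 * false_positives / positives


# (remark, false positives, positive results, percentage reported in the table)
rows = [
    ("106 false positives among 111 with nodules", 106, 111, 95.5),
    ("14 cancers among 259 abnormal CT scans", 259 - 14, 259, 94.6),
    ("8 cancers among 114 with nodules >5 mm", 114 - 8, 114, 93.0),
    ("22 cancers among 298 with nodules", 298 - 22, 298, 92.6),
    ("22 cancers in 279 with suspicious nodules", 279 - 22, 279, 92.1),
    ("29 malignancies among 233 positive results", 233 - 29, 233, 87.6),
    ("163 benign nodules among 222 evaluated", 163, 222, 73.4),
    ("18 false positives among 29 subjects", 18, 29, 62.1),
    ("36 confirmed cancers among 64 patients", 64 - 36, 64, 43.75),
]

for remark, fp, positives, reported in rows:
    computed = false_positive_share(fp, positives)
    print(f"{remark:<45} computed {computed:6.2f}%   reported {reported}%")
```

Because this quantity depends on prevalence as well as on specificity, part of the wide spread in the table may reflect differences in the screened populations and in what counted as a positive scan.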
Figure 5. Positive predictive value from mammography for women in various age groups with and without a family history of cancer according to data provided in Kerlikowske et al. (1993).
Figure 6. Receiver operating characteristic curve of prostate specific antigen (PSA) test, based on data from Thompson et al. (2005) among men aged 70 or more. Numbers shown are the specific cutoff on the PSA test result. The area under the curve (AUC) in this case is 0.678.
Figure 7. Receiver operating characteristic curves of prostate specific antigen (PSA) test, based on data from Thompson et al. (2005) among men aged 70 or more (AUC = 0.678). The top curve uses a combined PSA and Gleason Grade > 8 score (AUC = 0.827). The bottom curve is what would be expected by chance alone (AUC = 0.50).
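An ROC curve plots sensitivity against the false positive rate (1 − specificity) as the test cutoff varies, and the AUC can be approximated by the trapezoidal rule over those points. The sketch below is a minimal illustration of that calculation. The operating points are hypothetical and are not the Thompson et al. (2005) PSA data, so the result is not expected to match the AUC values quoted in the captions.

```python
def roc_auc(points: list[tuple[float, float]]) -> float:
    """Trapezoidal area under an ROC curve.

    `points` holds (false_positive_rate, sensitivity) pairs, one per cutoff.
    The end points (0, 0) and (1, 1) are added automatically.
    """
    curve = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical operating points, one per cutoff (illustration only).
hypothetical_cutoffs = [(0.10, 0.30), (0.30, 0.60), (0.60, 0.85)]

print(f"AUC for hypothetical cutoffs: {roc_auc(hypothetical_cutoffs):.4f}")
print(f"AUC for a chance-level test:  {roc_auc([]):.2f}")
print(f"AUC for a perfect test:       {roc_auc([(0.0, 1.0)]):.2f}")
```

The chance-level and perfect cases bracket the range from 0.50 to 1.00, matching the bottom reference curve described in the Figure 7 caption.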
Circumstances/conditions when screening might be appropriate or contra-indicated.
| Circumstances favoring screening | Circumstances when screening not appropriate |
|---|---|
| Disease constitutes a significant public health problem, meaning that it is a relatively common condition with significant morbidity and mortality, or the disease is contagious and might infect others before symptoms occur and the disease is detected. | Disease is rare or not serious or, if serious, there is no effective treatment for the disease. |
| The population to be screened can be so defined that the prevalence is high and there are no significant co-morbidities. | Unknown or low population prevalence |
| Treatment before symptoms occur is more effective than if treatment is delayed | No benefit to early treatment and/or significant likelihood of overdiagnosis (pseudodisease) |
| “Gold Standard” diagnostic exists and screening test sensitivity and specificity are high and based on adequate sample size | Screening test data are based on small sample sizes or are difficult to extrapolate to a larger pool of screening centers with high sensitivity and specificity (e.g. high inter-observer variability) |
| Consequences of false negative or false positives are modest | Consequences of one or more of these errors significant |
| Screening test is inexpensive, easy to administer, not harmful and reliable | Any of these circumstances not met |
| There must be some mechanism for follow-up of subjects with positive screening results to ensure subsequent diagnostic testing and ultimate treatment takes place. | |
Sources: Grimes & Schulz (2002), Herman (2006), Wilson & Jungner (1968).