Michael R Mehan, Stephen A Williams, Jill M Siegfried, William L Bigbee, Joel L Weissfeld, David O Wilson, Harvey I Pass, William N Rom, Thomas Muley, Michael Meister, Wilbur Franklin, York E Miller, Edward N Brody, Rachel M Ostroff.
Abstract
BACKGROUND: CT screening for lung cancer is effective in reducing mortality, but there are areas of concern, including a positive predictive value of 4% and development of interval cancers. A blood test that could manage these limitations would be useful, but development of such tests has been impaired by variations in blood collection that may lead to poor reproducibility across populations.
Keywords: Biomarker; Diagnosis; Lung cancer; Preanalytic variability; Proteomic; SOMAmer; Sample bias; Squamous cell carcinoma
Year: 2014 PMID: 25114662 PMCID: PMC4123246 DOI: 10.1186/1559-0275-11-32
Source DB: PubMed Journal: Clin Proteomics ISSN: 1542-6416 Impact factor: 3.988
Figure 1. Probability density function plots of HSP90 and MMP7 distributions for each training site control group. (a) HSP90 is an example of a protein affected by preanalytic variability, and the plot demonstrates the bias between control groups. (b) MMP7, selected from the high-quality training samples, is consistent between sites.
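Demographics of training and validation study cohorts
| | Training cases | Training controls | UHH cases | UHH controls | EDRN cases | EDRN controls |
|---|---|---|---|---|---|---|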
| No. subjects | 94 | 269 | 111 | 27 | 63 | 72 |
| Median age (years) | 69 | 57 | 62 | 58 | 68 | 71 |
| Interquartile range | 63-74 | 52-64 | 54-70 | 51-71 | 62-74 | 66-76 |
| Gender | | | | | | |
| Male | 42 | 126 | 70 | 13 | 36 | 39 |
| Female | 52 | 143 | 41 | 14 | 27 | 33 |
| Median pack-years* | 40 | 40 | 35 | 30 | 60 | 38 |
| Interquartile range | 23-57 | 20-56 | 20-50 | 18-43 | 40-76 | 34-60 |
| Histopathology/Stage | | | | | | |
| Adenocarcinoma | 64 | | 55 | | 38 | |
| I | 27 | | 19 | | 24 | |
| II | 7 | | 16 | | 5 | |
| III | 19 | | 20 | | 7 | |
| IV | 11 | | 0 | | 2 | |
| Squamous cell | 30 | | 56 | | 25 | |
| I | 11 | | 20 | | 14 | |
| II | 5 | | 16 | | 4 | |
| III | 12 | | 20 | | 3 | |
| IV | 2 | | 0 | | 4 | |
| Benign nodule | | 122 | | 27 | | 20 |
*Pack-years is defined as the product of the total number of years of smoking and the average number of packs of cigarettes smoked daily.
Smoking data was not available for 8 subjects in training, 21 in UHH and 10 in EDRN studies.
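The pack-years definition in the footnote above is a simple product, and a minimal sketch makes the arithmetic explicit. The function name `pack_years` and the example inputs are illustrative assumptions, not values taken from the study data.

```python
def pack_years(years_smoked: float, packs_per_day: float) -> float:
    """Return pack-years: years of smoking times average packs smoked per day."""
    if years_smoked < 0 or packs_per_day < 0:
        raise ValueError("years_smoked and packs_per_day must be non-negative")
    return years_smoked * packs_per_day


# Hypothetical subject: 20 years of smoking at an average of 2 packs per day.
example = pack_years(20, 2)
print(example)
```

Under this definition, the same pack-year total can arise from different histories, for example 40 years at one pack per day or 20 years at two packs per day.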
Figure 2. Training set ROC. Results are plotted for the entire data set and for adenocarcinoma (AD) and squamous cell carcinoma (SQ) tumor histologies separately.
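Lung cancer classifier proteins ranked by Gini importance score
| Biomarker [Entrez Gene ID] | Protein name | Gini importance | Biological function | Direction |
|---|---|---|---|---|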
| MMP12 [4321] | Matrix metallo-peptidase 12 | 20.33 | Breakdown of extracellular matrix, positive regulation of cell proliferation, tissue injury and remodeling | Up |
| SERPINA3 [12] | Alpha-1 antiproteinase | 14.11 | Serine protease inhibitor, part of acute phase response and tissue homeostasis | Up |
| MMP7 [4316] | Matrix metallo-peptidase 7 | 13.73 | Breakdown of extracellular matrix, positive regulation of cell proliferation, collagen catabolism, degrades fibronectin | Up |
| C9 [735] | Complement component 9 | 11.31 | Inflammatory acute phase reactant, pore-forming subunit of cytolytic MAC complex | Up |
| CRP [1401] | C-reactive protein | 11.01 | Inflammatory acute phase reactant, immune effector | Up |
| CNDP1 [84735] | Carnosine dipeptidase 1 | 8.66 | Carboxypeptidase, functions in amino acid transport and metabolism | Down |
| CA6 [765] | Carbonic anhydrase VI | 7.62 | Reversible hydration of carbon dioxide, one-carbon metabolism, nitrogen metabolism | Down |
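The table above ranks proteins by Gini importance, which in tree-based classifiers is generally built up from the decrease in Gini impurity produced by each split on a feature. The sketch below shows that building block for a single split. It is an illustration only: the labels, the split, and the function names are hypothetical, and it does not reproduce the scores in the table or the authors' modeling pipeline.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a set of class labels: 1 minus the sum of squared class fractions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def impurity_decrease(parent, left, right):
    """Decrease in Gini impurity when `parent` is split into `left` and `right`."""
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Hypothetical node with 4 cancer and 4 control samples, split on one protein level.
parent = ["cancer"] * 4 + ["control"] * 4
left = ["cancer"] * 3 + ["control"] * 1   # samples above a hypothetical threshold
right = ["cancer"] * 1 + ["control"] * 3  # samples below it
decrease = impurity_decrease(parent, left, right)
print(decrease)
```

A feature's importance is then typically accumulated over all splits that use it, across all trees, so proteins that separate cases from controls more cleanly tend to receive higher scores.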
Figure 3. UHH validation ROC. Results are plotted for the entire data set and for AD and SQ tumor histologies separately.
Figure 4. EDRN validation ROC. Results are plotted for the entire data set and for AD and SQ tumor histologies separately.
Performance of the classifier in training and validation studies
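| Data set | Sensitivity (%), stage I AD | Sensitivity (%), stage I SQ | Sensitivity (%), stage II-IV AD | Sensitivity (%), stage II-IV SQ | Specificity (%), benign nodules | Specificity (%), all controls |
|---|---|---|---|---|---|---|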
| Training set | 33 | 64 | 59 | 84 | 89 | 90 |
| 95% CI | 19-52 | 35-85 | 43-74 | 62-95 | 82-93 | 86-93 |
| UHH validation | 35 | 75 | 51 | 83 | 81 | 81 |
| 95% CI | 18-57 | 53-89 | 36-67 | 68-93 | 63-92 | 63-92 |
| EDRN validation | 50 | 93 | 79 | 82 | 70 | 71 |
| 95% CI | 31-69 | 66-100 | 52-93 | 51-96 | 48-86 | 59-80 |
Sensitivity of the classifier in training and validation is calculated by tumor stage and histology. Specificity is calculated for the benign nodule subset and for all controls. 95% CI is the 95% confidence interval.
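As a rough illustration of how sensitivity, specificity, and a 95% confidence interval for a proportion can be computed, the sketch below uses the Wilson score interval. The choice of the Wilson method is an assumption, since the record does not state which interval the authors used, and the counts are hypothetical rather than taken from the study.

```python
import math


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of cases that the classifier calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of controls that the classifier calls negative."""
    return true_neg / (true_neg + false_pos)


def wilson_interval(successes: int, total: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half_width, center + half_width


# Hypothetical counts: 9 of 27 cases detected.
sens = sensitivity(true_pos=9, false_neg=18)
low, high = wilson_interval(9, 27)
print(f"sensitivity {sens:.0%}, 95% CI {low:.0%}-{high:.0%}")
```

With small subgroups the interval is wide, which is why estimates for individual stage and histology subsets should be read together with their confidence intervals.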
Figure 5. Study flowchart for biomarker discovery and validation studies.