Jonathan L Jesneck, Sayan Mukherjee, Zoya Yurkovetsky, Merlise Clyde, Jeffrey R Marks, Anna E Lokshin, Joseph Y Lo.
Abstract
BACKGROUND: Because screening mammography for breast cancer is less effective for premenopausal women, we investigated the feasibility of a diagnostic blood test using serum proteins.
Year: 2009 PMID: 19476629 PMCID: PMC2696469 DOI: 10.1186/1471-2407-9-164
Source DB: PubMed Journal: BMC Cancer ISSN: 1471-2407 Impact factor: 4.430
Subject demographics
| | Normal | Benign | Malignant | Total |
|---|---|---|---|---|
| Number of subjects | 68 (41%) | 48 (29%) | 49 (30%) | 165 (100%) |
| Mean age (years) | 36 ± 8 | 38 ± 9 | 42 ± 4 | 38 ± 8 |
| Race: Black | 23 (41%) | 19 (34%) | 14 (25%) | 56 (34%) |
| Race: White | 45 (41%) | 29 (27%) | 35 (32%) | 109 (66%) |
List of the 98 serum proteins measured by ELISA assay (Luminex platform)
| ACTH | Adiponectin | AFP | Angiostatin |
|---|---|---|---|
| Apolipoprotein A1 | Apolipoprotein Apo A2 | Apolipoprotein Apo B | Apolipoprotein Apo C2 |
| Apolipoprotein Apo C3 | Apolipoprotein Apo E | CA 15-3 | CA 19-9 |
| CA-125 | CA72-4 | CD40L (TRAP) | CEA |
| Cytokeratin 19 | DR5 | EGF | EGFR |
| EOTAXIN | ErbB2 | FGF-b | Fibrinogen |
| Fractalkine | FSH | G-CSF | GH |
| GM-CSF | GROa | Haptoglobin | HGF |
| IFN-a | IFN-g | IGFBP-1 | IL-10 |
| IL-12p40 | IL-13 | IL-15 | IL-17 |
| IL-1a | IL-1b | IL-1Ra | IL-2 |
| IL-2R | IL-4 | IL-5 | IL-6 |
| IL-7 | IL-8 | IP-10 | Kallikrein 10 |
| Kallikrein 8 | Leptin | LH | MCP-1 |
| MCP-2 | MCP-3 | Mesothelin(IgY) | MICA |
| MIF | MIG | MIP-1a | MIP-1b |
| MMP-1 | MMP-12 | MMP-13 | MMP-2 |
| MMP-3 | MMP-7 | MMP-8 | MMP-9 |
| MPO | NGF | PAI-I(active) | PROLACTIN |
| RANTES | Resistin | S-100 | SAA |
| SCC | sE-Selectin | sFas | sFasL |
| sICAM-1 | sIL-6R | sVCAM-1 | TGFa |
| TNF-a | TNF-RI | TNF-RII | tPAI-1 |
| TSH | TTR | ULBP-1 | ULBP-2 |
| ULBP-3 | VEGF | | |
Figure 1. ROC curves showing the classification performance of statistical models using the serum protein levels. The models were run with a 70% train and 30% test split of the data set (A-C) and also with leave-one-out cross-validation (LOOCV) (D-F). The classifiers performed similarly, with moderate classification results for normal vs. malignant or benign lesions (A, B, D, E) and poor classification results for malignant vs. benign lesions (C, F).
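As a minimal sketch of the LOOCV evaluation named in the caption, the Python below holds out one subject at a time, scores it with a model fitted on the rest, and summarizes the held-out scores as an area under the ROC curve. The nearest-centroid scorer, the function names, and the toy data are all assumptions for illustration; they stand in for the paper's classifiers and serum measurements, which are not reproduced here.

```python
from typing import List, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid_score(train_x: List[List[float]], train_y: List[int],
                   x: List[float]) -> float:
    """Toy scorer (an assumption, not the paper's model): squared distance to
    the class-0 centroid minus squared distance to the class-1 centroid."""
    def centroid(label: int) -> List[float]:
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    def sqdist(a: List[float], b: List[float]) -> float:
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return sqdist(x, centroid(0)) - sqdist(x, centroid(1))


def loocv_scores(xs: List[List[float]], ys: List[int]) -> List[float]:
    """Score each subject with a model fitted on all the other subjects."""
    scores = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        scores.append(centroid_score(train_x, train_y, xs[i]))
    return scores


# Made-up, well-separated toy data: two features, three subjects per class.
xs = [[0.0, 1.0], [0.2, 0.8], [0.1, 1.1], [1.0, 0.1], [0.9, 0.3], [1.2, 0.0]]
ys = [0, 0, 0, 1, 1, 1]
held_out = loocv_scores(xs, ys)
print("LOOCV AUC:", auc(held_out, ys))
```

A 70%/30% split differs only in that the model is fitted once on the training portion and scored once on the test portion, rather than refitted for every held-out subject.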
Cross-validation classification errors, normal versus cancer
| Model | FN | FP |
|---|---|---|
| BMA of linear models | 19 | 8 |
| BMA of logistic models | 15 | 12 |
| BMA of probit models | 18 | 7 |
| SVM with RFS | 18 | 12 |
| LAR | 24 | 5 |
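The error counts can be turned into sensitivity and specificity once the class sizes are fixed. The sketch below assumes that the "cancer" class is the 49 malignant subjects and the "normal" class the 68 normal subjects from the demographics table, and that every subject was scored once in cross-validation; neither assumption is stated with this table, so the resulting rates are illustrative rather than reported values.

```python
# Class sizes taken from the subject demographics table. ASSUMPTION: "cancer"
# means the 49 malignant subjects and "normal" the 68 normal subjects.
N_CANCER = 49
N_NORMAL = 68

# (false negatives, false positives) per model, copied from the table above.
errors = {
    "BMA of linear models": (19, 8),
    "BMA of logistic models": (15, 12),
    "BMA of probit models": (18, 7),
    "SVM with RFS": (18, 12),
    "LAR": (24, 5),
}


def rates(fn: int, fp: int, n_pos: int = N_CANCER, n_neg: int = N_NORMAL):
    """Return (sensitivity, specificity) from error counts and class sizes."""
    sensitivity = (n_pos - fn) / n_pos
    specificity = (n_neg - fp) / n_neg
    return sensitivity, specificity


for model, (fn, fp) in errors.items():
    sens, spec = rates(fn, fp)
    print(f"{model}: sensitivity={sens:.3f}, specificity={spec:.3f}")
```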
Figure 2. Posterior predictions of Bayesian model averaging (BMA) of probit models, run with a 70% train and 30% test split of the data set (A-C) and also with leave-one-out cross-validation (LOOCV) (D-F). The classifiers achieved moderate classification results for normal vs. malignant or benign lesions (A, B, D, E) and poor classification results for malignant vs. benign lesions (C, F).
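In general terms, a BMA prediction is a posterior-weighted average of the predictions of the individual models, and a probit model maps a linear predictor through the standard normal CDF. The sketch below shows that combination in Python; the model weights, coefficients, protein names used as keys, and function names are hypothetical and are not taken from the paper's fitted models.

```python
import math
from typing import Dict, List, Tuple


def probit(z: float) -> float:
    """Standard normal CDF, the inverse link of a probit model."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bma_predict(models: List[Tuple[float, Dict[str, float]]],
                x: Dict[str, float]) -> float:
    """Average each model's predicted probability, weighted by its posterior
    model probability. Each model is (weight, {feature: coefficient})."""
    total = sum(w for w, _ in models)
    avg = 0.0
    for w, coefs in models:
        z = sum(beta * x.get(name, 0.0) for name, beta in coefs.items())
        avg += (w / total) * probit(z)
    return avg


# Hypothetical models and one hypothetical standardized measurement vector.
models = [
    (0.6, {"MIF": 0.8, "MMP-9": 0.5}),
    (0.4, {"MIF": 0.6, "MPO": -0.7}),
]
subject = {"MIF": 1.2, "MMP-9": 0.3, "MPO": -0.4}
print("BMA posterior probability:", bma_predict(models, subject))
```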
Figure 3. Models selected by BMA of linear models. Features are plotted in decreasing order of the posterior probability of being nonzero. Models are ordered by selection frequency, with the best, most frequently selected models on the left and the weakest, least frequently selected models on the right. Coefficients with positive values are shown in red and negative values in blue. Strong, frequently selected features appear as solid horizontal stripes. A beige value indicates that the protein was not selected in a particular model.
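The posterior probability that a coefficient is nonzero is commonly computed as the summed posterior weight of the models that include that feature. The Python sketch below illustrates this under that standard definition; the model weights and feature sets are invented for the example and do not come from the figure.

```python
from typing import Dict, List, Set, Tuple


def inclusion_probabilities(
        models: List[Tuple[float, Set[str]]]) -> Dict[str, float]:
    """Sum normalized model weights over the models containing each feature,
    then return the features in decreasing order of that probability."""
    total = sum(w for w, _ in models)
    pip: Dict[str, float] = {}
    for w, feats in models:
        for f in feats:
            pip[f] = pip.get(f, 0.0) + w / total
    return dict(sorted(pip.items(), key=lambda kv: -kv[1]))


# Hypothetical posterior model weights and selected feature sets.
models = [
    (0.5, {"MIF", "MMP-9"}),
    (0.3, {"MIF", "MPO"}),
    (0.2, {"MMP-9", "MPO", "ACTH"}),
]
for feature, prob in inclusion_probabilities(models).items():
    print(f"{feature}: {prob:.2f}")
```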
Proteins chosen by BMA of linear models
| Protein | Biological Role | Higher Prevalence In |
|---|---|---|
| MIF (macrophage migration inhibitory factor) | Inflammation, regulates macrophage function in host defense through the suppression of anti-inflammatory effects of glucocorticoids | Cancer |
| MMP-9 (matrix metalloproteinase) | Breakdown and remodeling of extracellular matrix | Cancer |
| MPO (myeloperoxidase) | Inflammation, produces HOCl, modulates vascular signaling and vasodilatory functions of nitric oxide (NO) during acute inflammation | Normal |
| sVCAM-1 (soluble vascular cell adhesion molecule 1) | Mediates leukocyte-endothelial cell adhesion and signal transduction, membrane-bound adhesion molecules and the process of vascular inflammation of the vessel wall | Normal |
| ACTH (adrenocorticotropic hormone) | Stimulates secretion of adrenal corticosteroids | Cancer |
| MIF (macrophage migration inhibitory factor) | Inflammation, regulates macrophage function in host defense through the suppression of anti-inflammatory effects of glucocorticoids | Benign |
| MICA (MHC class I polypeptide-related sequence A) | Stress-induced antigen that is broadly recognized by intestinal epithelial gamma delta T cells, ligands for natural killer cells | Benign |
| IL-5 (Interleukin 5) | Stimulates B cell growth, increases immunoglobulin secretion, mediates eosinophil activation | Normal |
| IL-12 p40 (Interleukin 12, p40 chain) | Differentiation of naive T cells into Th1 cells, stimulates the growth and function of T cells, stimulates the production of interferon-gamma (IFN-γ) and [ | Normal |
| MCP-1 (Monocyte chemotactic protein-1) | Induces recruitment of monocytes, T lymphocytes, eosinophils, and basophils and is responsible for many inflammatory reactions to disease | Benign |
| CA-125 (cancer antigen 125) | Marker for ovarian cancer | Cancer |
| IFNa (Interferon type I) | Secreted by leukocytes, fibroblasts, or lymphoblasts in response to viruses | Benign |
| MICA (MHC class I polypeptide-related sequence A) | Stress-induced antigen that is broadly recognized by intestinal epithelial gamma delta T cells, ligands for natural killer cells | Benign |
Figure 4. Posterior distributions of the model coefficients for the proteins. The distributions are mixtures of a point mass at zero and a normal distribution. The height of the solid line at zero represents the posterior probability that the coefficient is zero. The nonzero part of the distribution is scaled so that the maximum height is equal to the probability that the coefficient is nonzero.
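The mixture described in this caption can be written compactly. The symbols below are notation introduced here for illustration, not the paper's: $\pi_j$ is the posterior probability that the coefficient $\beta_j$ is nonzero, $\delta_0$ is a point mass at zero, and $\mu_j$ and $\sigma_j^2$ are the mean and variance of the normal component.

```latex
p(\beta_j \mid D) \;=\; (1 - \pi_j)\,\delta_0(\beta_j)
  \;+\; \pi_j\,\mathcal{N}\!\left(\beta_j \mid \mu_j, \sigma_j^2\right),
\qquad \pi_j = P(\beta_j \neq 0 \mid D)
```

Under this notation, the line at zero has height $1 - \pi_j$, and the normal component is drawn rescaled so that its peak at $\beta_j = \mu_j$ has height $\pi_j$.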
Figure 5. Heatmap of normalized frequencies of selected features, normal vs. cancer. The feature selection frequencies were averaged over all folds of the LOOCV. For comparison across techniques, the frequencies in each column were scaled to sum to one. Less-frequently selected features appear as cooler dark blue colors, whereas more frequently selected features appear as hotter, brighter colors. Models that used fewer features appear as dark columns with a few bright bands, whereas models that used more features appear as denser smears of darker bands.
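The column scaling described here is a plain normalization: each technique's selection frequencies are divided by their column total. A minimal Python sketch follows, assuming rows are features and columns are techniques; the function name and the numbers are made up for illustration.

```python
from typing import List


def normalize_columns(freq: List[List[float]]) -> List[List[float]]:
    """Scale each column (one technique) so that its entries sum to one.
    Columns that sum to zero are left as zeros."""
    col_sums = [sum(col) for col in zip(*freq)]
    return [[v / s if s else 0.0 for v, s in zip(row, col_sums)]
            for row in freq]


# Rows: features; columns: techniques. Values are invented frequencies.
freq = [
    [2.0, 1.0],
    [2.0, 3.0],
]
print(normalize_columns(freq))
```

With this scaling, a technique that selects few features concentrates its mass in a few bright cells, while one that selects many spreads the same total over many dimmer cells, which matches the visual pattern the caption describes.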
Figure 6. ROC and accuracy curves for linear models with four feature selection techniques: 1) Preselected features: using all the data to choose the best features, and then running the model using only those preselected features in LOOCV; 2) BMA: iterated Bayesian model averaging; 3) Stepwise feature selection; and 4) All features: using all the proteins in the model, with no feature selection.
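The difference between preselecting features on all the data and selecting them inside each LOOCV fold can be sketched as follows. Preselection lets the held-out subject influence which features are chosen, which typically makes the cross-validated accuracy look better than it should. The ranking rule (absolute difference in class means), the nearest-centroid classifier, and the pure-noise data are assumptions chosen for brevity, not the paper's methods.

```python
import random
from typing import List, Optional


def top_features(xs: List[List[float]], ys: List[int], k: int) -> List[int]:
    """Indices of the k features with the largest class-mean difference."""
    def mean(col, label):
        vals = [v for v, y in zip(col, ys) if y == label]
        return sum(vals) / len(vals)

    gaps = [abs(mean(col, 1) - mean(col, 0)) for col in zip(*xs)]
    return sorted(range(len(gaps)), key=lambda j: -gaps[j])[:k]


def predict(train_x, train_y, x, feats) -> int:
    """Nearest-centroid prediction restricted to the chosen features."""
    def sqdist(label):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        return sum((x[j] - sum(r[j] for r in rows) / len(rows)) ** 2
                   for j in feats)

    return 1 if sqdist(1) < sqdist(0) else 0


def loocv_accuracy(xs, ys, k: int, preselect: bool) -> float:
    """LOOCV accuracy with features chosen once on all data (preselect=True)
    or re-chosen on the training subjects of every fold (preselect=False)."""
    fixed: Optional[List[int]] = top_features(xs, ys, k) if preselect else None
    hits = 0
    for i in range(len(xs)):
        tx, ty = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        feats = fixed if fixed is not None else top_features(tx, ty, k)
        hits += predict(tx, ty, xs[i], feats) == ys[i]
    return hits / len(xs)


# Pure-noise toy data: labels carry no real signal, so honest accuracy
# should hover near chance.
rng = random.Random(0)
ys = [i % 2 for i in range(20)]
xs = [[rng.gauss(0.0, 1.0) for _ in range(200)] for _ in ys]
print("preselected on all data:", loocv_accuracy(xs, ys, k=5, preselect=True))
print("selected inside each fold:", loocv_accuracy(xs, ys, k=5, preselect=False))
```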