Xuye Yuan1, Jiajia Chen2, Yuxin Lin1, Yin Li1, Lihua Xu3, Luonan Chen4, Haiying Hua5, Bairong Shen1.
Abstract
Leukemia is a leading cause of cancer deaths in developed countries. Great efforts have been made to find diagnostic biomarkers for leukemia. However, leukemia is highly complex and heterogeneous, involving interactions among multiple molecular components, so individual molecules are not necessarily sensitive diagnostic indicators. Network biomarkers are considered to outperform individual molecules in disease characterization. We applied an integrative approach that identifies active network modules as putative biomarkers for leukemia diagnosis. We first reconstructed the leukemia-specific protein-protein interaction (PPI) network using interactions from the Protein Interaction Network Analysis (PINA) and protein annotations from GeneGo. The network was then integrated with gene expression profiles to identify active modules relevant to leukemia. Finally, the candidate network-based biomarker was evaluated for its diagnostic performance. A network of 97 genes and 400 interactions was identified for accurate diagnosis of leukemia. Functional enrichment analysis revealed that the network biomarkers were enriched in pathways in cancer. The network biomarkers discriminated leukemia samples from normal controls more effectively than the known biomarkers. The network biomarkers provide a useful tool for diagnosing leukemia and also aid in further understanding its molecular basis.
Keywords: integrative analysis; leukemia; network biomarker
Year: 2017 PMID: 28243332 PMCID: PMC5327377 DOI: 10.7150/jca.17302
Source DB: PubMed Journal: J Cancer ISSN: 1837-9664 Impact factor: 4.207
Figure 1. The flowchart of network biomarker identification for leukemia diagnosis.
Source databases of PINA.
| Original database | Version | Ref. | Link |
|---|---|---|---|
| IntAct | Oct 4, 2012 | [21] | |
| BioGRID | 3.1.93 | [22] | |
| MINT | Dec 21, 2010 | [23] | |
| DIP | June 14, 2010 | [24] | |
| HPRD | April 13, 2010 | [25] | |
| MIPS/Mpact | Oct 1, 2008 | [26] | |
Leukemia-associated gene expression datasets used for analysis.
| Series | Platform | No. Samples | Leukemia | Others | Normal | Ref. | |||
|---|---|---|---|---|---|---|---|---|---|
| AML | CLL | T-PLL | B-CLL | ||||||
| GSE9476 | GPL96 | 64 | 26 | 38 | [27] | ||||
| GSE6691 | GPL96 | 56 | 11 | 32 | 13 | [28] | |||
| GSE5788 | GPL96 | 14 | 6 | 8 | [29] | ||||
| GSE22529 | GPL96 | 52 | 41 | 11 | [30] | ||||
| GSE26725 | GPL570 | 17 | 12 | 5 | [31] | ||||
| GSE23293 | GPL570 | 41 | 7 | 18 | 16 | [32] | |||
Leukemia-associated gene expression datasets used for validation.
| Series | Platform | No. of samples | CML | CLL | Normal | Ref. |
|---|---|---|---|---|---|---|
| GSE8835 | GPL96 | 66 | 42 | 24 | [33] | |
| GSE24739 | GPL570 | 24 | 16 | 8 | [34] | |
| GSE39411 | GPL570 | 152 | 104 | 48 | [35] |
Detailed information of the active modules.
| | PPI_GSE5788_TR | PPI_GSE6691_TR | PPI_GSE9476_TR | PPI_GSE22529_TR | PPI_GSE23293_TR | PPI_GSE23293_TR |
|---|---|---|---|---|---|---|
| Nodes | 77 | 71 | 75 | 73 | 71 | 75 |
| Edges | 205 | 186 | 166 | 193 | 126 | 188 |
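The node and edge counts reported for each active module can be illustrated with a small sketch. It assumes, purely for illustration, that a module is the subnetwork of a PPI edge list induced by a set of "active" genes; the abstract does not specify the module-search algorithm, so this is not the authors' method. The function name `induced_module`, the gene labels, and the toy edge list are all hypothetical.

```python
from typing import Iterable, Set, Tuple


def induced_module(
    edges: Iterable[Tuple[str, str]], active_genes: Iterable[str]
) -> Tuple[Set[str], Set[Tuple[str, str]]]:
    """Return (nodes, edges) of the subnetwork induced by active_genes.

    Edges are treated as undirected; self-loops and duplicates are dropped.
    Assumption: this stands in for an active-module search, it is not one.
    """
    active = set(active_genes)
    kept: Set[Tuple[str, str]] = set()
    for a, b in edges:
        if a != b and a in active and b in active:
            # Store each undirected edge once, in sorted order.
            kept.add((a, b) if a < b else (b, a))
    nodes = {gene for edge in kept for gene in edge}
    return nodes, kept


# Hypothetical toy PPI edge list and active gene set (not data from the paper).
toy_edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_B", "GENE_A"),  # duplicate in reverse orientation
    ("GENE_A", "GENE_C"),
    ("GENE_C", "GENE_D"),  # GENE_D is not active, so this edge is dropped
    ("GENE_E", "GENE_F"),  # GENE_F is not active, so this edge is dropped
]
toy_active = {"GENE_A", "GENE_B", "GENE_C", "GENE_E"}

module_nodes, module_edges = induced_module(toy_edges, toy_active)
print("Nodes:", len(module_nodes), "Edges:", len(module_edges))
```

Genes with no surviving interaction (here `GENE_E`) are left out of the node count, which matches counting only the nodes that take part in at least one edge of the module.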
Figure 2. The final network-based biomarker for leukemia. Known cancer-related genes in the final network are marked in yellow.
Figure 3. IPA and KEGG pathway enrichment analysis for the network biomarkers. The top 10 most significantly enriched IPA and KEGG pathways are shown in panels (A) and (B), respectively.
Figure 4. Gene ontology annotation for the network biomarkers. The network biomarkers identified by our method were annotated with DAVID tools at three levels of gene ontology: Molecular Function, Biological Process, and Cellular Component. The top 10 most significantly enriched terms for each level are shown.
Figure 5. Validation of the network biomarkers. (A) Distribution of the leukemia-associated genes in the network. (B-D) ROC curves obtained with the network biomarkers on three gene expression datasets: (B) GSE8835, (C) GSE24739, and (D) GSE39411.
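As a reminder of how an ROC curve and its area are obtained from classifier scores, here is a minimal sketch. It assumes each sample has a numeric score where higher means "more leukemia-like" and a binary label (1 = leukemia, 0 = normal); how the paper derives scores from the network biomarkers is not stated in this record. The function names and the toy scores are hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return ROC points (false positive rate, true positive rate).

    The threshold is swept from the highest score downwards; tied scores
    are consumed together so they contribute a single point.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical toy scores and labels (not data from the paper).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_labels = [1, 1, 0, 1, 0, 0]

toy_points = roc_points(toy_scores, toy_labels)
toy_auc = auc_trapezoid(toy_points)
print(round(toy_auc, 3))
```

In the toy data, 8 of the 9 leukemia/normal pairs are ranked correctly, so the area is 8/9.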
Detailed information of ROC curves.
| Series | Biomarker | Sensitivity | Precision | Specificity | Accuracy | AUC |
|---|---|---|---|---|---|---|
| GSE8835 | CD38 | 0.913 | 0.700 | 0.212 | 0.658 | 0.629 |
| | BCL2 | 0.885 | 0.650 | 0.166 | 0.623 | 0.604 |
| | IGFBP7 | 0.965 | 0.665 | 0.148 | 0.668 | 0.584 |
| | Network biomarkers | 0.851 | 0.686 | 0.316 | 0.657 | 0.698 |
| GSE24739 | CD38 | 0.893 | 0.662 | 0.088 | 0.625 | 0.431 |
| | BCL2 | 0.943 | 0.654 | 0.004 | 0.630 | 0.237 |
| | IGFBP7 | 0.938 | 0.652 | 0.001 | 0.625 | 0.197 |
| | Network biomarkers | 0.874 | 0.986 | 0.976 | 0.908 | 0.966 |
| GSE39411 | CD38 | 0.999 | 0.725 | 0.177 | 0.740 | 0.981 |
| | BCL2 | 0.915 | 0.931 | 0.853 | 0.895 | 0.966 |
| | IGFBP7 | 0.886 | 0.797 | 0.513 | 0.768 | 0.880 |
| | Network biomarkers | 0.996 | 0.999 | 0.998 | 0.997 | 0.999 |
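The threshold-based columns of the table above can be read against the standard confusion-matrix definitions, sketched below. This assumes the paper uses the conventional definitions (this record does not spell them out) with leukemia as the positive class. The function name and the counts are hypothetical and do not reproduce any row of the table.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard metrics from confusion-matrix counts.

    tp: leukemia samples called leukemia   fn: leukemia samples called normal
    tn: normal samples called normal       fp: normal samples called leukemia
    """
    if min(tp + fn, tn + fp, tp + fp) == 0:
        raise ValueError("a denominator is zero; metric undefined")
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "precision": tp / (tp + fp),    # positive predictive value
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for illustration only (not taken from the paper).
toy_metrics = diagnostic_metrics(tp=40, fp=20, tn=30, fn=10)
for name, value in toy_metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions a marker can combine high sensitivity with low specificity if it labels most samples as leukemia, which is why the AUC column is a useful threshold-independent complement.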