Xindong Zhang, Lin Gao, Zhi-Ping Liu, Luonan Chen.
Abstract
BACKGROUND: Identifying diagnostic and prognostic biomarkers from expression profiling data is of great significance for achieving personalized medicine and designing therapeutic strategies for complex diseases. However, the reproducibility of identified biomarkers across tissues and experiments remains a challenge.
Year: 2015 PMID: 25888350 PMCID: PMC4374500 DOI: 10.1186/s12859-015-0519-y
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Overview of the proposed framework for identifying module biomarkers.
Figure 2. Computational strategy for generating discriminative modules by maximizing the discriminative area of module activity. The discriminative area is defined as the area under the two probability density functions of module activity corresponding to normal samples and case (disease) samples.
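The caption's definition can be read in more than one way. The following is a minimal sketch under one assumed reading: fit a Gaussian to the module activities of each group, integrate the pointwise minimum of the two densities to obtain their overlap, and take one minus the overlap as a separation score. The Gaussian fit, the function names, and the toy activity values are all assumptions made for illustration; this is not presented as the paper's procedure.

```python
import math


def fit_gaussian(values):
    """Return (mean, sample standard deviation); needs >= 2 non-identical values."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var)


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def overlap_area(normal, case, grid_points=2001):
    """Trapezoidal integral of min(f_normal, f_case) over a +/- 6 sigma window."""
    m0, s0 = fit_gaussian(normal)
    m1, s1 = fit_gaussian(case)
    lo = min(m0 - 6 * s0, m1 - 6 * s1)
    hi = max(m0 + 6 * s0, m1 + 6 * s1)
    step = (hi - lo) / (grid_points - 1)
    ys = []
    for i in range(grid_points):
        x = lo + i * step
        ys.append(min(gaussian_pdf(x, m0, s0), gaussian_pdf(x, m1, s1)))
    return step * (sum(ys) - 0.5 * (ys[0] + ys[-1]))


def separation_score(normal, case):
    """Assumed score: 1 - overlap, so 0 means identical and 1 means disjoint."""
    return 1.0 - overlap_area(normal, case)


# Hypothetical module activities, not taken from the paper.
normal_activity = [0.0, 0.1, 0.2]
case_activity = [10.0, 10.1, 10.2]
score = separation_score(normal_activity, case_activity)
```

Under this reading, a module whose activity distributions barely overlap between normal and case samples scores close to 1, while a module with indistinguishable distributions scores close to 0.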
Figure 3. Network structure of the identified module, which contains 32 genes. A diamond denotes a causal gene of T2DM according to a query of T2D-Db or GAD; a hexagon denotes a T2DM-related gene by functional correlation.
Figure 4. Performance analysis of the identified module biomarker. (A) Robustness of classification accuracy on perturbed data with different ratios of artificial noise. The mean accuracy of the proposed classifier decreases progressively from 84.02% to 73.26% as the noise ratio increases from 1% to 10%. (B) Comparison of biomarkers identified by different methods in GSE18732. The ROC curves show superior classification performance for the module biomarker identified in this work (AUC = 0.96). (C) Histogram of mean accuracy with variance for biomarkers identified by our method, SVM-RFE and PAC. We also randomized the interactions of the background network (PPIs) 50 times and identified a module biomarker using the proposed method; mean accuracy and variance were then calculated for 10-fold cross-validation across the 5 datasets used in this work. The results show stable performance across tissues for the identified biomarkers.
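As a reminder of what the AUC in panel B measures, the sketch below computes it as the probability that a randomly chosen case sample receives a higher classifier score than a randomly chosen normal sample, with ties counted as one half. The function name and the scores are hypothetical and serve only to illustrate the metric; they are not results from the paper.

```python
def roc_auc(case_scores, normal_scores):
    """AUC as the fraction of (case, normal) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for c in case_scores:
        for n in normal_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(normal_scores))


# Hypothetical classifier scores, not taken from the paper.
case_scores = [0.9, 0.8, 0.4]
normal_scores = [0.5, 0.3, 0.2]
auc = roc_auc(case_scores, normal_scores)  # 8 of 9 pairs are ranked correctly
```

An AUC of 1.0 means every case sample outranks every normal sample, and 0.5 corresponds to random ranking.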
Accuracy of different biomarkers across experiments by 10-fold cross-validation
| | | | | | | |
|---|---|---|---|---|---|---|
| | | 80% | 80% | | 83.33% | |
| | 67.39% | 80% | 85% | 58.82% | 75% | |
| | 84.78% | 75% | 60% | 47.06% | 83.33% | |
| 32 top differentially expressed genes | 73.91% | 75% | 75% | 82.35% | | 81.25% ± 0.11 |
| Type 2 diabetes mellitus | 55.43% | 80% | 90% | 35.29% | 91% | 70.34% ± 0.24 |
| B cell receptor signalling pathway | 60.87% | | | 70.59% | 83.33% | 78.96% ± 0.133 |
| Toll like receptor signalling pathway | 48.91% | 70% | 80% | | 83.33% | 74.1% ± 0.156 |
| Biosynthesis of unsaturated fatty acids | 55.43% | 80% | 65% | | 75% | 72.73% ± 0.128 |
| Insulin signalling pathway | 59.78% | | 85% | 58.82% | | 77.72% ± 0.179 |
The best result among the nine biomarkers in each dataset is shown in boldface.
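The summary column can be checked by hand for the one row whose five per-dataset accuracies are all present, "Type 2 diabetes mellitus". The sketch below recomputes the mean and, as an assumption, treats the value after ± as the sample standard deviation of the accuracies expressed as fractions. That reading reproduces the printed 0.24 for this row, although the Figure 4 caption describes the spread as variance, so it should be taken as a plausible interpretation only.

```python
import statistics

# Per-dataset accuracies (%) copied from the "Type 2 diabetes mellitus" row.
t2dm_accuracy = [55.43, 80.0, 90.0, 35.29, 91.0]

mean_percent = statistics.mean(t2dm_accuracy)

# Assumption: the spread is computed on fractions (0 to 1), not on percentages.
spread = statistics.stdev([a / 100.0 for a in t2dm_accuracy])

summary = f"{mean_percent:.2f}% ± {spread:.2f}"
```

The same check cannot be repeated for the other rows as they appear here, because at least one per-dataset value is missing from each.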
Enriched KEGG pathways of biomarker module
| KEGG pathway | |
|---|---|
| Fc epsilon RI signaling pathway | 0.000177920 |
| Type 2 diabetes mellitus | 0.000616943 |
| B cell receptor signaling pathway | 0.002456893 |
| Progesterone mediated oocyte maturation | 0.003642362 |
| ERBB signaling pathway | 0.003764635 |
| Toll like receptor signalling pathway | 0.005905897 |
| Biosynthesis of unsaturated fatty acids | 0.036209447 |
| Mismatch repair | 0.002893819 |
| Neurotrophin signaling pathway | 0.010609564 |
| Insulin signaling pathway | 0.013322982 |