SungHwan Kim, Jae-Hwan Jhong, JungJun Lee, Ja-Yong Koo, ByungYong Lee, SungWon Han.
Abstract
To date, many cancer-related biological pathways have been extensively applied, thanks to the output of burgeoning biomedical research. This leads to a new technical challenge: exploring and validating biological pathways that can characterize transcriptomic mechanisms across different disease subtypes. To accommodate multiple studies, the joint Gaussian graphical model was previously proposed to incorporate nonzero edge effects. However, this model inevitably depends on post hoc analysis to confirm biological significance. To circumvent this drawback, we attempt not only to combine transcriptomic data but also to embed pathway information, itself well-ascertained biological evidence, into the model. To this end, we propose a novel statistical framework for fitting a joint Gaussian graphical model simultaneously with informative pathways that are consistently expressed across multiple studies. In theory, structured nodes can be prespecified with multiple genes. The optimization rule employs the structured input-output lasso model to estimate a sparse precision matrix constructed from the simultaneous effects of multiple studies and structured nodes. In an application to breast cancer data sets, we found that the proposed model is superior in efficiently capturing structures of biological evidence (e.g., pathways). An R software package, nsiGGM, is publicly available at the author's webpage.
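As a minimal illustration of what a sparse precision matrix encodes in a Gaussian graphical model, the sketch below reads edges and partial correlations off a small matrix. It is not the nsiGGM estimator: the 3-gene matrix `theta` and the helper names `partial_correlations` and `edges` are hypothetical and chosen only for illustration. It relies on the standard fact that a zero off-diagonal entry of the precision matrix corresponds to conditional independence, that is, no edge between the two genes.

```python
import math


def partial_correlations(theta):
    """Convert a precision matrix into a matrix of partial correlations.

    For i != j, rho_ij = -theta_ij / sqrt(theta_ii * theta_jj).
    """
    p = len(theta)
    rho = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i == j:
                rho[i][j] = 1.0
            else:
                rho[i][j] = -theta[i][j] / math.sqrt(theta[i][i] * theta[j][j])
    return rho


def edges(theta, tol=1e-8):
    """Return the undirected edges implied by nonzero off-diagonal entries."""
    p = len(theta)
    return [
        (i, j)
        for i in range(p)
        for j in range(i + 1, p)
        if abs(theta[i][j]) > tol
    ]


# Hypothetical toy precision matrix for three genes (positive definite).
# The zero at (0, 2) means genes 0 and 2 are conditionally independent
# given gene 1, so the graph has no edge between them.
theta = [
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -0.5],
    [0.0, -0.5, 1.0],
]

print(edges(theta))
print(partial_correlations(theta)[0][1])
```

A sparser estimated precision matrix therefore yields a network with fewer edges, which is why lasso-type penalties are a natural choice for this kind of estimation.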
Year: 2017 PMID: 28487748 PMCID: PMC5405575 DOI: 10.1155/2017/8520480
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
Algorithm 1: The structured alternating direction method of multipliers (ADMM) algorithm.
Performance comparisons of the nsiGGM with the JGGM and GGM using data simulated along with predefined module genes.
| Methods | # of noise genes | Sensitivity (s.e.) | Specificity (s.e.) | Youden (s.e.) |
|---|---|---|---|---|
| nsiGGM | 30 | 0.2217 (0.0253) | 0.9433 (0.0036) | 0.1650 (0.0257) |
| nsiGGM | 40 | 0.2125 (0.0133) | 0.9472 (0.0053) | 0.1598 (0.0117) |
| nsiGGM | 50 | 0.2034 (0.019) | 0.9481 (0.0035) | 0.1515 (0.0175) |
| JGGM | 30 | 0.2433 (0.04) | 0.8685 (0.0273) | 0.1118 (0.0161) |
| JGGM | 40 | 0.2815 (0.0418) | 0.8321 (0.0309) | 0.1136 (0.0146) |
| JGGM | 50 | 0.1920 (0.0425) | 0.8733 (0.0318) | 0.0653 (0.0124) |
| GGM | 30 | 0.2593 (0.0264) | 0.8325 (0.0214) | 0.0918 (0.0094) |
| GGM | 40 | 0.2752 (0.029) | 0.8050 (0.0257) | 0.0802 (0.0074) |
| GGM | 50 | 0.2177 (0.0303) | 0.8431 (0.0268) | 0.0608 (0.0085) |
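The Youden column in the table above is consistent with the usual definition of the Youden index, sensitivity + specificity - 1. The sketch below recomputes it for the 30-noise-gene rows. The function name `youden_index` and the `rows` dictionary are illustrative assumptions, and the inputs are the rounded means copied from the table, so agreement is only expected up to rounding.

```python
def youden_index(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# (method, number of noise genes) -> (mean sensitivity, mean specificity),
# copied from the 30-noise-gene rows of the comparison table.
rows = {
    ("nsiGGM", 30): (0.2217, 0.9433),
    ("JGGM", 30): (0.2433, 0.8685),
    ("GGM", 30): (0.2593, 0.8325),
}

for (method, n_noise), (sens, spec) in rows.items():
    # Round to four decimals to match the precision reported in the table.
    print(method, n_noise, round(youden_index(sens, spec), 4))
```

Read this way, the nsiGGM's higher Youden index comes from its higher specificity. Its sensitivity is comparable to or slightly lower than that of the JGGM and GGM in most settings.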
Brief descriptions of the three data sets used in the real genomic application.
| Study | Data type | # of samples | # of matched genes | Reference |
|---|---|---|---|---|
| Breast cancer | mRNA | 319 | 10,676 | The Cancer Genome Atlas (TCGA) |
| Breast cancer | mRNA | 134 | 10,676 | GSE7390 |
| Breast cancer | mRNA | 209 | 10,676 | GSE2034 |
The pathway sets from the Molecular Signatures Database (MSigDB) analyzed in the nsiGGM. (Note: asterisks mark pathway genes identified only by the nsiGGM and not by the JGGM.)
| Pathway 1: extracellular region (11 genes) | |
| | |
| Pathway 2: membrane part (11 genes) | |
| | |
| Pathway 3: membrane (14 genes) | |
| | |
| Pathway 4: cytoplasm (13 genes) | |
| | |
| Pathway 5: plasma membrane (12 genes) | |
| | |
| Pathway 6: system development (12 genes) | |
| | |
| Pathway 7: signal transduction (15 genes) | |
| | |
| Pathway 8: multicellular organismal development (15 genes) | |
| | |
| Pathway 9: cell signaling (11 genes) | |
| | |
| Pathway 10: anatomical structure development (13 genes) | |
| | |
| Pathway 11: organ development (11 genes) | |
| MSTN, |
Figure 1: Three gene networks estimated by the nsiGGM. The detection rate of pathway genes is 0.573.
Figure 2: Three gene networks estimated by the JGGM. The detection rate of pathway genes is 0.521.