Linhui Xie, Pradeep Varathan, Kwangsik Nho, Andrew J Saykin, Paul Salama, Jingwen Yan.
Abstract
Large-scale genome-wide association studies (GWASs) have led to the discovery of many genetic risk factors in Alzheimer's disease (AD), such as APOE, TOMM40 and CLU. Despite this significant progress, it remains a major challenge to functionally validate these genetic findings and translate them into targetable mechanisms. Integration of multiple types of molecular data is increasingly used to address this problem. In this paper, we propose a modularity-constrained Lasso model to jointly analyze genotype, gene expression and protein expression data for the discovery of functionally connected multi-omic biomarkers in AD. With a prior network capturing the functional relationships between SNPs, genes and proteins, the newly introduced penalty term maximizes the global modularity of the subnetwork involving the selected markers and encourages the selection of multi-omic markers with dense functional connectivity, rather than individual markers. We applied this new model to real data collected in the ROS/MAP cohort, where cognitive performance was used as the disease quantitative trait. A functionally connected subnetwork involving 276 multi-omic biomarkers, including SNPs, genes and proteins, was identified to bear predictive power. Within this subnetwork, multiple trans-omic paths from SNPs to genes and then proteins were observed. This suggests that the deterioration of cognitive performance in AD patients may result from genetic variations through their cascade effect on the downstream transcriptome and proteome.
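The abstract describes the penalty only in words, so the following is a minimal sketch of how a modularity term could sit next to a Lasso penalty, not the paper's formulation. It assumes the standard Newman modularity matrix B = A - k kᵀ/(2m) built from the prior network, and a relaxed modularity score |w|ᵀ B |w| over the coefficient magnitudes. The function names, the toy network, and the weights `lam_sparse` and `lam_mod` are all hypothetical.

```python
def modularity_matrix(adj):
    """Newman modularity matrix B = A - k k^T / (2m) for an undirected graph."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)  # twice the number of edges
    return [[adj[i][j] - degree[i] * degree[j] / two_m for j in range(n)]
            for i in range(n)]


def modularity_score(w, B):
    """Relaxed modularity of the selected markers: |w|^T B |w| (an assumption)."""
    a = [abs(wj) for wj in w]
    n = len(a)
    return sum(a[i] * B[i][j] * a[j] for i in range(n) for j in range(n))


def objective(w, X, y, B, lam_sparse, lam_mod):
    """Squared-error loss + L1 penalty - modularity reward (illustrative only)."""
    residuals = [yi - sum(xij * wj for xij, wj in zip(xi, w))
                 for xi, yi in zip(X, y)]
    loss = 0.5 * sum(r * r for r in residuals)
    l1 = sum(abs(wj) for wj in w)
    return loss + lam_sparse * l1 - lam_mod * modularity_score(w, B)


# Hypothetical prior network with two edges: marker 0 - marker 1, marker 2 - marker 3.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
B = modularity_matrix(adjacency)

X = [[1.0, 0.0, 0.0, 0.0]]  # one toy sample
y = [1.0]

connected = [1.0, 1.0, 0.0, 0.0]     # selects two linked markers
disconnected = [1.0, 0.0, 1.0, 0.0]  # selects two unlinked markers

obj_connected = objective(connected, X, y, B, lam_sparse=0.1, lam_mod=0.5)
obj_disconnected = objective(disconnected, X, y, B, lam_sparse=0.1, lam_mod=0.5)
```

In this toy setting both coefficient vectors fit the single sample equally well and have the same L1 norm, so the modularity term alone makes the connected selection cheaper. That is the behavior the abstract attributes to the penalty.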
Year: 2020 PMID: 32555747 PMCID: PMC7299377 DOI: 10.1371/journal.pone.0234748
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Demographic information of the ROS/MAP participants included in this study.
| Diagnosis | CN | MCI | AD |
|---|---|---|---|
| Subject Number | 115 | 67 | 80 |
| ROS/MAP | 69/46 | 27/40 | 40/40 |
| Male/Female | 51/64 | 28/39 | 31/49 |
| Education (mean ± std.) | 16.9 ± 3.5 | 16.6 ± 3.3 | 16.9 ± 3.8 |
| Age (mean ± std.) | 83.0 ± 4.7 | 85.0 ± 4.2 | 86.3 ± 3.7 |
Fig 1. The selection of SNPs in upstream 5K boundary for each gene.
Performance comparison on test set between M-Lasso and other methods.
| Metric | Method | fold 1 | fold 2 | fold 3 | fold 4 | fold 5 | Mean |
|---|---|---|---|---|---|---|---|
| RMSE | M-Lasso | 1.082 | 0.959 | | | | |
| | G-Lasso | 0.876 | 1.128 | 1.019 | 1.178 | 1.415 | 1.123 |
| | Elastic Net | 0.871 | 1.073 | 0.936 | 0.968 | 1.577 | 1.085 |
| | Lasso | 0.852 | 0.931 | 2.259 | 1.207 | | |
| MAE | M-Lasso | 0.872 | | | | | |
| | G-Lasso | 0.726 | 0.893 | 0.751 | 0.881 | 0.872 | 0.825 |
| | Elastic Net | 0.743 | 0.861 | 0.687 | 0.757 | 0.857 | 0.781 |
| | Lasso | 0.726 | 0.686 | 0.758 | 0.992 | 0.799 | |
1 RMSE: root mean square error;
2 MAE: mean absolute error.
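For reference, the two error measures reported in the table can be computed as in the short sketch below. It uses the standard definitions of RMSE and MAE. The observed and predicted values are made-up toy numbers, not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: sqrt of the mean squared difference."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values for illustration only (hypothetical cognitive scores).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 8.0]

print(rmse(observed, predicted))  # 2.0
print(mae(observed, predicted))   # 1.0
```

RMSE squares each error before averaging, so a single large miss raises it more than it raises MAE. In the toy example one error of 4 gives an RMSE of 2.0 but an MAE of 1.0.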
Fig 2. Top 7 connected components with biomarkers identified using M-Lasso mapped to prior network.
Inside the red box is the subnetwork involving APOE gene, a top risk factor of Alzheimer’s disease.
Nodes with top centrality values in the largest connected subnetwork.
| Degree | |
| Average Shortest Path Length | |
| Betweenness | |
| Closeness |
1 rs-IDs: SNPs; Bold: genes; The rest: proteins.
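The four centrality measures named in the table can be illustrated with the sketch below, which assumes an unweighted, undirected, connected network. It uses the textbook definitions: closeness is the reciprocal of the average shortest path length, and betweenness is computed unnormalized with Brandes' algorithm. The record does not say which tool or normalization the authors used, and the three-node network and its node names are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def basic_centralities(graph):
    """Degree, average shortest path length and closeness for each node."""
    result = {}
    for node in graph:
        dist = bfs_distances(graph, node)
        others = [d for n, d in dist.items() if n != node]
        avg_spl = sum(others) / len(others)
        result[node] = {
            "degree": len(graph[node]),
            "avg_shortest_path": avg_spl,
            "closeness": 1.0 / avg_spl,
        }
    return result


def betweenness(graph):
    """Unnormalized betweenness centrality via Brandes' algorithm."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        sigma = {v: 0 for v in graph}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in graph}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):  # accumulate dependencies, farthest first
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair is counted from both endpoints in an undirected graph.
    return {v: c / 2.0 for v, c in score.items()}


# Hypothetical trans-omic path: SNP - gene - protein.
toy_network = {
    "snp_1": ["gene_A"],
    "gene_A": ["snp_1", "protein_A"],
    "protein_A": ["gene_A"],
}

basic = basic_centralities(toy_network)
between = betweenness(toy_network)
```

In this toy path the middle node `gene_A` has the highest degree, the shortest average path length, and the only non-zero betweenness, since it lies on the single shortest path between the other two nodes.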
Top enriched KEGG pathways by the genes and proteins in the largest connected subnetwork.
| Pathway | # of markers in the pathway | # of genes in the pathway | p-value | Corrected p-value |
|---|---|---|---|---|
| PI3K-Akt signaling pathway | 64 | 354 | 2.24E-36 | 3.39E-34 |
| Focal adhesion | 41 | 199 | 2.81E-25 | 4.22E-23 |
| Pathways in cancer | 59 | 530 | 8.78E-22 | 1.31E-19 |
| ECM-receptor interaction | 25 | 82 | 4.04E-20 | 5.97E-18 |
| Human papillomavirus infection | 42 | 330 | 1.54E-17 | 2.26E-15 |
| Ras signaling pathway | 34 | 232 | 3.65E-16 | 5.34E-14 |
| Small cell lung cancer | 22 | 93 | 3.12E-15 | 4.53E-13 |
| MAPK signaling pathway | 36 | 295 | 1.65E-14 | 2.38E-12 |
| Kaposi sarcoma-associated herpesvirus infection | 27 | 186 | 6.77E-13 | 9.67E-11 |
| Toxoplasmosis | 21 | 113 | 2.20E-12 | 3.13E-10 |
| Prostate cancer | 18 | 97 | 9.14E-11 | 1.29E-08 |
| Chronic myeloid leukemia | 16 | 76 | 1.45E-10 | 2.03E-08 |
| Phospholipase D signaling pathway | 21 | 148 | 4.49E-10 | 6.25E-08 |
| Amoebiasis | 17 | 96 | 6.70E-10 | 9.25E-08 |
| Human cytomegalovirus infection | 25 | 225 | 1.83E-09 | 2.51E-07 |
| Relaxin signaling pathway | 19 | 130 | 1.91E-09 | 2.60E-07 |
| Acute myeloid leukemia | 14 | 66 | 1.93E-09 | 2.60E-07 |
| Proteoglycans in cancer | 23 | 201 | 4.84E-09 | 6.49E-07 |
| ErbB signaling pathway | 15 | 85 | 7.42E-09 | 9.87E-07 |
| Rap1 signaling pathway | 23 | 206 | 7.81E-09 | 1.03E-06 |
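Pathway enrichment p-values of this kind are commonly obtained from a hypergeometric (one-sided Fisher) test. The sketch below shows that calculation under the assumption that such a test applies here, since the record does not state which test or correction method was used. The function name and the toy numbers are hypothetical. The number of selected markers and the background size are not given in the table, so they are not filled in.

```python
from math import comb


def enrichment_p_value(k, pathway_size, n_selected, background_size):
    """Upper-tail hypergeometric probability P(X >= k).

    k               : selected markers that fall in the pathway
    pathway_size    : genes annotated to the pathway
    n_selected      : total selected markers tested
    background_size : total genes in the background set
    """
    total = comb(background_size, n_selected)
    upper = min(pathway_size, n_selected)
    tail = sum(
        comb(pathway_size, i)
        * comb(background_size - pathway_size, n_selected - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Toy example: 10 background genes, a 5-gene pathway, 5 selected markers,
# all 5 of which fall in the pathway.
p_toy = enrichment_p_value(k=5, pathway_size=5, n_selected=5, background_size=10)
print(p_toy)  # 1/252, about 0.00397
```

The table's first two numeric columns correspond to `k` and `pathway_size`. Reproducing a reported p-value would also require the selected-marker count and background size that the authors used.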