Daniel Lancour, Adam Naj, Richard Mayeux, Jonathan L Haines, Margaret A Pericak-Vance, Gerard D Schellenberg, Mark Crovella, Lindsay A Farrer, Simon Kasif.
Abstract
Improving accuracy in genetic studies would greatly accelerate understanding of the genetic basis of complex diseases. One way to achieve such an improvement for risk variants identified by genome-wide association studies (GWAS) is to incorporate previously known biology when screening variants across the genome. We developed a simple approach for improving the prioritization of candidate disease genes that combines a network diffusion of scores from known disease genes over a protein network with a novel integration of GWAS risk scores, and tested this approach on a large Alzheimer disease (AD) GWAS dataset. Using a statistical bootstrap approach, we cross-validated the method and showed for the first time that a network approach improves the expected replication rates in GWAS. Several novel AD genes were predicted, including CR2, SHARPIN, and PTPN2. Our re-prioritized results are enriched for established AD-associated biological pathways including inflammation, immune response, and metabolism, whereas standard non-prioritized results were not. Our findings support a strategy of considering network information when investigating genetic risk factors.
Year: 2018 PMID: 29684019 PMCID: PMC5933817 DOI: 10.1371/journal.pgen.1007306
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
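The abstract describes diffusing scores from known disease genes across a protein network. A minimal sketch of one common diffusion scheme, random walk with restart, is given below on a toy graph. This record does not say which diffusion kernel the authors used, so the choice of algorithm, the restart probability, and the node names (`S1`, `S2`, `A`, `B`, `C`) are all illustrative assumptions.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Diffuse score from seed nodes over an undirected graph.

    adj   : dict mapping node -> list of neighbours (every node needs >= 1
            neighbour here, otherwise its probability mass is dropped)
    seeds : set of nodes that receive the initial (and restart) score
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fraction of the walkers jumps back to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk term: the rest is split evenly among each node's neighbours.
        for u in nodes:
            if not adj[u]:
                continue
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: two seed genes attached to A, then a chain A-B-C.
toy_adj = {
    "S1": ["A"],
    "S2": ["A"],
    "A": ["S1", "S2", "B"],
    "B": ["A", "C"],
    "C": ["B"],
}
scores = random_walk_with_restart(toy_adj, seeds={"S1", "S2"})
for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, round(scores[gene], 4))
```

In this sketch, non-seed nodes closer to the seeds end up with higher diffusion scores (`A` above `B` above `C`), which is the property a diffusion-based proximity ranking relies on.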
GSEA results after ranking genes by GWAS only Z-scores.
| PATHWAY NAME | SIZE | ES | NES | FWER p-val |
|---|---|---|---|---|
| NAKAYAMA_SOFT_TISSUE_TUMORS_PCA1_UP | 61 | 0.440 | 2.108 | 0.134 |
| FARMER_BREAST_CANCER_CLUSTER_5 | 17 | 0.610 | 2.016 | 0.261 |
| KEGG_ANTIGEN_PROCESSING_AND_PRESENTATION | 64 | 0.397 | 1.950 | 0.418 |
| GOLUB_ALL_VS_AML_DN | 18 | 0.560 | 1.888 | 0.591 |
| CHIARETTI_T_ALL_REFRACTORY_TO_THERAPY | 23 | 0.499 | 1.879 | 0.608 |
| SHIN_B_CELL_LYMPHOMA_CLUSTER_5 | 15 | 0.541 | 1.772 | 0.864 |
| ZUCCHI_METASTASIS_DN | 35 | 0.423 | 1.769 | 0.873 |
| KIM_HYPOXIA | 22 | 0.456 | 1.704 | 0.964 |
| DELPUECH_FOXO3_TARGETS_DN | 37 | 0.400 | 1.672 | 0.985 |
| NIELSEN_LIPOSARCOMA_UP | 15 | 0.514 | 1.650 | 0.993 |
| ROVERSI_GLIOMA_COPY_NUMBER_UP | 56 | 0.351 | 1.643 | 0.996 |
RAD genes and the type of study that identified them.
| Chr. | Gene | Evidence | Chr. | Gene | Evidence | Chr. | Gene | Evidence |
|---|---|---|---|---|---|---|---|---|
| 1 | | GWAS–AD | 7 | | GWAS–AD | 12 | | GWAS–endo. |
| 1 | | Linkage | 7 | | GWAS–AD | 13 | | GWAS–AD |
| 2 | | GWAS–AD | 7 | | GWAS–AD | 14 | | GWAS–endo. |
| 2 | | GWAS–AD | 8 | | GWAS–AD | 14 | | Linkage |
| 2 | | WES | 8 | | GWAS–AD | 14 | | GWAS–AD |
| 3 | | GWAS–endo. | 8 | | GWAS–AD | 14 | | GWAS–endo. |
| 3 | | GWAS–endo. | 8 | | GWAS–endo. | 15 | | GWAS–AD |
| 4 | | WES | 9 | | GWAS–endo. | 16 | | GWAS–AD |
| 4 | | GWAS–endo. | 9 | | GWAS–endo. | 17 | | GWAS–AD |
| 5 | | GWAS–AD | 10 | | GWAS–AD | 17 | | GWAS–AD |
| 5 | | CGS | 10 | | CGS | 17 | | GWAS–AD |
| 5 | | GWAS–AD | 10 | | CGS | 17 | | CGS |
| 6 | | GWAS–AD | 11 | | GWAS–AD | 19 | | GWAS–AD |
| 6 | | WES | 11 | | GWAS–AD | 19 | | WES |
| 6 | | GWAS–endo. | 11 | | GWAS–AD | 19 | | Linkage |
| 6 | | GWAS–AD | 11 | | GWAS–AD | 19 | | GWAS–AD |
| 6 | | GWAS–AD | 11 | | GWAS–AD | 20 | | GWAS–AD |
| 7 | | GWAS–AD | 11 | | GWAS–AD | 21 | | Targeted Seq. |
| 7 | | WES | 11 | | CGS | 21 | | GWAS–endo. |
| 7 | | GWAS–AD | 11 | | GWAS–endo. | | | |
GWAS = genome-wide association study; Linkage = family-based linkage study; endo. = AD-related endophenotype; CGS = candidate gene study; WES = whole-exome sequencing; Targeted Seq. = targeted gene resequencing. Genes highlighted in bold text met more stringent criteria and were included in the conservative set of RAD genes.
Proximity between RAD genes in PPI network.
Each RAD gene was ranked against the other 19,972 genes in the network by its degree (number of interactions in the network), its ASP distance to the RAD genes, and its total diffusion distance from the RAD genes. The average rank of the RAD genes was 7,949 using ASP (60th percentile, t-test p = 0.015) and 6,959 using diffusion (65th percentile, t-test p = 0.00054).
| Gene | Degree rank | ASP rank | Diffusion rank | Gene | Degree rank | ASP rank | Diffusion rank | Gene | Degree rank | ASP rank | Diffusion rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| APP | 2 | 2 | 1248 | MEF2C | 3012.5 | 3072.5 | 2619 | SORCS2 | 12984 | 14902.5 | 1081 |
| CASP8 | 238.5 | 76 | 754 | ABI3 | 3012.5 | 10739 | 3228 | SORCS3 | 14153 | 16106.5 | 1170 |
| PSEN1 | 558.5 | 119.5 | 441 | SORL1 | 4372.5 | 9964 | 2675 | ABCG1 | 14153 | 7689.5 | 16627 |
| MAPT | 600.5 | 9 | 342 | TPBG | 4516.5 | 4551.5 | 5100 | TP53INP1 | 14153 | 11727 | 10975 |
| PTK2B | 800 | 175 | 670 | PDGFRL | 4862 | 13192.5 | 7434 | PLXNA4 | 14153 | 15296.5 | 14933 |
| CLU | 883 | 785 | 1935 | LMX1B | 5236.5 | 10441.5 | 7905 | KCNMB2 | 15703.5 | 11038.5 | 12216 |
| PFDN1 | 930.5 | 2268 | 4465 | HLA-DRB5 | 5666.5 | 4554 | 7104 | SORCS1 | 15703.5 | 17153 | 9425 |
| CD2AP | 1043.5 | 2275.5 | 585 | CD33 | 5666.5 | 2281.5 | 1682 | MS4A6A | 15703.5 | 19883.5 | 19955 |
| PSEN2 | 1188 | 454 | 642 | PLD3 | 5891.5 | 4554.5 | 4320 | ABCA7 | 15703.5 | 7689.5 | 17609 |
| AKAP9 | 1230 | 4547.5 | 2996 | CELF1 | 5891.5 | 789 | 3793 | SRRM4 | 18290 | 18462.5 | 18934.5 |
| PLCG2 | 1255 | 281 | 868 | PILRA | 6640.5 | 13274.5 | 8762 | CASS4 | 18290 | 14847.5 | 16647.5 |
| APOE | 1517 | 283 | 626 | CR1 | 7296.5 | 15652 | 12460 | ECHDC3 | 18290 | 19700.5 | 19390 |
| INPP5D | 1582 | 455 | 795 | GALNT7 | 7296.5 | 7688 | 8782 | PLD4 | 18290 | 7689.5 | 17433 |
| BIN1 | 1691 | 457 | 977 | MVB12B | 7995.5 | 7688.5 | 4498 | TREM2 | 18290 | 19587 | 1566 |
| TRIP4 | 2509 | 4548.5 | 5679 | ACE | 8878 | 4555 | 9212 | SLC10A2 | 18290 | 7689.5 | 17128 |
| PICALM | 2640 | 3070.5 | 1207 | EPHA1 | 9380.5 | 7689 | 8437 | ZNF804B | 18290 | 18465 | 18406 |
| KANSL1 | 2780 | 3069.5 | 3734 | COBL | 9928.5 | 13930 | 9416 | NCR2 | 18290 | 19587 | 1566 |
| FERMT2 | 2857.5 | 1496.5 | 3313 | UNC5C | 12984 | 14796.5 | 15064 | | | | |
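The fractional ranks in this table (for example 3012.5) are what tie-averaged ranking produces, and the quoted percentiles follow from the average ranks by simple arithmetic. The sketch below illustrates both. The function names are hypothetical, the toy degree values are made up, and using 19,972 as the network size is an approximation taken from the table description.

```python
def average_ranks(scores, descending=True):
    """Rank items 1..n; tied items share the mean of the positions they occupy."""
    order = sorted(scores, key=scores.get, reverse=descending)
    ranks = {}
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of items tied with order[i].
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def percentile_from_rank(rank, n_genes):
    """Percentage of the network that ranks below the given rank."""
    return 100.0 * (1.0 - rank / n_genes)


# Made-up degrees: B and C tie for positions 2 and 3, so both get rank 2.5.
toy_degree = {"A": 9, "B": 4, "C": 4, "D": 1}
print(average_ranks(toy_degree))  # {'A': 1.0, 'B': 2.5, 'C': 2.5, 'D': 4.0}

N_GENES = 19972  # approximate network size (assumption, see lead-in)
print(round(percentile_from_rank(7949, N_GENES)))  # 60 (ASP)
print(round(percentile_from_rank(6959, N_GENES)))  # 65 (diffusion)
```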
Proximity of non-RAD hub genes to RAD genes.
| Gene | Degree rank | ASP rank | Diffusion rank |
|---|---|---|---|
| UBC | 1 | 1 | 1433 |
| SUMO2 | 2 | 20.5 | 1570 |
| CUL3 | 3 | 51 | 2515 |
| SUMO1 | 4 | 20.5 | 1502 |
| EGFR | 5.5 | 3 | 937 |
| TP53 | 5.5 | 7 | 983 |
| GRB2 | 7 | 2 | 905 |
| SUMO3 | 8 | 181 | 2433 |
| HSP90AA1 | 9 | 10 | 978 |
| MDM2 | 10 | 51 | 1096 |
Top predicted AD genes using combination approach.
| Gene | GWAS Z-score | Network Z-score | Combined Z-score |
|---|---|---|---|
| CR2 | 4.084 | 2.832 | 4.857 |
| SHARPIN | 3.983 | 1.320 | 4.185 |
| PTPN2 | 3.805 | 1.259 | 3.997 |
| C4B | 2.846 | 2.928 | 3.750 |
| TUBB2B | 3.166 | 1.314 | 3.428 |
| EPS8 | 3.156 | 1.156 | 3.358 |
| PSMC3 | 3.145 | 1.036 | 3.302 |
| STRAP | 3.051 | 1.157 | 3.262 |
| HSPA2 | 2.977 | 1.325 | 3.258 |
| STUB1 | 2.895 | 1.407 | 3.213 |
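This record does not give the formula used to combine the GWAS and network Z-scores. As a hedged illustration, a weighted sum of Z-scores (a weighted Stouffer combination) with a network-to-GWAS weight ratio of about 0.414 reproduces every row of the table above to within rounding. That ratio is inferred from the tabulated values only. It is an assumption, not a statement of the authors' method, and the names `combine_z`, `W_GWAS`, and `W_NET` are hypothetical.

```python
import math

# Weights inferred from the table values (assumption, not from the source text).
W_GWAS, W_NET = 1.0, 0.414


def combine_z(z_gwas, z_net, w_gwas=W_GWAS, w_net=W_NET):
    """Weighted Stouffer combination of two Z-scores.

    Dividing by the root of the summed squared weights keeps the result on
    the Z-score scale when the two inputs are independent standard normals.
    """
    return (w_gwas * z_gwas + w_net * z_net) / math.sqrt(w_gwas**2 + w_net**2)


# (GWAS Z, network Z, combined Z) as listed in the table.
TABLE = {
    "CR2": (4.084, 2.832, 4.857),
    "SHARPIN": (3.983, 1.320, 4.185),
    "PTPN2": (3.805, 1.259, 3.997),
    "C4B": (2.846, 2.928, 3.750),
    "TUBB2B": (3.166, 1.314, 3.428),
    "EPS8": (3.156, 1.156, 3.358),
    "PSMC3": (3.145, 1.036, 3.302),
    "STRAP": (3.051, 1.157, 3.262),
    "HSPA2": (2.977, 1.325, 3.258),
    "STUB1": (2.895, 1.407, 3.213),
}

for gene, (z_gwas, z_net, z_listed) in TABLE.items():
    print(f"{gene:8s} computed={combine_z(z_gwas, z_net):.3f} listed={z_listed:.3f}")

max_abs_error = max(abs(combine_z(g, n) - c) for g, n, c in TABLE.values())
print(f"largest deviation from the table: {max_abs_error:.4f}")
```

Under this weighting the GWAS score dominates, so a gene needs solid GWAS evidence to reach the top of the list, while a strong network score (as for CR2 and C4B) lifts it further.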
GSEA results after ranking genes by combined Z-scores.
| PATHWAY NAME | SIZE | ES | NES | FWER p-val |
|---|---|---|---|---|
| KEGG_ANTIGEN_PROCESSING_AND_PRESENTATION | 64 | 0.487 | 2.231 | 0.042 |
| DELPUECH_FOXO3_TARGETS_DN | 37 | 0.527 | 2.181 | 0.070 |
| BIOCARTA_PGC1A_PATHWAY | 20 | 0.613 | 2.180 | 0.071 |
| KEGG_SYSTEMIC_LUPUS_ERYTHEMATOSUS | 85 | 0.436 | 2.171 | 0.073 |
| MURAKAMI_UV_RESPONSE_6HR_DN | 20 | 0.592 | 2.124 | 0.117 |
| GOLUB_ALL_VS_AML_DN | 18 | 0.629 | 2.118 | 0.127 |
| REACTOME_RNA_POL_I_PROMOTER_OPENING | 28 | 0.552 | 2.100 | 0.149 |
| MODY_HIPPOCAMPUS_PRENATAL | 36 | 0.519 | 2.098 | 0.153 |
| FARMER_BREAST_CANCER_CLUSTER_5 | 17 | 0.632 | 2.090 | 0.161 |
| ZUCCHI_METASTASIS_DN | 35 | 0.516 | 2.067 | 0.197 |
| NAKAYAMA_SOFT_TISSUE_TUMORS_PCA1_UP | 61 | 0.456 | 2.058 | 0.205 |
| INGA_TP53_TARGETS | 15 | 0.635 | 2.049 | 0.222 |