Yauheniya Cherkas, Nandini Raghavan, Stephan Francke, Frank Defalco, Marsha A Wilcox.
Abstract
In addition to methods that identify common variants associated with susceptibility to common diseases, there has been increasing interest in approaches that can identify rare genetic variants. We use the simulated data provided to the participants of Genetic Analysis Workshop 17 (GAW17) to identify both rare and common single-nucleotide polymorphisms (SNPs) and pathways associated with disease status. We apply a rare variant collapsing approach and the usual association tests for common variants to identify candidates for further analysis using pathway-based and tree-based ensemble approaches. We use the mean log p-value (MLP) approach to identify a top set of pathways and compare it with the pathways used in the simulation of the GAW17 data set. We conclude that the MLP approach is able to identify those pathways in the top list, along with related pathways. We also apply the stochastic gradient boosting (SGB) approach to the selected subset of SNPs. When we compare the results of this tree-based method with the list of SNPs used in the data set simulation, we observe a number of false positives in addition to the correct SNPs.
Year: 2011 PMID: 22373203 PMCID: PMC3287936 DOI: 10.1186/1753-6561-5-S9-S94
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Figure 1. Plot of the first two components of multidimensional scaling
Figure 2. Manhattan plot of collapsing approach p-values for 5-kb sliding window
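To make the sliding-window collapsing idea behind Figure 2 concrete, here is a minimal sketch in Python. It collapses all rare variants in a window into one carrier indicator per individual and tests that indicator against case status. The record does not state the minor allele frequency cutoff, the window step, or the test statistic, so the `maf_cutoff`, `step`, and Fisher's exact test used here are assumptions, and the function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Hypergeometric probability of every table with the same margins.
    probs = {x: comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)
             for x in range(lo, hi + 1)}
    p_obs = probs[a]
    # Sum the tables that are no more likely than the observed one.
    return min(1.0, sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9)))


def collapsing_scan(positions, mafs, genotypes, is_case,
                    window=5000, step=5000, maf_cutoff=0.01):
    """Return (window_start, p_value) for each window holding a rare SNP.

    positions  : base-pair position of each SNP
    mafs       : minor allele frequency of each SNP
    genotypes  : one list of minor-allele counts (0/1/2) per individual
    is_case    : True for affected individuals, False for controls
    maf_cutoff : assumed rarity threshold (not taken from the source)
    """
    results = []
    start, last = min(positions), max(positions)
    while start <= last:
        # Rare SNPs that fall inside the current window.
        idx = [j for j, (pos, maf) in enumerate(zip(positions, mafs))
               if start <= pos < start + window and maf < maf_cutoff]
        if idx:
            # Collapse: does the individual carry any rare allele here?
            carrier = [any(g[j] > 0 for j in idx) for g in genotypes]
            a = sum(1 for c, y in zip(carrier, is_case) if c and y)
            b = sum(1 for c, y in zip(carrier, is_case) if not c and y)
            c_ = sum(1 for c, y in zip(carrier, is_case) if c and not y)
            d = sum(1 for c, y in zip(carrier, is_case) if not c and not y)
            results.append((start, fisher_exact_two_sided(a, b, c_, d)))
        start += step
    return results
```

Plotting `-log10` of the returned p-values against the window start positions would give a Manhattan-style view of the scan. Setting `step` smaller than `window` produces overlapping windows.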
Major IPA pathways identified by the MLP approach using five statistics for rare and common variants and their corresponding p-values
| Statistic | Top 6 pathways selected | p-value |
|---|---|---|
| Rare variants | | |
| Gene-wise collapsing | 1. Androgen and estrogen metabolism | 0.0006 |
| | 2. Sphingolipid metabolism | 0.0006 |
| | 3. Phenylalanine metabolism | 0.0015 |
| | 4. Death receptor signaling | 0.002 |
| | 5. Stilbene, coumarine, and lignin biosynthesis | 0.0036 |
| | 6. TWEAK signaling | 0.0043 |
| Minimum | 1. Notch signaling | 0.0248 |
| | 2. Hypoxia signaling in the cardiovascular system | 0.0403 |
| | 3. Nitric oxide signaling in the cardiovascular system | 0.0409 |
| | 4. VEGF signaling | 0.0585 |
| | 5. Glutamate receptor signaling | 0.0642 |
| | 6. Glutamate metabolism | 0.0741 |
| Mean log | 1. Cyanoamino acid metabolism | 0.0032 |
| | 2. Ubiquinone biosynthesis | 0.0148 |
| | 3. Nitrogen metabolism | 0.0267 |
| | 4. Alanine and aspartate metabolism | 0.0392 |
| | 5. GABA receptor signaling | 0.0423 |
| | 6. FXR/RXR activation | 0.0438 |
| Common variants | | |
| Minimum | 1. Apoptosis signaling | 0.0083 |
| | 2. Pyrimidine metabolism | 0.0232 |
| | 3. CNTF signaling | 0.0429 |
| | 4. FLT3 signaling in hematopoietic progenitor cells | 0.0601 |
| | 5. Role of NANOG in mammalian embryonic stem cell pluripotency | 0.0847 |
| | 6. EGF signaling | 0.0908 |
| Mean log p-value for SNPs in a gene | 1. Pyrimidine metabolism | 0.0021 |
| | 2. CNTF signaling | 0.0127 |
| | 3. Melanocyte development and pigmentation signaling | 0.0221 |
| | 4. JAK/Stat signaling | 0.0327 |
| | 5. IL-15 signaling | 0.0356 |
| | 6. FLT3 signaling in hematopoietic progenitor cells | 0.045 |
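The gene-level summaries named in the table (minimum p-value and mean log p-value) and the pathway-level MLP score can be illustrated with a short Python sketch. The record does not describe how pathway significance was computed, so the permutation null below (random gene sets of the same size), the `-log10` scale, and all function and parameter names are assumptions made for illustration.

```python
import math
import random


def gene_statistics(snp_pvalues):
    """Summarise one gene's SNP p-values as (minimum p, mean of -log10 p)."""
    min_p = min(snp_pvalues)
    mean_log = sum(-math.log10(p) for p in snp_pvalues) / len(snp_pvalues)
    return min_p, mean_log


def mlp_pathway_pvalue(gene_pvalues, pathway_genes, n_perm=10000, seed=0):
    """Score a pathway by the mean of -log10(p) over its genes.

    gene_pvalues  : dict mapping gene name to a gene-level p-value
    pathway_genes : list of gene names in the pathway
    Returns (observed mean score, permutation p-value). The null
    distribution is built from random gene sets of the same size,
    which is an assumption rather than the source's stated procedure.
    """
    rng = random.Random(seed)
    scores = {g: -math.log10(p) for g, p in gene_pvalues.items()}
    members = [g for g in pathway_genes if g in scores]
    observed = sum(scores[g] for g in members) / len(members)
    all_scores = list(scores.values())
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(all_scores, len(members))
        if sum(sample) / len(sample) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Ranking pathways by the returned p-value and keeping the six smallest would yield a list with the same shape as each block of the table.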
Top SNPs and corresponding genes identified using the SGB approach
| Top SNPs identified (from highest to lowest variable importance) | ||||||||
|---|---|---|---|---|---|---|---|---|
| SNPs | C9S3621 | C6S6142 | C5S237 | C9S4860 | C22S1374 | C2S5630 | C11S60 | |
| | C13S905 | C7S2893 | C19S282 | C6S1097 | C10S2632 | C2S955 | C14S784 | |
| | C1S5779 | C7S2446 | C9S1469 | C12S4668 | C2S1087 | C2S2148 | C1S10506 | C6S2129 |
| | C15S3343 | C12S4188 | C14S1863 | C2S4601 | C6S2469 | C10S2533 | C19S609 | C10S5515 |
| | C1S5530 | C17S3017 | C9S1225 | C12S3028 | C5S3461 | C19S1762 | C1S9584 | |
| | C19S1849 | C9S4013 | C22S1405 | C12S622 | C12S7056 | C2S7558 | C16S3421 | C12S552 |
| | C2S4407 | C1S996 | C22S1351 | C20S2310 | C22S1158 | C15S4060 | C17S1262 | C3S1305 |
| | C7S158 | C10S387 | C17S2377 | C7S1877 | C1S9718 | C10S4422 | C4S2872 | C7S3971 |
| | C2S689 | C8S3322 | C10S6566 | C14S20 | C7S1076 | C11S3224 | C1S7413 | C22S146 |
| | C8S4238 | C8S4028 | C18S2322 | C6S6040 | C12S5220 | C6S6177 | C19S3382 | C19S2528 |
| | C1S9506 | C4S4283 | C12S3528 | C11S2585 | C17S2376 | C12S5446 | C17S4841 | C1S10200 |
| | C4S2239 | C7S3613 | C5S4072 | C11S6503 | C11S4881 | C1S10800 | C9S123 | C2S7414 |
| | C2S1139 | C3S3962 | C7S3490 | C10S5783 | C11S1683 | C9S2613 | C11S2532 | C7S4111 |
| | C18S2310 | C2S4079 | C6S2366 | C8S627 | C2S6985 | C1S7941 | C4S4339 | |
| | C3S3938 | C22S875 | C1S7092 | C7S2590 | C11S2871 | C6S2216 | C6S5677 | |
| | C7S4646 | C8S850 | C8S271 | C4S2296 | C10S386 | C9S5111 | C15S3138 | C1S7427 |
| | C17S3510 | C3S96 | C22S385 | C1S3900 | C3S4638 | C21S672 | C1S1388 | C10S2683 |
| | C13S1168 | C7S3697 | C2S4909 | C11S1280 | C2S2154 | C12S4591 | C3S1176 | C22S2039 |
| | C11S3320 | C2S873 | C9S3100 | C2S7390 | C12S5526 | C11S1599 | C6S4552 | C1S10256 |
| | C10S3243 | C12S5510 | C4S2678 | C4S2970 | C2S8207 | C16S560 | C6S7138 | C17S321 |
| | C20S1844 | C12S5445 | C10S2670 | C1S4009 | C17S2026 | C9S3554 | C13S1660 | C14S590 |
| | C10S6432 | C9S759 | C19S4625 | C1S9511 | C8S934 | C6S4242 | C18S1560 | C4S97 |
| | C15S3744 | C7S397 | C19S4658 | C12S4534 | C9S2083 | C19S5271 | C7S3898 | C1S4838 |
| | C9S1607 | C4S3834 | C11S5644 | C15S2848 | C10S3777 | C3S3657 | C14S122 | C14S3426 |
| | C16S1482 | C4S3076 | C9S2542 | C2S6995 | C21S778 | C9S1835 | C15S3559 | C8S3416 |
| | C5S2032 | C1S10813 | C10S5690 | C1S3676 | C6S4400 | C13S163 | C22S645 | C12S3039 |
| | C1S10164 | C6S7164 | C22S1222 | C4S649 | C19S277 | C1S7408 | C1S1542 | C4S186 |
| Genes | ||||||||
Boldface indicates results that match the simulated model.
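As an illustration of how a stochastic gradient boosting model can rank SNPs by variable importance, the following Python sketch fits single-split trees (stumps) to residuals on a random subsample at each iteration and accumulates the squared-error reduction credited to each SNP. This is a simplified stand-in, not the authors' configuration: the squared-error loss on 0/1 disease status, the stump base learner, and the `n_trees`, `learning_rate`, and `subsample` values are all assumptions, and the function names are hypothetical.

```python
import random


def fit_stump(X, r, rows):
    """Best single-SNP split (genotype <= t) for residuals r on the given rows."""
    best = None  # (gain, feature, threshold, left_mean, right_mean)
    total = sum(r[i] for i in rows)
    n = len(rows)
    for j in range(len(X[0])):
        for t in (0, 1):  # genotypes are coded 0/1/2
            left = [r[i] for i in rows if X[i][j] <= t]
            if not left or len(left) == n:
                continue  # split leaves one side empty
            sl, nl = sum(left), len(left)
            sr, nr = total - sl, n - nl
            # Reduction in squared error relative to predicting the overall mean.
            gain = sl * sl / nl + sr * sr / nr - total * total / n
            if best is None or gain > best[0]:
                best = (gain, j, t, sl / nl, sr / nr)
    return best


def sgb_importance(X, y, n_trees=100, learning_rate=0.1, subsample=0.5, seed=0):
    """Return normalised per-SNP importance from boosted stumps.

    X : one list of genotypes (0/1/2) per individual
    y : disease status coded 0/1
    """
    rng = random.Random(seed)
    n = len(y)
    pred = [sum(y) / n] * n
    importance = [0.0] * len(X[0])
    for _ in range(n_trees):
        residuals = [y[i] - pred[i] for i in range(n)]
        # The "stochastic" part: each tree sees a random subsample
        # drawn without replacement.
        rows = rng.sample(range(n), max(2, int(subsample * n)))
        stump = fit_stump(X, residuals, rows)
        if stump is None:
            continue
        gain, j, t, left_mean, right_mean = stump
        importance[j] += gain
        for i in range(n):
            step = left_mean if X[i][j] <= t else right_mean
            pred[i] += learning_rate * step
    total = sum(importance)
    return [v / total for v in importance] if total > 0 else importance
```

Sorting SNP identifiers by the returned values, largest first, gives an ordering analogous to the "highest to lowest variable importance" listing in the table. A SNP can receive nonzero importance through chance association in a subsample, which is one way false positives enter such a ranking.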