Xiangdong Gu, Mahlet G Tadesse, Andrea S Foulkes, Yunsheng Ma, Raji Balasubramanian.
Abstract
BACKGROUND: The onset of silent diseases such as type 2 diabetes is often registered through self-report in large prospective cohorts. Self-reported outcomes are cost-effective; however, they are subject to error. Diagnosis of silent events may also occur through the use of imperfect laboratory-based diagnostic tests. In this paper, we describe an approach for variable selection in high dimensional datasets for settings in which the outcome is observed with error.
Keywords: Bayesian variable selection; High dimensional data; Self-reports
Year: 2020 PMID: 32894123 PMCID: PMC7487595 DOI: 10.1186/s12911-020-01223-w
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Probability of ranking among the top five SNPs by posterior probability of inclusion, for SNPs that are associated with the outcome. CIR denotes the cumulative incidence rate in the reference group and RSF denotes Random Survival Forests. The BVS columns report the proposed BVS procedure, including a variant that assumes perfect self-reports. NTFP denotes a study design setting in which all self-reports following the first positive result are discarded, and NMISS denotes the setting in which there are no missed visits.
| CIR | Sensitivity | Specificity | RSF | BVS | BVS (NTFP) | BVS (NMISS) |
|---|---|---|---|---|---|---|
| 0.1 | 1 | 1 | 0.70(±0.01) | 0.87(±0.01) | 0.87(±0.01) | 0.87(±0.01) |
| 0.1 | 1 | 0.9 | 0.41(±0.01) | 0.34(±0.01) | 0.30(±0.01) | 0.81(±0.01) |
| 0.1 | 0.75 | 1 | 0.68(±0.01) | 0.81(±0.01) | 0.84(±0.01) | 0.84(±0.01) |
| 0.1 | 0.61 | 0.995 | 0.63(±0.01) | 0.69(±0.01) | 0.74(±0.01) | 0.75(±0.01) |
| 0.3 | 1 | 1 | 0.80(±0.01) | 0.98(±0.01) | 0.98(±0.01) | 0.98(±0.01) |
| 0.3 | 1 | 0.9 | 0.58(±0.01) | 0.63(±0.01) | 0.68(±0.01) | 0.97(±0.01) |
| 0.3 | 0.75 | 1 | 0.78(±0.01) | 0.90(±0.01) | 0.95(±0.01) | 0.95(±0.01) |
| 0.3 | 0.61 | 0.995 | 0.74(±0.01) | 0.82(±0.01) | 0.88(±0.01) | 0.88(±0.01) |
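
The sensitivity, specificity, and NTFP settings in the table can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: it assumes a fixed grid of visits, treats each self-report as an independent error-prone reading of the true disease status, and uses hypothetical function names (`simulate_self_reports`, `apply_ntfp`).

```python
import random


def simulate_self_reports(event_visit, n_visits, sensitivity, specificity, rng):
    """Return one 0/1 self-report per scheduled visit (illustrative only).

    event_visit: 1-based index of the first visit at which the true event
                 has already occurred, or None if it never occurs.
    sensitivity: P(report positive | event has occurred).
    specificity: P(report negative | event has not occurred).
    """
    reports = []
    for visit in range(1, n_visits + 1):
        diseased = event_visit is not None and visit >= event_visit
        # A positive report is a true positive after onset and a
        # false positive before onset.
        p_positive = sensitivity if diseased else 1.0 - specificity
        reports.append(1 if rng.random() < p_positive else 0)
    return reports


def apply_ntfp(reports):
    """Discard every self-report that follows the first positive one."""
    for i, report in enumerate(reports):
        if report == 1:
            return reports[: i + 1]
    return list(reports)


if __name__ == "__main__":
    rng = random.Random(2020)  # arbitrary seed, chosen for reproducibility
    full = simulate_self_reports(event_visit=4, n_visits=8,
                                 sensitivity=0.75, specificity=0.9, rng=rng)
    print("all visits:", full)
    print("NTFP      :", apply_ntfp(full))
```

Under this sketch, a false positive before onset truncates the NTFP record early, which is one way to see why imperfect specificity can be costly when later self-reports are discarded.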
Rankings of individual SNPs in the WHI Clinical Trial and Observational Study SHARe among African American women in the WHI (n=6704). Results from the following analyses are reported: (1) the proposed BVS approach (BVS); (2) BVS assuming perfect tests (BVS); (3) univariate (SNP by SNP) analysis assuming a Cox PH model; and (4) univariate analysis adjusting for error in self-report (icensmis). Each analysis simultaneously adjusted for the top two principal components to account for population stratification. SNPs are ordered from most (rank = 1) to least important (rank > 1000) with regard to their association with time to incident type 2 diabetes. Ranks > 1000 are denoted by -.
| BVS Rank | BVS (perfect tests) Rank | Cox PH Rank | icensmis Rank | rs Number | Intron | Upstream | Downstream |
|---|---|---|---|---|---|---|---|
| 1 | 2 | 1 | 1 | rs2805434 | RYR2 | ||
| 2 | - | 15 | 10 | rs5946729 | SHOX, CRLF2 | ||
| 3 | - | 28 | 20 | rs10126793 | PDK3 | SUPT20HL1 | |
| 4 | - | 3 | 3 | rs10054129 | RXFP3, ADAMTS12 | ||
| 5 | - | 22 | 16 | rs7523871 | RNU5F-1 | LOC101929689, LYPLAL1 | |
| 6 | - | - | - | rs6795523 | IGSF11 | ||
| 7 | - | - | - | rs149091 | ANKRD55 | LOC102467147 | |
| 8 | - | 81 | 69 | rs10820848 | LOC101927847 | UNQ6494 | |
| 9 | - | - | - | rs10950835 | SP4 | RPL23P8 | |
| 10 | - | - | - | rs2714365 | CHST9 | CDH2 | |
| 34 | - | 9 | 9 | rs17693218 | LYZL1 | C10orf126 | |
| - | - | 2 | 2 | rs2805429 | RYR2 | ||
| - | - | 5 | 5 | rs7737188 | RXFP3, ADAMTS12 | ||
| - | - | 8 | 7 | rs16917265 | INIP, SNX30 | ||
| - | - | 6 | 6 | rs4144636 | ASTN2 | ||
| - | - | 7 | 8 | rs12247963 | FAM188A | ||
| - | - | 4 | 4 | rs15958 | |||
| - | - | 10 | 17 | rs1959083 | LINC00520, RPL13AP3 | ||
| - | 1 | 13 | 19 | rs6573059 | LINC00520, RPL13AP3 |
Rankings of individual SNPs in the WHI Clinical Trial and Observational Study SHARe among Hispanic American women in the WHI (n=3169). Results from the following analyses are reported: (1) the proposed BVS approach (BVS); (2) BVS assuming perfect tests (BVS); (3) univariate (SNP by SNP) analysis assuming a Cox PH model; and (4) univariate analysis adjusting for error in self-report (icensmis). Each analysis simultaneously adjusted for the top two principal components to account for population stratification. SNPs are ordered from most (rank = 1) to least important (rank > 1000) with regard to their association with time to incident type 2 diabetes. Ranks > 1000 are denoted by -.
| BVS Rank | BVS (perfect tests) Rank | Cox PH Rank | icensmis Rank | rs Number | Intron | Upstream | Downstream |
|---|---|---|---|---|---|---|---|
| 1 | - | 4 | 2 | rs6547248 | CTNNA2 | ||
| 2 | - | 22 | 28 | rs488672 | TRIM29 | TRIM29, OAF | |
| 3 | - | 19 | 24 | rs1396128 | LINC00968 | IMPAD1 | |
| 4 | - | - | - | rs2964611 | GLRA1, G3BP1 | ||
| 5 | - | - | - | rs519206 | LYZL1 | C10orf126 | |
| 6 | - | 3 | 7 | rs6079637 | MACROD2 | ||
| 7 | - | - | - | rs4778193 | OCA2 | ||
| 8 | - | - | - | rs1972897 | PARD3 | ||
| 9 | - | - | - | rs10202023 | LOC101927196 | ||
| 10 | - | 9 | 8 | rs6899814 | RNY4 | SH3BGRL2, C6orf7 | |
| 13 | - | 14 | 10 | rs17253815 | GPC6 | ||
| 29 | - | 5 | 1 | rs17175231 | GPC6 | ||
| 86 | - | 1 | 3 | rs6135332 | MACROD2 | ||
| - | - | 6 | 6 | rs276637 | SHISA9 | ||
| - | - | 2 | 4 | rs10242930 | TMEM106B, THSD7A | ||
| - | - | 7 | 5 | rs10809502 | TYRP1 | PTPRD-AS2 | |
| - | - | 10 | 9 | rs1561955 | TYRP1 | PTPRD-AS2 | |
| - | - | 8 | 12 | rs6079638 | MACROD2 | ||
| - | 1 | - | - | rs9610221 | HMGXB4 | ISX |