Yi Li¹, Annat Haber¹, Christoph Preuss², Cai John¹, Asli Uyar¹, Hongtian Stanley Yang², Benjamin A Logsdon³, Vivek Philip², R Krishna Murthy Karuturi¹, Gregory W Carter¹,².
Abstract
INTRODUCTION: Genome-wide association studies (GWAS) for late-onset Alzheimer's disease (AD) may miss genetic variants relevant for delineating disease stages when clinically defined case/control status is used as the phenotype, because that definition is loose and heterogeneous.
Keywords: Alzheimer's disease; convolutional neural networks; deep learning; disease progression; imaging phenotypes; machine learning; magnetic resonance imaging; transfer learning
Year: 2021 PMID: 34027015 PMCID: PMC8120261 DOI: 10.1002/dad2.12140
Source DB: PubMed Journal: Alzheimers Dement (Amst) ISSN: 2352-8729
FIGURE 1 Graphic summary of the analytical approach. AMP‐AD, Accelerating Medicines Partnership‐Alzheimer's Disease; APOE, apolipoprotein E; CNN, convolutional neural network; GWAS, genome‐wide association studies; MRI, magnetic resonance imaging
Demographic assessment and APOE ε4 genotype distribution in ADNI and AIBL data
| Group | ADNI: No. of subjects | ADNI: Age | ADNI: Male/female | ADNI: No. of APOE ε4 copies (0/1/2) | AIBL: No. of subjects | AIBL: Age | AIBL: Male/female | AIBL: No. of APOE ε4 copies (0/1/2) | MMSE |
|---|---|---|---|---|---|---|---|---|---|
| Control | 373 | 74.3 ± 6 | 182/191 | 274/91/9 | 107 | 70.8 ± 7 | 51/56 | 76/30/1 | 29.1 ± 1.1 |
| AD | 251 | 74.8 ± 8 | 134/117 | 84/114/53 | 74 | 73.2 ± 8 | 29/45 | 23/37/14 | 20.3 ± 5.6 |
| sMCI | 424 | 73.1 ± 8 | 255/169 | 246/141/37 | 10 | 77.2 ± 7 | 8/2 | 5/4/1 | 28.0 ± 1.5 |
| pMCI | 230 | 73.9 ± 7 | 134/96 | 77/114/39 | 11 | 74.9 ± 6 | 7/4 | 1/6/4 | 26.3 ± 1.7 |
| P‐value1 | | 0.047 | 0.26 | 5.72 × 10⁻³⁰ | | 0.28 | 0.26 | 4.51 × 10⁻¹⁰ | 2.93 × 10⁻¹⁶ |
| P‐value2 | | 0.72 | 0.64 | 3.21 × 10⁻⁹ | | 0.25 | 0.43 | 0.033 | 0.18 |
Notes: Age is presented as mean ± standard deviation.
For ADNI, sMCI and pMCI status was determined up to 3 years from screening (for CNN training); for AIBL, sMCI and pMCI status was determined up to 6 years from baseline (for CNN evaluation).
P‐value1: P value for the comparison of AD and controls.
P‐value2: P value for the comparison of sMCI and pMCI.
Abbreviations: AD, Alzheimer's disease; ADNI, Alzheimer's Disease Neuroimaging Initiative; AIBL, Australian Imaging, Biomarker & Lifestyle Flagship Study of Ageing; APOE, apolipoprotein E; MMSE, Mini‐Mental State Examination; pMCI, progressive MCI; sMCI, stable MCI.
Confusion matrix for CNN predictions
| Clinical label | Control | Stable MCI | Broad AD | Control | Stable MCI | Broad AD |
|---|---|---|---|---|---|---|
| 3Y_ctrl (373) | 0.895 (334) | 0.0107 (4) | 0.0938 (35) | 0.96 (358) | 0.0268 (10) | 0.0134 (5) |
| AD (482) | 0.0851 (41) | 0.00207 (1) | 0.913 (440) | 0 (0) | 0.0083 (4) | 0.992 (478) |
| 1Y_sMCI (425) | 0.442 (188) | 0.181 (77) | 0.376 (160) | 0.231 (98) | 0.36 (153) | 0.409 (174) |
| 2Y_sMCI (336) | 0.518 (174) | 0.226 (76) | 0.256 (86) | 0.277 (93) | 0.435 (146) | 0.289 (97) |
| 3Y_sMCI (296) | 0.551 (163) | 0.24 (71) | 0.209 (62) | 0.297 (88) | 0.476 (141) | 0.226 (67) |
| 4Y_sMCI (278) | 0.558 (155) | 0.241 (67) | 0.201 (56) | 0.317 (88) | 0.478 (133) | 0.205 (57) |
| 5Y_sMCI (271) | 0.561 (152) | 0.232 (63) | 0.207 (56) | 0.325 (88) | 0.48 (130) | 0.196 (53) |
| final_sMCI (255) | 0.557 (142) | 0.243 (62) | 0.2 (51) | 0.318 (81) | 0.49 (125) | 0.192 (49) |
| 1Y_pMCI (101) | 0.178 (18) | 0 (0) | 0.822 (83) | 0.0198 (2) | 0.0693 (7) | 0.911 (92) |
| 2Y_pMCI (190) | 0.195 (37) | 0.00526 (1) | 0.8 (152) | 0.0368 (7) | 0.0737 (14) | 0.889 (169) |
| 3Y_pMCI (230) | 0.209 (48) | 0.0087 (2) | 0.783 (180) | 0.0522 (12) | 0.0826 (19) | 0.865 (199) |
| 4Y_pMCI (248) | 0.226 (56) | 0.0403 (10) | 0.734 (182) | 0.0484 (12) | 0.109 (27) | 0.843 (209) |
| 5Y_pMCI (255) | 0.231 (59) | 0.0549 (14) | 0.714 (182) | 0.0471 (12) | 0.118 (30) | 0.835 (213) |
| final_pMCI (271) | 0.255 (69) | 0.0554 (15) | 0.69 (187) | 0.0701 (19) | 0.129 (35) | 0.801 (217) |
Notes: The number of samples is given in parentheses. Each fraction is the number of CNN predictions in the column category divided by the number of samples in the row (clinical) category. Predictions of pMCI or AD as broad AD, and of control or sMCI as not broad AD, can be viewed as correct in a broad sense.
Abbreviations: AD, Alzheimer's disease; ADNI, Alzheimer's Disease Neuroimaging Initiative; APOE, apolipoprotein E; CNN, convolutional neural network; pMCI, progressive MCI; sMCI, stable MCI.
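The row-normalized fractions described in the notes can be reproduced from the counts in parentheses. The following is a minimal sketch, not the authors' code: it uses the counts from the `3Y_ctrl` and `AD` rows (first group of three prediction columns), and the helper names `row_fractions` and `broad_sense_accuracy` are hypothetical names chosen for illustration.

```python
# Counts copied from two rows of the table (first group of prediction columns).
# Column order: Control, Stable MCI, Broad AD.
counts = {
    "3Y_ctrl": [334, 4, 35],
    "AD": [41, 1, 440],
}


def row_fractions(row):
    """Divide each count by the row total (samples with that clinical label)."""
    total = sum(row)
    return [c / total for c in row]


def broad_sense_accuracy(label, row):
    """Fraction of a row that counts as correct in the broad sense.

    Assumption for this sketch: rows labeled "AD" or ending in "pMCI" are
    correct when predicted as Broad AD; every other row is correct when
    predicted as Control or Stable MCI.
    """
    total = sum(row)
    broad_ad = row[2]
    if label == "AD" or label.endswith("pMCI"):
        return broad_ad / total
    return (total - broad_ad) / total


for label, row in counts.items():
    fractions = [round(f, 3) for f in row_fractions(row)]
    print(label, fractions, round(broad_sense_accuracy(label, row), 3))
```

For the `3Y_ctrl` row, 334 of 373 samples gives 0.895, which agrees with the tabulated fraction; the broad-sense figure for that row adds the Control and Stable MCI predictions together.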
FIGURE 2 Average AUC (area under the receiver operating characteristic curve) of predicting stable and progressive mild cognitive impairment (MCI) among 10 sample splits for six follow‐up periods, comparing registered and non‐registered images. A, Image CNN model on test samples (with error bar). B, Augmented CNN model on test samples (with error bar). C, Comparison of average AUC between training and test samples. CNN, convolutional neural network; pMCI, progressive mild cognitive impairment; sMCI, stable mild cognitive impairment
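As a rough illustration of how an AUC averaged over several sample splits might be computed, the sketch below uses a pairwise (rank-based) AUC and summarizes it with a mean and standard deviation. Everything in it is an assumption for illustration: the toy labels and scores are invented (1 standing for pMCI, 0 for sMCI), only three splits are shown instead of ten, and the source does not state whether the error bars are standard deviations or standard errors.

```python
from statistics import mean, stdev


def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy splits: (labels, predicted scores) for each test split.
splits = [
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.6]),
    ([1, 0, 1, 0], [0.7, 0.5, 0.4, 0.1]),
]

per_split = [auc(labels, scores) for labels, scores in splits]
mean_auc = mean(per_split)
spread = stdev(per_split)  # sample standard deviation across splits
print(per_split, round(mean_auc, 3), round(spread, 3))
```

The pairwise definition is quadratic in the number of samples, which is fine for a sketch; a rank-based or library implementation would be the usual choice for larger test sets.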
Genome‐wide association results based on the two sets of CNN‐derived phenotypes
| Lead SNP | Image phenotype | Chr | Position (hg19) | A1 | A2 | AF | GMMAT score (A1) | GMMAT standard error | P value | SNP type | eQTL genes in AMP‐AD | Nearby genes (±15 KB) | AMP‐AD logFC | Other associated phenotypes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Image CNN | | | | | | | | | | | | | | |
| rs11558606 | PC9 | 1 | 230814668 | A | G | 0.072 | 5.243 | 1.153 | 5.41 × 10⁻⁶ | missense | | | 0.096 | edu |
| rs6672949 | PC4 | 1 | 37985911 | C | T | 0.361 | -12.860 | 2.673 | 1.51 × 10⁻⁶ | intergenic | | | -0.222 | NA |
| rs34707417 | PC9 | 2 | 71708810 | T | G | 0.236 | 8.605 | 1.906 | 6.38 × 10⁻⁶ | intronic | | | NA | NA |
| rs12361440 | PC9 | 11 | 74396631 | G | A | 0.151 | -12.19 | 2.543 | 1.63 × 10⁻⁶ | intergenic | | | NA | NA |
| rs11062078 | PC4 | 12 | 2135487 | C | T | 0.26 | -11.17 | 2.397 | 3.14 × 10⁻⁶ | intergenic | | | -0.089 | edu, high |
| rs35047 | PC9 | 12 | 31163186 | G | T | 0.136 | 11.516 | 2.531 | 5.36 × 10⁻⁶ | intergenic | | | 0.466 | NA |
| rs12588868 | PC9 | 14 | 92909309 | T | C | 0.499 | 16.114 | 3.631 | 9.07 × 10⁻⁶ | intronic | | | NA | edu |
| rs429498 | PC4 | 19 | 56455746 | G | A | 0.441 | -12.830 | 2.664 | 1.46 × 10⁻⁶ | intergenic | | | NA | NA |
| rs8115712 | PC4 | 20 | 4746960 | G | A | 0.328 | 11.438 | 2.535 | 6.43 × 10⁻⁶ | intergenic | | | NA | NA |
| rs8116731 | PC9 | 20 | 16718784 | A | G | 0.114 | -11.820 | 2.271 | 1.96 × 10⁻⁷ | intronic | | | -0.132 | NA |
| rs117100735 | PC4 | 20 | 18041336 | G | T | 0.063 | 4.461 | 1.005 | 8.97 × 10⁻⁶ | intergenic | | | NA | NA |
| rs35278766 | PC4 | 21 | 22815262 | C | T | 0.07 | 6.290 | 1.374 | 4.68 × 10⁻⁶ | intronic | | | 0.299 | edu, high, cog |
| Augmented CNN | | | | | | | | | | | | | | |
| rs112175941 | PC2 | 1 | 150558293 | T | A | 0.159 | 6.594 | 1.400 | 2.48 × 10⁻⁶ | intergenic | | | 0.407 | edu, math, high |
| rs6698178 | PC2 | 1 | 37964765 | A | T | 0.351 | 8.954 | 1.901 | 2.47 × 10⁻⁶ | intronic | | | -0.222 | NA |
| rs144916872 | PC2 | 2 | 25125902 | G | T | 0.1 | 5.773 | 1.171 | 8.28 × 10⁻⁷ | intronic | | | 0.306 | NA |
| rs8654 | PC2 | 5 | 96498783 | A | G | 0.407 | -8.740 | 1.935 | 6.30 × 10⁻⁶ | synonymous | | | NA | NA |
| rs7821522 | PC2 | 8 | 51650500 | A | C | 0.206 | 7.136 | 1.517 | 2.56 × 10⁻⁶ | intronic | | | -0.352 | NA |
| rs1426205 | PC2 | 15 | 49117232 | C | A | 0.137 | 6.305 | 1.333 | 2.24 × 10⁻⁶ | UTR | | | -0.327 | NA |
| rs67805160 | PC2 | 16 | 83734755 | G | T | 0.167 | 6.730 | 1.464 | 4.26 × 10⁻⁶ | intronic | | | -0.533 | edu, math, high, cog |
| rs4984939 | PC2 | 16 | 893181 | G | A | 0.445 | -5.378 | 1.213 | 9.33 × 10⁻⁶ | intergenic | | | -0.347 | edu, math, high, cog |
AF, allele frequency.
logFC, log fold change of transcript abundance for AD cases versus controls in AMP‐AD RNA‐Seq data.
cog, cognitive test performance; edu, educational attainment; high, highest math achievement; math, self‐reported math ability; PC, principal component.
These two SNPs are in high LD and have P‐values < 1 × 10–5 in GWAS from the two CNN models.
Gene with differential expression in AMP‐AD RNA‐Seq data.
Abbreviations: AMP‐AD, Accelerating Medicines Partnership‐Alzheimer's Disease; CNN, convolutional neural network; SNP, single nucleotide polymorphism.
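The P values in the table appear consistent with a two-sided normal test on the ratio of the GMMAT score to its standard error. This is an inference from the tabulated numbers, not a description of how GMMAT computes its statistics, so the sketch below should be read as a sanity check under that assumption. The function name `two_sided_p` is hypothetical.

```python
import math


def two_sided_p(score, standard_error):
    """Two-sided P value for z = score / standard_error under a standard normal.

    Assumption: the tabulated score and standard error can be combined into a
    z statistic in this way; the source does not state this explicitly.
    """
    z = score / standard_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Score and standard error copied from the table for two lead SNPs.
p_rs11558606 = two_sided_p(5.243, 1.153)    # tabulated P value: 5.41e-6
p_rs8116731 = two_sided_p(-11.820, 2.271)   # tabulated P value: 1.96e-7

print(f"{p_rs11558606:.2e}", f"{p_rs8116731:.2e}")
```

For the two rows used here, the computed values land within a few percent of the tabulated P values; the small differences are expected because the score and standard error are rounded in the table.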
FIGURE 3 Manhattan plots for (A) principal components (PCs) 1, 4, and 9 of the Image CNN‐derived image features, and (B) PC 2 of the Augmented CNN‐derived image features. Gene names in red text are for imageCNN.PC1, those in blue text are for imageCNN.PC4, imageCNN.PC9, and augmentedCNN.PC2