B A Jonsson, G Bjornsdottir, T E Thorgeirsson, L M Ellingsen, G Bragi Walters, D F Gudbjartsson, H Stefansson, K Stefansson, M O Ulfarsson.
Abstract
Machine learning algorithms can be trained to estimate age from brain structural MRI. The difference between an individual's predicted and chronological age, predicted age difference (PAD), is a phenotype of relevance to aging and brain disease. Here, we present a new deep learning approach to predict brain age from a T1-weighted MRI. The method was trained on a dataset of healthy Icelanders and tested on two datasets, IXI and UK Biobank, utilizing transfer learning to improve accuracy on new sites. A genome-wide association study (GWAS) of PAD in the UK Biobank data (discovery set: [Formula: see text], replication set: [Formula: see text]) yielded two sequence variants, rs1452628-T ([Formula: see text], [Formula: see text]) and rs2435204-G ([Formula: see text], [Formula: see text]). The former is near KCNK2 and correlates with reduced sulcal width, whereas the latter correlates with reduced white matter surface area and tags a well-known inversion at 17q21.31 (H2).
Year: 2019 PMID: 31776335 PMCID: PMC6881321 DOI: 10.1038/s41467-019-13163-9
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
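The abstract defines PAD as the difference between predicted and chronological age, and the tables that follow report mean absolute error (MAE). A minimal sketch of both quantities is given here, assuming the sign convention predicted minus chronological (the record does not state the sign explicitly) and using made-up ages rather than study data; the function names are illustrative only.

```python
from statistics import mean


def predicted_age_difference(predicted, chronological):
    """Per-subject PAD in years, assuming PAD = predicted - chronological."""
    return [p - c for p, c in zip(predicted, chronological)]


def mean_absolute_error(predicted, chronological):
    """Average absolute gap between predicted and chronological age."""
    return mean(abs(d) for d in predicted_age_difference(predicted, chronological))


# Hypothetical ages for three subjects (not taken from the study).
chronological = [50.0, 62.0, 71.0]
predicted = [53.0, 60.0, 71.0]

pad = predicted_age_difference(predicted, chronological)
mae = mean_absolute_error(predicted, chronological)
```

Under this convention a positive PAD means the brain looks older than the subject's chronological age, and a negative PAD means it looks younger.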
Fig. 1 Illustration of the proposed method and input data. (a) A flowchart showing a high-level overview of the proposed brain age prediction system. (b) Examples of image types generated by the preprocessing step. From left to right: a registered T1-weighted slice, a Jacobian map slice, a GM segmented slice, and a WM segmented slice.
Chronological age prediction accuracy for the considered methods.
| | Type | Method | Val MAE | Val | Test MAE | Test | No. I |
|---|---|---|---|---|---|---|---|
| (A) | T1-weighted | CNN | | | | | 1815 |
| | Jacobian | CNN | 4.801 | 0.710 | 4.804 | 0.758 | 1815 |
| | Gray matter | CNN | 4.766 | 0.721 | 4.641 | 0.776 | 1815 |
| | White matter | CNN | 4.676 | 0.735 | 4.189 | 0.812 | 1815 |
| (B) | MV (T1 and JM) | CNN | 4.102 | 0.803 | 3.919 | 0.841 | 1815 |
| | MV (GM and WM) | CNN | 4.172 | 0.790 | 3.674 | 0.849 | 1815 |
| | MV (T1, JM, and GM) | CNN | 3.964 | 0.813 | 3.838 | 0.847 | 1815 |
| | MV (T1, JM, GM, and WM) | CNN | 3.845 | | 3.584 | 0.849 | 1815 |
| | LRB (T1, JM, GM, and WM) | CNN | | 0.847 | | | 1815 |
| (C) | SBM | RR | 5.268 | 0.689 | 5.176 | 0.697 | 1320 |
| | VBM | GPR | 4.278 | 0.781 | 4.317 | | 1794 |
| | SM | RR | 4.898 | 0.722 | 4.937 | 0.728 | 1815 |
| | MV (SBM, VBM, and SM) | GPR/RR | 4.008 | 0.808 | 3.940 | 0.761 | 1246 |
| | LRB (SBM, VBM, and SM) | GPR/RR | | | | | 1246 |
(A) The best results are shown in bold. (B) The training/validation/test split is the same as for (A).
(C) Performance was estimated using 10-fold cross validation. The training/test split was 1056/264 for the SBM features, 1438/356 for the VBM features, and 1469/346 for the SM features.
(A) The performance of the CNNs that were trained using T1-weighted images, Jacobian maps, GM and WM segmented images. Training set (), validation set (), and test set (). (B) The performance when combining CNN predictions. (C) The results of the best methods trained on SBM, VBM and similarity matrix features
CV cross validation, GM gray matter, I images, JM Jacobian map, LRB linear regression blender, MV majority voting, MAE mean absolute error, RR ridge regression, SM similarity matrix, SBM surface-based morphometry, val validation, VBM voxel-based morphometry, WM white matter
UK Biobank and IXI prediction performance with and without transfer learning.
| TL used | IXI val MAE | IXI val | IXI val set size | UK Biobank test MAE | UK Biobank test | UK Biobank test set size |
|---|---|---|---|---|---|---|
| No | 6.420 | 0.778 | 104 | 8.494 | −0.630 | 12395 |
| Yes | | | 104 | | | 12395 |
The best results are shown in bold
S subjects, TL transfer learning, val validation
Pearson’s r correlation between PAD and performance on neuropsychological tests.
| Neuropsychological test | PAD correlation | 95% CI | P value | No. subjects |
|---|---|---|---|---|
| DSST | −0.080 | (−0.104, −0.054) | 4.3e−11 | 6849 |
| TMT B | 0.076 | (0.051, 0.103) | 3.1e−09 | 6076 |
| TMT A | 0.053 | (0.027, 0.078) | 3.8e−05 | 6076 |
| TMT B - A | 0.050 | (0.024, 0.075) | 1.3e−04 | 5918 |
| Reaction time | 0.030 | (0.012, 0.047) | 7.9e−04 | 12387 |
Negative DSST, positive TMT, and positive reaction time indicate worse performance
CI confidence interval, DSST digit substitution test, TMT trail making test
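The record does not say how the 95% confidence intervals in this table were obtained. One common textbook choice for a Pearson correlation is the Fisher z-transform, sketched below as an assumption rather than as the authors' procedure; `pearson_ci` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def pearson_ci(r, n, level=0.95):
    """Approximate CI for Pearson's r via the Fisher z-transform.

    Assumes bivariate normality and n > 3; this is a standard
    approximation, not necessarily the method used in the source.
    """
    z = math.atanh(r)                      # Fisher z of the correlation
    se = 1.0 / math.sqrt(n - 3)            # standard error on the z scale
    q = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - q * se), math.tanh(z + q * se)


# DSST row of the table: r = -0.080 with 6849 subjects.
low, high = pearson_ci(-0.080, 6849)
```

For the DSST row this gives an interval of roughly (−0.103, −0.056), close to but not identical to the tabulated (−0.104, −0.054), which suggests the published intervals may come from a different procedure.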
Fig. 2 Manhattan plot of the GWAS results for the UK Biobank data. The horizontal line denotes the P value threshold for genome-wide significant effect.
Sequence variants associated with PAD estimated using BOLT-LMM.
| | rs Number | Position (GRCh38) | Allele (min/maj) | MAF (%) | Effect | 95% CI | P value |
|---|---|---|---|---|---|---|---|
| (A) | rs2435204 | chr17:45910839 | G/A | 26.6 | 0.11 | (0.08, 0.14) | 1.4e−12 |
| | rs1452628 | chr1:214966544 | T/A | 36.2 | −0.08 | (−0.10, −0.05) | 2.3e−09 |
| (B) | rs2790099 | chr6:45475612 | C/T | 36.0 | −0.06 | (−0.09, −0.03) | 8.9e−06 |
| | rs6437412 | chr3:194747684 | C/T | 28.2 | −0.06 | (−0.09, −0.04) | 6.8e−06 |
| | rs2184968 | chr6:126439848 | C/T | 46.0 | 0.05 | (0.03, 0.08) | 7.5e−05 |
| (C) | rs2435204 | chr17:45910839 | G/A | 26.6 | 0.08 | (0.03, 0.13) | 1.5e−03 |
| | rs1452628 | chr1:214966544 | T/A | 36.2 | −0.07 | (−0.12, −0.03) | 8.8e−04 |
| (D) | rs2790099 | chr6:45475612 | C/T | 36.0 | −0.07 | (−0.11, −0.02) | 2.9e−03 |
| | rs6437412 | chr3:194747684 | C/T | 28.2 | −0.05 | (−0.09, 0.00) | 4.9e−02 |
| | rs2184968 | chr6:126439848 | C/T | 46.0 | 0.06 | (0.02, 0.10) | 2.9e−03 |
(A, B) Association between sequence variants and PAD for 12378 subjects in the discovery set. (A) Genome-wide significant sequence variants. (B) Sequence variants associated with structural MRI brain phenotypes that also associate with PAD. (C, D) Association between sequence variants and PAD for 4456 subjects from the replication set. (C) Genome-wide significant sequence variants. (D) Sequence variants associated with structural MRI brain phenotypes that also associate with PAD. Note that the reported effect sizes are for PAD normalized to unit variance. Before normalization, the standard deviation of PAD was ~4 years. Thus the protective allele of rs1452628 (effect −0.08) is associated with a change in PAD of approximately −0.08 × 4 = −0.32 years.
CI confidence interval, MAF minor allele frequency
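The table note converts a normalized effect size into years by multiplying by the pre-normalization standard deviation of PAD (~4 years). A minimal sketch of that arithmetic for the two genome-wide significant variants in the discovery set follows; the constant and function names are illustrative, and the 4-year value is the approximate figure quoted in the note.

```python
# Approximate pre-normalization standard deviation of PAD, from the table note.
PAD_SD_YEARS = 4.0


def effect_in_years(effect_sd_units, sd_years=PAD_SD_YEARS):
    """Convert an effect on unit-variance PAD into an effect in years."""
    return effect_sd_units * sd_years


# Discovery-set effects from section (A) of the table.
discovery_effects = {"rs2435204-G": 0.11, "rs1452628-T": -0.08}
effects_years = {
    variant: effect_in_years(beta) for variant, beta in discovery_effects.items()
}
```

This reproduces the roughly −0.32 years quoted for rs1452628-T and, by the same arithmetic, gives about +0.44 years for rs2435204-G; both figures inherit the approximation in the 4-year standard deviation.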
Fig. 3 A flowchart showing the components of the proposed CNN architecture. Residual (Res), fully connected (FC).
Fig. 4 A flowchart showing the components of the proposed residual block. Batch re-normalization (BRN), convolutional layer (Conv).
Fig. 5 A flowchart showing the components of the proposed fully connected block. Fully connected layer one (FC1), concatenation layer (Concat), fully connected layer two (FC2).