Mark Pinese, Paul Lacaze, Emma M Rath, Andrew Stone, Marie-Jo Brion, Adam Ameur, Sini Nagpal, Clare Puttick, Shane Husson, Dmitry Degrave, Tina Navin Cristina, Vivian F S Kahl, Aaron L Statham, Robyn L Woods, John J McNeil, Moeen Riaz, Margo Barr, Mark R Nelson, Christopher M Reid, Anne M Murray, Raj C Shah, Rory Wolfe, Joshua R Atkins, Chantel Fitzsimmons, Heath M Cairns, Melissa J Green, Vaughan J Carr, Mark J Cowley, Hilda A Pickett, Paul A James, Joseph E Powell, Warren Kaplan, Greg Gibson, Ulf Gyllensten, Murray J Cairns, Martin McNamara, Marcel E Dinger, David M Thomas.
Abstract
Population health research is increasingly focused on the genetic determinants of healthy ageing, but there is no public resource of whole genome sequences and phenotype data from healthy elderly individuals. Here we describe the first release of the Medical Genome Reference Bank (MGRB), comprising whole genome sequence and phenotype of 2570 elderly Australians depleted for cancer, cardiovascular disease, and dementia. We analyse the MGRB for single-nucleotide, indel and structural variation in the nuclear and mitochondrial genomes. MGRB individuals have fewer disease-associated common and rare germline variants, relative to both cancer cases and the gnomAD and UK Biobank cohorts, consistent with risk depletion. Age-related somatic changes are correlated with grip strength in men, suggesting blood-derived whole genomes may also provide a biologic measure of age-related functional deterioration. The MGRB provides a broadly applicable reference cohort for clinical genetics and genomic association studies, and for understanding the genetics of healthy ageing.
Year: 2020 PMID: 31974348 PMCID: PMC6978518 DOI: 10.1038/s41467-019-14079-0
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Summary metrics for the first release of the MGRB well elderly cohort.
| Measure | ASPREE | 45 and Up |
|---|---|---|
| Individuals (percent female) | 1853 (48.2%) | 717 (59.3%) |
| Age at blood draw (years) | 79 (75–95) | 70 (64–91) |
| Height (m) | 1.65 (1.33–1.91) | 1.66 (1.37–1.91) |
| Mass (kg) | 74.5 (33.4–127.1) | 72.0 (36.0–147.0) |
| Mean sequencing depth (genome-wide) | 38.0 (26.8–46.0) | 39.0 (27.3–45.5) |
| Genetic background | | |
| Non-Finnish European | 1805 | 695 |
| South and Central American | 23 | 5 |
| South Asian | 14 | 6 |
| Finnish European | 10 | 7 |
| East Asian | 1 | 4 |
Samples were sourced from the ASPREE or 45 and Up studies. Aggregate statistics are medians, with ranges in parentheses. Genetic background (ancestry) was determined from genotype data. Although blood was occasionally drawn at younger than 70 years, all individuals lived to at least 70 years without known cancer, cardiovascular disease, or dementia.
Counts of clinically significant small variation in the MGRB for all genes in the ACMG SF 2.0 set.
| Condition | Gene | Carriers |
|---|---|---|
| Cancer | | 4 (2 female) |
| | | 1 |
| | | 1 |
| | | 3 |
| Neurofibromatosis | | 1 |
| ARVC | | 1 |
| | | 3 |
| CPVT | | 1 |
| HCM, DCM | | 2 |
| | | 1 |
| | | 1 |
| Hypercholesterolaemia | | 5 |
| Long QT, VA | | 1 |
| | | 1 |
| Marfan syndrome | | 1 |
| Malignant hyperthermia | | 1 |
| Total | | 28 |
ARVC arrhythmogenic right ventricular cardiomyopathy, CPVT catecholaminergic polymorphic ventricular tachycardia, HCM hypertrophic cardiomyopathy, DCM dilated cardiomyopathy, VA ventricular arrhythmia
Fig. 1 The MGRB is depleted for genomic risk relative to reference and disease cohorts.
a The rate of rare pathogenic variants in tumour suppressor genes is lower in MGRB than in a cohort of cancer cases (log odds for an individual to carry a pathogenic tumour suppressor variant shown). b The MGRB also has lower polygenic score (PS) estimates for a range of phenotypes, when compared to the gnomAD non-Finnish European population and the UK Biobank samples. MGRB is the reference in b, with PS mean set at zero; bootstrap 95% confidence intervals are shown for the difference in PS between MGRB and the reference cohorts (UKBB or gnomAD); higher values indicate a higher polygenic score in UKBB or gnomAD. q-Values represent false discovery rate estimates by the Benjamini–Hochberg method[70]. c The MGRB has lower PS compared to prostate cancer cases, here considering only samples from the 45 and Up Study. d For any given sample size, the MGRB has greater statistical power to detect PS difference from a case cohort than UKBB, demonstrated here for prostate cancer. AU arbitrary units.
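The q-values in panel b are described as Benjamini–Hochberg false discovery rate estimates. As a minimal sketch of that standard adjustment (not the authors' code; the function name `benjamini_hochberg` and the example p-values are illustrative assumptions), the procedure can be written as:

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values).

    Illustrative sketch only; input order is preserved in the output.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


# Hypothetical p-values, one per phenotype tested.
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Each q-value is the smallest false discovery rate at which the corresponding test would be called significant, so a q-value is never smaller than its p-value.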
Fig. 2 Polygenic risk is strongly related to cancer diagnosis.
Cumulative distribution functions (top panels) and associated probability of cancer diagnosis by age 70 (bottom panels) are shown for both prostate cancer (a) and colorectal cancer (b). Unaffected individuals are MGRB men (prostate), or all MGRB individuals (colorectal) and were completely cancer-free up to age 70; affected individuals were sourced from the 45 and Up Study cancer cohort and had recorded evidence of the relevant cancer diagnosis prior to age 70. Polygenic scores were computed based on reported loci and model coefficients[54,57]. Fits are from logistic regression using a GCV-penalised thin plate spline smooth; bands denote 95% confidence intervals around the mean.
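The caption states that polygenic scores were computed from reported loci and model coefficients. A common form of such a score is a weighted sum of risk-allele dosages; the sketch below assumes that form, and the locus names, coefficients, and the choice to skip missing genotypes are all hypothetical rather than taken from the paper.

```python
def polygenic_score(dosages, coefficients):
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per locus).

    Assumption: loci missing from `dosages` contribute nothing to the score.
    """
    score = 0.0
    for locus, beta in coefficients.items():
        if locus in dosages:
            score += beta * dosages[locus]
    return score


# Hypothetical per-locus effect sizes (e.g. log odds ratios).
coefficients = {"locus_a": 0.2, "locus_b": -0.1, "locus_c": 0.05}
# Hypothetical genotype of one individual; locus_c was not genotyped.
individual = {"locus_a": 2, "locus_b": 1}
score = polygenic_score(individual, coefficients)
```

Scores computed this way are on an arbitrary scale, which is why comparisons between cohorts are made on differences or distributions rather than on raw values.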
Fig. 3 Age-related somatic changes are associated with measures of physical function.
Across multiple cohorts, a consistent decrease with age is observed for telomere length (a), mitochondria per nucleus (b), and Y copy number in males (c). In contrast, advanced age is associated with an increase in somatic mutation burden (d, e) and the fraction of samples with detectable clonal haematopoiesis (f), as well as a decrease in the key physical function measures gait speed (g) and grip strength (h). The count of mitochondria per nucleus is significantly related to grip strength beyond age alone in men, as indicated by the change in effective age as judged by grip strength with varying mitochondria count (i). For a–c, g, h individual measurements corrected for cohort batch effect are shown with LOESS smooths, and for d a logistic fit was used. Bands around estimates delimit 99% confidence intervals for the mean. Sample numbers were 1853 for the ASPREE cohort, 717 for the 45 and Up Study, and 344 for the ASRB cohort.
The rates of somatic measure change with age are different between middle-aged and old individuals.
| Measure | ASRB | 45 and Up Study | ASPREE |
|---|---|---|---|
| Individuals | 344 | 717 | 1853 |
| Percent female | ND | 59.3% | 48.2% |
| Median age (range) | 40 (18–65) | 70 (64–91) | 79 (75–95) |
| Telomere length (AU/decade) | 0.040 [−0.010, 0.090] | ||
| Mitochondria count (log10 mt/nucleus/decade) | −0.004 [−0.018, 0.010] | ||
| Y copy number in males (Y chromosomes/nucleus/decade) | −0.011 [−0.022, 0.001] | ||
| Somatic variant burden (log10 variants/Mb/decade) | 0.038 [−0.002, 0.079] | ||
| Mitochondrial variants (mt variants/decade) | 0.051 [−0.177, 0.278] |
Numbers show the rate of change of each somatic measure with age in the middle-aged ASRB cohort (median age 40), and the older MGRB cohorts (median age 70 or older). Changes are significantly different between the younger ASRB and older MGRB cohorts, and consistent within the two older MGRB cohorts. Linear model slopes as change per decade are reported for each of five somatic measures in each cohort, with 95% Wald confidence intervals. Values significantly different from zero are represented in bold. Note that somatic burden and mitochondrial count per nucleus are reported on the natural logarithm scale. ND, not determined due to data use agreement constraints
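The table reports linear model slopes per decade with 95% Wald confidence intervals. The sketch below shows one way such an estimate could be obtained for a single measure, assuming a simple regression of the measure on age in years with no further covariates and a normal-approximation interval; the function name and the example data are hypothetical, and the authors' actual model may differ.

```python
import math


def slope_per_decade(ages, values, z=1.96):
    """Ordinary least squares slope of `values` on `ages` (years),
    rescaled to change per decade, with a Wald interval (estimate +/- z * SE).
    """
    n = len(ages)
    if n < 3:
        raise ValueError("need at least three observations")
    mean_x = sum(ages) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and the standard error of the slope.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(ages, values))
    se = math.sqrt(sse / (n - 2) / sxx)
    # Multiply by 10 to convert per-year change to per-decade change.
    return 10 * slope, 10 * (slope - z * se), 10 * (slope + z * se)


# Hypothetical measurements at four ages.
estimate, lower, upper = slope_per_decade([60, 70, 80, 90], [1.0, 0.95, 0.8, 0.75])
```

An interval that excludes zero corresponds to the values the table marks as significantly different from zero.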
Quantity in megabases of genomic regions in each locus confidence tier.
| Locus confidence tier | Reference genome | Canonical chromosomes | CCDS |
|---|---|---|---|
| 1—highest | 1212 | 1212 | 25.40 |
| 2 | 52 | 52 | 1.19 |
| 3—lowest | 1874 | 1832 | 5.88 |
| Total | 3137 | 3096 | 32.47 |
Canonical chromosomes are 1–22, X, and Y; CCDS represents the Consensus CDS set
Quality metric conditions for samples to pass quality control.
| Metric | Initial QC (Infinium SNPs) | Final QC (full genotypes) |
|---|---|---|
| Call rate | >0.98 | >0.98 |
| Depth standard deviation | <10 | <10 |
| VAF standard deviation at loci called heterozygous | <1 | <1 |
| Heterozygous:homozygous variant ratio | <2 | <2 |
| X chromosome inbreeding coefficient | <0.2 or >0.8 | Not tested |
| Singleton rate | <0.001 | Not tested |
Two rounds of quality control (QC) were performed, with different metric cutoffs: a first round based on genotypes at Illumina Infinium QC Array 24 SNPs only, and a second round based on genotypes called across the whole genome. Only samples passing all cutoffs in both rounds were included in the MGRB Phase 2 release.
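The two-round filter in the table above can be sketched as a pair of predicates. The thresholds are copied from the table; the `SampleMetrics` container, its field names, and the function names are hypothetical, and how each metric is computed is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SampleMetrics:
    call_rate: float
    depth_sd: float
    het_vaf_sd: float        # VAF standard deviation at heterozygous loci
    het_hom_ratio: float     # heterozygous:homozygous variant ratio
    x_inbreeding: Optional[float] = None    # only tested in the initial round
    singleton_rate: Optional[float] = None  # only tested in the initial round


def passes_shared_cutoffs(m: SampleMetrics) -> bool:
    """Cutoffs that are identical in both QC rounds."""
    return (m.call_rate > 0.98 and m.depth_sd < 10
            and m.het_vaf_sd < 1 and m.het_hom_ratio < 2)


def passes_initial_qc(m: SampleMetrics) -> bool:
    """Round 1: shared cutoffs plus the two Infinium-only checks."""
    if m.x_inbreeding is None or m.singleton_rate is None:
        return False
    x_ok = m.x_inbreeding < 0.2 or m.x_inbreeding > 0.8
    return passes_shared_cutoffs(m) and x_ok and m.singleton_rate < 0.001


def passes_final_qc(m: SampleMetrics) -> bool:
    """Round 2: shared cutoffs only, on whole genome genotypes."""
    return passes_shared_cutoffs(m)


def include_sample(initial: SampleMetrics, final: SampleMetrics) -> bool:
    """A sample is kept only if it passes both rounds."""
    return passes_initial_qc(initial) and passes_final_qc(final)
```

Keeping the shared cutoffs in one function means the two rounds cannot drift apart if a threshold is later revised.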