
Assembly of 809 whole mitochondrial genomes with clinical, imaging, and fluid biomarker phenotyping.

Perry G Ridge, Mark E Wadsworth, Justin B Miller, Andrew J Saykin, Robert C Green, John S K Kauwe.

Abstract

INTRODUCTION: Mitochondrial genetics are an important but largely neglected area of research in Alzheimer's disease. A major impediment is the lack of data sets.
METHODS: We used an innovative, rigorous approach, combining several existing tools with our own, to accurately assemble and call variants in 809 whole mitochondrial genomes.
RESULTS: To help address this impediment, we prepared a data set that consists of 809 complete and annotated mitochondrial genomes with samples from the Alzheimer's Disease Neuroimaging Initiative. These whole mitochondrial genomes include rich phenotyping, such as clinical, fluid biomarker, and imaging data, all of which is available through the Alzheimer's Disease Neuroimaging Initiative website. Genomes are cleaned, annotated, and prepared for analysis.
DISCUSSION: These data provide an important resource for investigating the impact of mitochondrial genetic variation on risk for Alzheimer's disease and other phenotypes that have been measured in the Alzheimer's Disease Neuroimaging Initiative samples.
Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

Keywords:  ADNI; Alzheimer's disease; Mitochondrial genetics; Next-generation sequencing; Whole mitochondrial genomes

Year:  2018        PMID: 29306584      PMCID: PMC5961720          DOI: 10.1016/j.jalz.2017.11.013

Source DB:  PubMed          Journal:  Alzheimers Dement        ISSN: 1552-5260            Impact factor:   21.566


1. Introduction

Alzheimer’s disease (AD), the most common form of dementia, affects >20 million people worldwide and is the only one of the top 10 causes of death that has no effective treatments [1-3]. Full-time care is required as AD progresses, further impacting patients and their loved ones and stressing the health-care system. With incidence expected to increase to 1 in 85 people by 2050 [2], it is essential to achieve early diagnosis, effective treatments, and a better understanding of the underlying etiology. Understanding the underlying mechanisms of risk for AD is key to both diagnosis and treatment.

Swerdlow et al. [4] proposed the Mitochondrial Cascade Hypothesis of AD. Briefly, an individual’s genetics determine baseline mitochondrial function and how mitochondria change as a person ages, and declining mitochondrial function causes AD-specific pathologies. In addition to the evidence provided by Swerdlow et al. [4], several lines of evidence support a role for mitochondria in AD. First, mitochondria fundamentally change in a number of ways in AD and contribute to its progression and onset [5]: metabolism decreases [6], mitochondrial fusion/fission are disrupted [7], mitochondrial concentration (i.e. the ratio of mitochondrial genomes to nuclear genomes) decreases in cerebrospinal fluid [8,9], mitochondrial morphology changes [4,10], mitochondrial-encoded enzymes in the electron transport chain are altered [5,11], amyloid plaques aggregate in mitochondria [12,13], and many of these changes take place near plaques [14]. Second, individuals with a maternal family history of AD have up to 9 times the risk of AD compared with individuals with a paternal family history of AD [15,16] or no family history. Furthermore, individuals with a maternal family history of AD also score lower on cognitive tests [17], have a lower age of onset of AD [15,18], and have more pronounced brain abnormalities consistent with AD (e.g. cerebral metabolic changes [19], higher amyloid β burden [20], reduction in gray matter volume [21,22], and increased global Pittsburgh Compound-B uptake on positron emission tomography [23]). Moreover, we found that some of these brain abnormalities are associated with mitochondrial haplotypes [24].

This mitochondrial impact on AD risk could be influenced by several factors, including differential responses to oxidative stress, variation in nuclear-encoded mitochondrial genes, and variation in the mitochondrial genome. In this article, we focus on an important resource for investigating mitochondrial genomic variation and others [25].

Several groups have reported a relationship between mitochondrial genetics and risk for AD (summarized in Ridge et al. [3], Table 2). Twelve different haplotypes have been implicated in mitochondrial genetic studies, but the majority of these were reported only once and not replicated [26-33], and six different studies reported no association between mitochondrial genetic variants and AD [34-39]. Among reported associations, there is no consensus, and results sometimes appear contradictory. For example, Haplogroup U has been reported as both a risk and a protective haplogroup [28,31,32]. However, the majority of studies used incomplete sequence data and/or had very small sample sizes [26-39], which may explain the conflicting findings to date; most were underpowered and lacked the resolution to identify correlations for all but the most common haplogroups. Only a single study used whole mitochondrial data [30], whereas most genotyped only a handful of single nucleotide polymorphisms (SNPs). Furthermore, only one study used a large data set, and in that data set the authors genotyped only 138 SNPs [39]. In summary, there is strong evidence to suggest a relationship between the mitochondrial genome and AD, yet the relationship remains undefined.
Table 2

Counts and frequencies of major mitochondrial haplogroups

Major mitochondrial haplogroup   Count   Frequency (%)   Cases/controls/MCI/unknown
A                                5       0.62            3/1/1/0
B                                7       0.87            3/1/3/0
C                                3       0.37            0/0/3/0
F                                1       0.12            1/0/0/0
H                                338     41.78           73/128/133/4
I                                27      3.34            4/12/11/0
J                                69      8.53            24/18/26/1
K                                68      8.41            19/24/25/0
L                                33      4.08            5/16/12/0
M                                11      1.36            2/4/5/0
N                                7       0.87            2/2/3/0
R                                5       0.62            1/3/1/0
T                                78      9.64            16/25/37/0
U                                105     12.98           27/31/46/1
V                                28      3.46            4/10/14/0
W                                15      1.85            3/2/10/0
X                                9       1.11            4/2/3/0

MCI, mild cognitive impairment.

Numbers refer to individuals who have the listed major mitochondrial haplogroup or a subgroup (e.g. individuals with H5 are counted as part of the H group in this table).

The Alzheimer’s Disease Neuroimaging Initiative (ADNI) recently sequenced the whole genomes, including mitochondrial genomes, of 809 individuals. Each of the genomes was analyzed using tools and pipelines developed for diploid genomes. However, these analysis pipelines, particularly variant identification that relies on a likelihood model expecting diploid sequences, are inaccurate for use on the mitochondrial genome, which is haploid. Here, we report not only an AD data set of 809 annotated whole mitochondrial genomes with extensive phenotypes (Table 1) but also an appropriate pipeline to analyze mitochondrial genomes. We hope to facilitate research in this important area by providing a data set and analysis pipeline for future researchers to augment this initial data set.
Table 1

Demographics

Group      Count   Sex (male/female)   Average age   APOE status (ε2/ε2, ε2/ε3, ε3/ε3, ε3/ε4, ε4/ε4, ε2/ε4)
Cases      191     126/65              74.42         0/8/74/80/25/4
Controls   279     135/144             74.51         0/35/167/68/7/2
MCI        333     183/149*            71.57*        1/26/162/110/25/9
Total      803     444/358*            73.17*        1/69/403/258/57/15

APOE, apolipoprotein E; MCI, mild cognitive impairment.

Demographic and phenotype information is available for 803 of the 809 mitochondrial genomes in the data set. APOE status refers to APOE genotype.

* Missing data for one sample.

2. Methods

2.1. Alzheimer’s Disease Neuroimaging Initiative

Data used in the preparation of this article were obtained from the ADNI database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public–private partnership and is an ongoing, longitudinal, highly collaborative study. The primary goal of ADNI has been to test whether serial magnetic resonance imaging, positron emission tomography, other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment and used for the early diagnosis of AD. ADNI has undergone several phases (ADNI1, ADNI GO, and ADNI 2), with each phase adding additional samples. In 2012, 818 ADNI samples were selected for whole genome sequencing to further the goals of ADNI. DNA sequence data were collected from DNA derived from whole blood. All subjects in our analyses had self-reported ancestry of non-Hispanic European American. All the data (whole genome sequence, phenotype, and newly assembled and annotated whole mitochondrial genomes) are publicly available through ADNI (http://adni.loni.usc.edu/data-samples/).

2.2. Genome sequencing, assembly, and variant detection

ADNI genomes were sequenced on an Illumina HiSeq. Reads were paired-end, 100 base-pair reads. Before read mapping, adapters were removed. ADNI mapped the whole genome sequences and called variants using default settings in the Burrows-Wheeler Aligner [40] for mapping and standard best practices from the Genome Analysis Toolkit [41,42] for variant detection (for details see http://adni.loni.usc.edu/data-samples/genetic-data/wgs/). However, these steps needed to be redone for two reasons. First, original mappings were to Hg19. Historically, the choice of a reference mitochondrial genome has been confusing. The standard mitochondrial genome, NC_012920, was first released in 1981 (the Cambridge Reference Sequence, or CRS) [43] and corrected in 1999 (the revised Cambridge Reference Sequence, or rCRS) [44]. NC_012920 belongs to a European haplogroup and thus is a leaf on the mitochondrial haplotype phylogenetic tree. A number of other mitochondrial genomes have been suggested as the “correct” mitochondrial genome reference, including a reconstructed hypothetical mitochondrial genome corresponding to Mitochondrial Eve [45]. Nevertheless, NC_012920 remains the standard version of the mitochondrial genome used in mitochondrial genetics studies. Hg19, which ADNI used for read mapping, includes a version of the mitochondrial genome, represented as chrM, but chrM sometimes corresponds to NC_001807 and sometimes to NC_012920. Therefore, to be consistent with mitochondrial genetic standards, reads needed to be remapped. Second, the standard Genome Analysis Toolkit pipeline includes many steps, one of which is the HaplotypeCaller. The HaplotypeCaller builds a likelihood model based on possible reconstructed haplotypes in a genomic region, but it assumes sequences are diploid. Consequently, mitochondrial genomic variants identified using the HaplotypeCaller are likely to include many inaccuracies.
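To illustrate why a diploid assumption can mislead on a haploid genome, the following minimal sketch compares a simplified binomial genotype likelihood under ploidy 2 and ploidy 1. It is not the HaplotypeCaller model; the error rate, the read counts, and the function names are assumptions chosen only for illustration.

```python
import math


def log_likelihood(alt_reads, total_reads, alt_fraction):
    """Binomial log-likelihood of the reads, omitting the constant term."""
    ref_reads = total_reads - alt_reads
    return (alt_reads * math.log(alt_fraction)
            + ref_reads * math.log(1.0 - alt_fraction))


def best_genotype(alt_reads, total_reads, ploidy, error_rate=0.01):
    """Return the most likely number of alternate-allele copies (0..ploidy)."""
    best_copies, best_ll = None, -math.inf
    for copies in range(ploidy + 1):
        # Expected alternate read fraction for this genotype, kept away
        # from 0 and 1 by an assumed sequencing error rate.
        expected = min(max(copies / ploidy, error_rate), 1.0 - error_rate)
        ll = log_likelihood(alt_reads, total_reads, expected)
        if ll > best_ll:
            best_copies, best_ll = copies, ll
    return best_copies


# Hypothetical site: 900 of 3000 reads carry the alternate allele.
diploid_call = best_genotype(900, 3000, ploidy=2)  # 1 copy: "heterozygous"
haploid_call = best_genotype(900, 3000, ploidy=1)  # 0 copies: reference
```

Under the diploid model the site is assigned a heterozygous genotype, a state that has no meaning for a haploid genome, whereas the haploid model keeps the reference call.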
Because chrM and NC_012920 differ by only a few bases, we were able to extract only those reads that mapped to chrM (with SAMtools [46]), rather than all reads corresponding to the whole nuclear and mitochondrial genomes. Extracted reads were remapped to NC_012920 using Burrows-Wheeler Aligner. Next, we performed local realignments around indels and base recalibration, which are not affected by ploidy, with Genome Analysis Toolkit to refine the new mappings. Finally, we used FreeBayes (-p 1 -F 0.6, and removed variants with quality less than 20) [47] to joint-call variants and converted the resulting variant call format (VCF) file to fasta with vcf2fasta (vcflib, https://github.com/vcflib/vcflib). An overview of the whole process is outlined in Supplementary Figure 1.
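The calling and filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: only the FreeBayes options -p 1 and -F 0.6 and the quality cutoff of 20 come from the text, while the file names, the example records, and the decision to drop records with a missing QUAL are assumptions.

```python
def freebayes_command(reference_fasta, bam_paths, ploidy=1, min_alt_fraction=0.6):
    """Build a FreeBayes command line for haploid joint calling.

    -f is the reference FASTA, -p the ploidy, and -F the minimum
    alternate-allele fraction; BAM files are passed as positional arguments.
    """
    return ["freebayes", "-f", reference_fasta,
            "-p", str(ploidy), "-F", str(min_alt_fraction), *bam_paths]


def filter_by_quality(vcf_lines, min_qual=20.0):
    """Keep header lines and records whose QUAL is at least min_qual."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        qual = line.rstrip("\n").split("\t")[5]  # QUAL is the sixth VCF column
        # Assumption: records with a missing QUAL (".") are dropped.
        if qual != "." and float(qual) >= min_qual:
            kept.append(line)
    return kept


# Hypothetical records used only to exercise the filter.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "NC_012920\t73\t.\tA\tG\t812.4\t.\t.",
    "NC_012920\t150\t.\tC\tT\t12.1\t.\t.",
]
command = freebayes_command("NC_012920.fasta", ["sample1.bam", "sample2.bam"])
filtered = filter_by_quality(example_vcf)
```

The command list could be passed to subprocess.run once the remapped, realigned BAM files exist; building it as a list avoids shell quoting problems.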

2.3. Genome annotation

We annotated mitochondrial variants and haplotypes for each sample. We downloaded 9228 mitochondrial DNA coding and RNA sequence variants and 2792 control region variants from MITOMAP [48]. For each variant present in the data sets downloaded from MITOMAP, we added complete information (e.g. frequency, source, and locus names) to the INFO column in the VCF file and added the corresponding annotation information to the header lines. For each variant that was reported by multiple studies in MITOMAP, we included all studies in the annotation. Next, we annotated mitochondrial haplotypes with Phy-Mer [49]. Phy-Mer reports the five most likely mitochondrial haplotypes and a score, where 1 is a perfect score. For each of the samples, we selected the top hit. All samples had scores >0.99 except for one that had a score of 0.988.
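A minimal sketch of the INFO-column annotation described above, assuming annotations are keyed by position, reference allele, and alternate allele. The key names (MITOMAP_LOCUS, MITOMAP_SOURCE), the example values, and the function names are hypothetical and do not reflect the tags used in the released VCF file.

```python
def format_value(value):
    """Join multiple values (e.g. several reporting studies) with commas."""
    if isinstance(value, (list, tuple)):
        return ",".join(str(v) for v in value)
    return str(value)


def annotate_record(vcf_line, annotations):
    """Append annotation key=value pairs to the INFO column of one record."""
    fields = vcf_line.rstrip("\n").split("\t")
    key = (int(fields[1]), fields[3], fields[4])  # (POS, REF, ALT)
    extra = annotations.get(key)
    if extra is None:
        return "\t".join(fields)  # variant not found in the annotation set
    pairs = ";".join(f"{name}={format_value(val)}" for name, val in extra.items())
    info = fields[7]
    fields[7] = pairs if info in (".", "") else f"{info};{pairs}"
    return "\t".join(fields)


def info_header_line(name, description):
    """Build the matching ##INFO header line for a new annotation key."""
    return f'##INFO=<ID={name},Number=.,Type=String,Description="{description}">'


# Hypothetical annotation entry and record.
annotations = {
    (73, "A", "G"): {"MITOMAP_LOCUS": "MT-HV2",
                     "MITOMAP_SOURCE": ["studyA", "studyB"]},
}
record = "NC_012920\t73\t.\tA\tG\t812.4\t.\tDP=3000"
annotated = annotate_record(record, annotations)
header = info_header_line("MITOMAP_LOCUS", "Locus name from MITOMAP")
```

Listing every reporting study as a comma-separated value is one way to keep all studies in a single INFO field, as the text requires.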

2.4. Variant validation

SNP data from the Illumina 2.5 M Array were collected from the same DNA extraction that was used for whole genome sequence (WGS) data collection. We compared 256 mitochondrial variants genotyped on that array to the variant calls from our WGS analysis pipeline. Validation was evaluated by examining the concordance of calls at the individual level between the two sets of genotype data.
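One way to compute such a concordance is sketched below. The exact definition used in the study is not given in the text, so this assumes that concordance is the percentage of genotypes present in both call sets, keyed by sample and position, that carry the same allele. The sample identifiers and alleles are hypothetical.

```python
def concordance(wgs_calls, array_calls):
    """Percent of genotypes present in both call sets that agree."""
    shared = wgs_calls.keys() & array_calls.keys()
    if not shared:
        raise ValueError("no overlapping genotypes to compare")
    matches = sum(1 for key in shared if wgs_calls[key] == array_calls[key])
    return 100.0 * matches / len(shared)


# Hypothetical calls keyed by (sample identifier, mitochondrial position).
wgs = {("S1", 73): "G", ("S1", 150): "C",
       ("S2", 73): "A", ("S2", 150): "T", ("S2", 263): "G"}
array = {("S1", 73): "G", ("S1", 150): "C",
         ("S2", 73): "G", ("S2", 150): "T"}
percent_concordant = concordance(wgs, array)
```

Positions called in only one of the two data sets are ignored here, since they cannot be compared.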

3. Results

Mitochondrial genomes were sequenced to an average depth of 2986 reads, with per-sample average depth ranging from 1515 to 7831 reads. We identified 1649 total mitochondrial genetic variants from these genomes, of which 1336 have been previously reported. Samples had an average of about 27 variants, with a range of 1 to 96 variants. We validated our variant calls using 256 mitochondrial genetic genotypes from an SNP array performed on the same samples. Overall, 98.18% of WGS variant calls matched SNP genotypes acquired from the array. We identified 506 different mitochondrial haplotypes in the data set, all of which have been previously observed. The majority, 350, appear in only a single sample in the data set. The most common haplotype, K1A1B1A, is shared by 15 individuals. When considering only the major mitochondrial haplogroup (e.g. H, V, U, etc.), the majority of individuals had haplotypes in the H and U groups, 338 and 105, respectively. We report the number of individuals in each major mitochondrial haplogroup in Table 2 and counts of all haplotypes in Supplementary Table 1. These frequencies are similar to those observed in other cohorts of non-Hispanic European American ancestry [50]. Several individuals have haplotypes reported to be associated with risk for AD: five have haplotype H5 (a risk haplogroup [29]).
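The grouping by major haplogroup can be sketched as below. Collapsing a haplotype label to its first letter is an assumption made for illustration; it matches the single-letter groups in Table 2 but would mislabel multi-letter groups such as HV. The function names and the example labels are hypothetical.

```python
from collections import Counter


def major_haplogroup(haplotype):
    """Collapse a label such as 'H5' or 'K1A1B1A' to its first letter."""
    return haplotype[0].upper()


def haplogroup_frequencies(haplotypes):
    """Map each major haplogroup to (count, frequency in percent)."""
    counts = Counter(major_haplogroup(h) for h in haplotypes)
    total = sum(counts.values())
    return {group: (n, round(100.0 * n / total, 2))
            for group, n in counts.items()}


# Hypothetical haplotype assignments for four samples.
frequencies = haplogroup_frequencies(["H5", "H1", "U5", "K1A1B1A"])

# Check against Table 2: 338 of 809 individuals fall in haplogroup H.
h_frequency = round(100.0 * 338 / 809, 2)
```

The last line reproduces the 41.78% reported for haplogroup H in Table 2.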

4. Discussion

We have presented the application of a novel approach to accurately assemble and genotype a data set of whole mitochondrial genomes from the ADNI study. Several accurate algorithms exist for calling genotypes in diploid NGS data, but they use models that are inappropriate for haploid samples. We used existing software, but tested and refined parameter settings to achieve high levels of genotype accuracy in our mitochondrial genome samples. We validated the identified genotypes by repurposing existing publicly available data. Our genotypes are >98% concordant with genotypes from Illumina SNP chips in the same samples. This high level of accuracy is approaching the expected error rate of SNP chips and makes it difficult to definitively determine whether our genotypes, or the SNP chip genotypes, are correct. In addition, we have made these data publicly available through the ADNI website. A multisample VCF file with mitochondrial genomic variants for each subject is available for download from the Download—Genetic Data section of the ADNI repository (http://adni.loni.usc.edu/data-samples/access-data/). Also included is an .XLSX file with mitochondrial haplotype information for each subject. The new data sets are named “ADNI WGS Whole mitochondrial genome variants” and “ADNI WGS Whole mitochondrial genome—Haplotypes”. This data set is now ready to be applied in AD studies and to help elucidate the relationship between mitochondria and AD, which has thus far eluded researchers. Complete mitochondrial genomic data result in high resolution of haplotype definition, including large numbers of singleton haplotypes. Methods that group haplotypes in evolutionarily meaningful ways are necessary to fully leverage these data. In our previous analyses of complete mitochondrial genomics data, we incorporated TreeScanning to concentrate statistical power on evolutionarily meaningful groups of haplotypes [24,30,51,52].
This approach has been applied successfully in the study of both mitochondrial and nuclear genomic contributions to AD risk and related phenotypes [24,30,51,53,54]. The ADNI subjects include AD cases, subjects with mild cognitive impairment, and cognitively normal controls. Nearly all of these subjects are also associated with extensive clinical, imaging, and fluid biomarker data, including some longitudinal data. As such, they provide great value in evaluating the multitude of factors that are associated with AD, dementia, dementia progression, and conversion from mild cognitive impairment to dementia. While this study is underpowered to identify haplotypes associated with AD risk, we anticipate it will serve as the foundation for additional data collection and an expanded study in the future. These data will also prove valuable for validation and discovery analyses related to the multitude of other phenotypes that are available for these subjects. Association analyses using imaging and fluid biomarker data are outside the scope of this study, but we anticipate that these data will be leveraged by several groups, including ours, for these kinds of analyses.

References (52 in total; 10 listed)

1.  Association between mitochondrial DNA variations and Alzheimer's disease in the ADNI cohort.

Authors:  Anita Lakatos; Olga Derbeneva; Danny Younes; David Keator; Trygve Bakken; Maria Lvova; Marty Brandon; Guia Guffanti; Dora Reglodi; Andrew Saykin; Michael Weiner; Fabio Macciardi; Nicholas Schork; Douglas C Wallace; Steven G Potkin
Journal:  Neurobiol Aging       Date:  2010-06-11       Impact factor: 4.673

Review 2.  Alzheimer's disease.

Authors:  Henry W Querfurth; Frank M LaFerla
Journal:  N Engl J Med       Date:  2010-01-28       Impact factor: 91.245

3.  Mitochondrial DNA haplogroups in early-onset Alzheimer's disease and frontotemporal lobar degeneration.

Authors:  Johanna Krüger; Reetta Hinttala; Kari Majamaa; Anne M Remes
Journal:  Mol Neurodegener       Date:  2010-02-02       Impact factor: 14.195

4.  Sequence and organization of the human mitochondrial genome.

Authors:  S Anderson; A T Bankier; B G Barrell; M H de Bruijn; A R Coulson; J Drouin; I C Eperon; D P Nierlich; B A Roe; F Sanger; P H Schreier; A J Smith; R Staden; I G Young
Journal:  Nature       Date:  1981-04-09       Impact factor: 49.962

5.  mtDNA Variation and Analysis Using Mitomap and Mitomaster.

Authors:  Marie T Lott; Jeremy N Leipzig; Olga Derbeneva; H Michael Xie; Dimitra Chalkia; Mahdi Sarmady; Vincent Procaccio; Douglas C Wallace
Journal:  Curr Protoc Bioinformatics       Date:  2013-12

6.  Variability of age at onset in siblings with familial Alzheimer disease.

Authors:  Estrella Gómez-Tortosa; M Sagrario Barquero; Manuel Barón; M Jose Sainz; Sagrario Manzano; Maria Payno; Raquel Ros; Carmen Almaraz; Pilar Gómez-Garré; Adriano Jiménez-Escrig
Journal:  Arch Neurol       Date:  2007-12

7.  Reduced gray matter volume in normal adults with a maternal family history of Alzheimer disease.

Authors:  R A Honea; R H Swerdlow; E D Vidoni; J Goodwin; J M Burns
Journal:  Neurology       Date:  2010-01-12       Impact factor: 9.910

8.  Mitochondrial haplogroups associated with Japanese Alzheimer's patients.

Authors:  Shigeru Takasaki
Journal:  J Bioenerg Biomembr       Date:  2009-10       Impact factor: 2.945

9.  No mitochondrial haplotype was found to increase risk for Alzheimer's disease.

Authors:  G Zsurka; J Kálmán; A Császár; I Raskó; Z Janka; P Venetianer
Journal:  Biol Psychiatry       Date:  1998-09-01       Impact factor: 13.382

10.  Declining brain glucose metabolism in normal individuals with a maternal history of Alzheimer disease.

Authors:  L Mosconi; R Mistur; R Switalski; M Brys; L Glodzik; K Rich; E Pirraglia; W Tsui; S De Santi; M J de Leon
Journal:  Neurology       Date:  2008-11-12       Impact factor: 9.910
