Julien Marquis, Gregory Lefebvre, Yiannis A I Kourmpetis, Mohamed Kassam, Frédéric Ronga, Umberto De Marchi, Andreas Wiederkehr, Patrick Descombes.
Abstract
BACKGROUND: Mitochondrial dysfunction is linked to numerous pathological states, in particular those related to metabolism, brain health and ageing. Nuclear-encoded gene polymorphisms implicated in mitochondrial functions can be analyzed in the context of classical genome-wide association studies. By contrast, mitochondrial DNA (mtDNA) variants are more challenging to identify and analyze for several reasons. First, contrary to the diploid nuclear genome, each cell carries several hundred copies of the circular mitochondrial genome. Mutations can therefore be present in only a subset of the mtDNA molecules, resulting in a heterogeneous pool of mtDNA, a situation referred to as heteroplasmy. Consequently, detection and quantification of variants require extremely accurate tools, especially when the proportion of mutated molecules is small. Additionally, the mitochondrial genome has been copied into the nuclear genome as numerous pseudogenes over the course of evolution. These nuclear pseudogenes, named NUMTs, must be distinguished from genuine mtDNA sequences and excluded from the analysis.
Keywords: Heteroplasmy; Low frequency variant; MitoRS; Mitochondria; Mitochondrial DNA; Next generation sequencing; Rolling circle amplification; Somatic mutation
Year: 2017 PMID: 28441938 PMCID: PMC5405551 DOI: 10.1186/s12864-017-3695-5
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Rolling circle amplification of mitochondrial DNA. a Rolling circle amplification (RCA) significantly enriches mtDNA relative to nuclear DNA (nucDNA). Absolute quantification by qPCR was performed to evaluate the ratio between mtDNA and nucDNA with or without RCA. Copy numbers are calculated from standard curves. Results are shown on a log10 scale. Standard deviations are calculated from three independent qPCR experiments on the same sample for the non-amplified material, and from three independent RCA reactions for the amplified samples. mm: Mus musculus, hs: Homo sapiens. b Contamination by nuclear reads (NUMTs) does not affect MitoRS. Sequencing reads generated from a human control cell line (143-B) or its mitochondria-free derivative (143-B-Rho0) were mapped against the human mtDNA reference (rCRS, NC_012920.1). The ratio of the absolute coverage (reported by mpileup) between the Rho0 and control cell lines was calculated for each position of the reference genome and plotted. The two datasets were generated from the same total number of reads (~15 million). Note that these samples had to be sequenced 10 times more deeply than in the usual procedure because too few reads mapped for the Rho0 sample.
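The per-position coverage ratio described in panel b can be sketched as follows. This is a minimal illustration, not the MitoRS implementation: the function names `depth_by_position` and `coverage_ratio` are hypothetical, and the parser assumes the standard tab-separated text output of samtools mpileup, in which the second column is the 1-based position and the fourth column is the read depth.

```python
def depth_by_position(mpileup_lines):
    """Map position -> read depth from mpileup text lines.

    Assumes the usual column order: chromosome, position, reference base, depth.
    """
    depths = {}
    for line in mpileup_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip malformed or empty lines
        depths[int(fields[1])] = int(fields[3])
    return depths


def coverage_ratio(rho0_depths, control_depths):
    """Ratio of Rho0 depth to control depth at each covered control position.

    Positions absent from the Rho0 pileup count as depth 0; positions with
    zero control depth are skipped to avoid division by zero.
    """
    return {
        pos: rho0_depths.get(pos, 0) / depth
        for pos, depth in sorted(control_depths.items())
        if depth > 0
    }


# Illustrative input only; these depths are made up and not taken from the study.
control = depth_by_position(["chrM\t1\tG\t1000\t.\tI", "chrM\t2\tA\t2000\t.\tI"])
rho0 = depth_by_position(["chrM\t1\tG\t500\t.\tI"])
ratios = coverage_ratio(rho0, control)
```

A ratio close to zero across the reference would indicate that reads of nuclear origin contribute little to the mtDNA coverage.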
Fig. 2 RCA does not introduce coverage biases. Sequencing reads were generated from two plasmid DNAs with or without the RCA step. The relative coverage reported by mpileup was plotted against each position of the reference genome. A single replicate is presented, though the exact same patterns were observed for independent triplicates. Top panel: plasmid1, bottom panel: plasmid2.
Fig. 3 RCA does not introduce sequencing errors. The difference in absolute frequencies between the non-amplified samples and the RCA samples was computed for each position of the reference genome. Positions with a non-null difference are plotted as bars. a SNVs and b indels are plotted in two separate graphs. The two non-concordant positions in the SNV panel, which result from unspecific RCA amplification, are marked with a star (see Additional file 2: Supporting Information). Frequencies were calculated as the average across the four sample replicates. Only variants passing filters were considered. Left panel: SNVs, right panel: indels.
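The comparison behind this figure can be sketched as below. This is a hedged illustration under stated assumptions rather than the authors' code: the helper names are hypothetical, each replicate is represented as a dictionary mapping a position to a variant frequency, a position missing from a replicate is treated as frequency 0, and the sign convention (RCA minus non-amplified) is an assumption.

```python
def mean_frequency(replicates):
    """Average the variant frequency at each position across replicates.

    A position missing from a replicate is counted as frequency 0.0.
    """
    positions = set().union(*replicates)
    n = len(replicates)
    return {p: sum(r.get(p, 0.0) for r in replicates) / n for p in positions}


def frequency_differences(non_amplified, rca, tolerance=0.0):
    """Return position -> (mean RCA frequency - mean non-amplified frequency).

    Only positions whose absolute difference exceeds `tolerance` are kept,
    mirroring the "non-null difference" criterion of the caption.
    """
    mean_plain = mean_frequency(non_amplified)
    mean_rca = mean_frequency(rca)
    diffs = {}
    for pos in sorted(set(mean_plain) | set(mean_rca)):
        delta = mean_rca.get(pos, 0.0) - mean_plain.get(pos, 0.0)
        if abs(delta) > tolerance:
            diffs[pos] = delta
    return diffs


# Illustrative frequencies only; not data from the study.
plain_reps = [{100: 0.5}, {100: 0.5}]
rca_reps = [{100: 0.5, 200: 0.25}, {100: 0.5}]
diffs = frequency_differences(plain_reps, rca_reps)
```

In practice a small non-zero `tolerance` may be preferable, since floating-point averages of equal frequencies are not guaranteed to cancel exactly.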
Fig. 4 Benchmarking MitoRS accuracy and sensitivity. a SNV detection is accurate over the whole range of frequencies. Total DNA extracted from two mouse strains was mixed at different ratios and run through the pipeline. The measured frequencies of the 88 homoplasmic SNVs distinguishing the mtDNA of the two strains are plotted versus the ratio calculated from the input mixture. Red dots correspond to the mean frequency calculated from the 88 variants, internal blue bars show the 25th and 75th percentiles, and outer grey bars show the minimum and maximum variant frequency observed for a given ratio. The correlation coefficient and the slope of the linear regression are shown on the graph. Three independent input mixtures were run for each theoretical ratio. The inset panel is a zoom on the low-frequency ratios. b Indel analysis is also accurate. Same graph as in a, but considering only the two indel positions distinguishing the mtDNA of the two strains. The inset panel is a zoom on the low-frequency ratios. Results are calculated as the average and standard deviation of the three independent RCA reactions performed for each ratio.
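The regression summary reported on this graph (slope and correlation coefficient of measured versus expected frequency) can be reproduced with an ordinary least-squares fit. The sketch below is an illustration only: `fit_line` is a hypothetical helper, the example values are invented, and the source does not specify which software the authors used for the fit.

```python
from math import sqrt


def fit_line(expected, measured):
    """Least-squares fit of measured = slope * expected + intercept.

    Returns (slope, intercept, pearson_r).
    """
    n = len(expected)
    if n != len(measured) or n < 2:
        raise ValueError("need two equally sized series with at least 2 points")
    mean_x = sum(expected) / n
    mean_y = sum(measured) / n
    sxx = sum((x - mean_x) ** 2 for x in expected)
    syy = sum((y - mean_y) ** 2 for y in measured)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, measured))
    if sxx == 0 or syy == 0:
        raise ValueError("both series must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / sqrt(sxx * syy)
    return slope, intercept, pearson_r


# Hypothetical mixing ratios (percent) and measured mean frequencies (percent).
expected_pct = [0.0, 1.0, 5.0, 10.0, 50.0, 100.0]
measured_pct = [0.0, 1.0, 5.0, 10.0, 50.0, 100.0]
slope, intercept, r = fit_line(expected_pct, measured_pct)
```

A slope and correlation coefficient both close to 1 would indicate that the measured frequencies track the input ratios without systematic over- or under-estimation.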
Fig. 5 Mitochondrial DNA variant heritability. DNA from the 17 and 18 members of the CEPH families 1463 (a) and 884 (b), respectively, was analyzed with MitoRS. The haplotype of each individual was determined. Mitochondrial DNA variants were classified into three categories based on their frequency status: homoplasmic (>98%), high frequency heteroplasmy (between 10 and 98%) and low frequency heteroplasmy (between 1 and 10%).
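The three frequency categories of this caption translate into a simple threshold function. The sketch below assumes frequencies expressed in percent; the function name, the category labels, and the handling of the exact boundary values (10% assigned to the high frequency class, 1% to the low frequency class, anything below 1% left unclassified) are assumptions, since the caption does not state which side each boundary falls on.

```python
def classify_variant(freq_percent):
    """Assign a variant to a frequency category from its frequency in percent."""
    if not 0.0 <= freq_percent <= 100.0:
        raise ValueError("frequency must be between 0 and 100 percent")
    if freq_percent > 98.0:
        return "homoplasmic"
    if freq_percent >= 10.0:  # boundary assignment is an assumption
        return "high_frequency_heteroplasmy"
    if freq_percent >= 1.0:  # boundary assignment is an assumption
        return "low_frequency_heteroplasmy"
    return "below_threshold"  # not one of the caption's three categories


# Illustrative values only.
labels = [classify_variant(f) for f in (99.5, 42.0, 3.2, 0.4)]
```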
Table 1 No evidence for father’s mtDNA transmission
Variants specific to the father (i.e. not also present in the mother) are shown. For easier visualization, homoplasmic variants passing filters are highlighted in red, high frequency heteroplasmy in orange, and low frequency heteroplasmy in yellow; positions not passing filters are left blank. Each variant is ordered by lane and identified by its position (rCRS numbering). The positions highlighted in blue were verified by Sanger sequencing.
Table 2 Transmission of variants from the mother’s mtDNA
Variants identified in the mothers or in the children are shown. Mothers #128892 and #12878 are shown in the same table to account for the three-generation inheritance. The color code is the same as in Table 1. Each variant is ordered by lane and identified by its position (rCRS numbering). The positions highlighted in blue were verified by Sanger sequencing.