Hannah Maude1,2, Mira Davidson1, Natalie Charitakis1, Leo Diaz1, William H T Bowers1, Eva Gradovich1, Toby Andrew2, Derek Huntley1.
Abstract
Homology between mitochondrial DNA (mtDNA) and nuclear DNA of mitochondrial origin (nuMTs) confounds the alignment of short sequence reads to the reference human genome, because the true origin of a read cannot be determined. Using a systematic in silico approach, we here report the impact of all potential mitochondrial variants on alignment accuracy and variant calling. A total of 49,707 possible mutations were introduced across the 16,569 bp reference mitochondrial genome (16,569 × 3 alternative alleles), one variant at a time. The resulting in silico fragmentation and alignment to the entire reference genome (GRCh38) revealed preferential mapping of mutated mitochondrial fragments to nuclear loci, because the variants increased sequence similarity to nuMTs, for a total of 807, 362, and 41 variants at 333, 144, and 27 positions when using 100, 150, and 300 bp single-end fragments, respectively. We subsequently modeled these affected variants at 50% heteroplasmy and carried out variant calling, observing bias in the reported allele frequencies in favor of the reference allele. Four variants (chrM:6023A, chrM:4456T, chrM:5147A, and chrM:7521A), including a possible hypertension factor, chrM:4456T, caused 100% loss of coverage at the mutated position (with all 100 bp single-end fragments aligning to homologous nuclear positions instead of chrM), rendering these variants undetectable when aligning to the entire reference genome. Furthermore, four mitochondrial variants reported to be pathogenic were found to cause significant loss of coverage, and select haplogroup-defining SNPs were shown to exacerbate the loss of coverage caused by surrounding variants. Increased fragment length and use of paired-end reads both improved alignment accuracy.
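The enumeration and fragmentation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names are hypothetical, the tiling scheme (one fragment starting at every base of a circular genome, so that an L bp fragment length gives L× maximum coverage) is an assumption inferred from the reported coverage maxima, and a short toy sequence stands in for the rCRS.

```python
from typing import Iterator, List, Tuple

BASES = "ACGT"


def single_nucleotide_variants(ref: str) -> Iterator[Tuple[int, str, str]]:
    """Yield (1-based position, reference allele, alternative allele)
    for every possible single-nucleotide substitution in `ref`.
    Assumes `ref` contains only A, C, G and T."""
    for index, ref_base in enumerate(ref):
        for alt in BASES:
            if alt != ref_base:
                yield index + 1, ref_base, alt


def apply_variant(ref: str, pos: int, alt: str) -> str:
    """Return a copy of `ref` with the base at 1-based `pos` replaced by `alt`."""
    return ref[: pos - 1] + alt + ref[pos:]


def fragments(seq: str, length: int) -> List[str]:
    """Tile a circular sequence with one fragment starting at every base
    (assumed scheme), so each position is covered by `length` fragments."""
    if not 0 < length <= len(seq):
        raise ValueError("fragment length must be between 1 and len(seq)")
    wrapped = seq + seq[: length - 1]  # wrap around the circular genome
    return [wrapped[start : start + length] for start in range(len(seq))]


# Toy stand-in for the 16,569 bp reference mitochondrial genome.
toy_reference = "ACGTACGTTGCA"
variants = list(single_nucleotide_variants(toy_reference))
mutated = apply_variant(toy_reference, 1, "T")
reads = fragments(mutated, 4)
```

For a reference of length n this yields 3n variants, which matches the 16,569 × 3 = 49,707 figure quoted above. Each mutated genome would then be fragmented and passed to an aligner; that step is outside the scope of this sketch.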
Keywords: NGS; genotype; heteroplasmy; mitochondrial genotype; mitochondrial variants; mtDNA; nuMT
Year: 2019 PMID: 31612134 PMCID: PMC6773831 DOI: 10.3389/fcell.2019.00201
Source DB: PubMed Journal: Front Cell Dev Biol ISSN: 2296-634X
FIGURE 1. chrM coverage following whole-genome alignment of fragments generated from the reference mitochondrial genome (rCRS). The coverage is plotted as a percentage of the maximum (100×, 150×, 300× and 200×, 300×, 600× for 100 bp, 150 bp, and 300 bp single-end and paired-end fragments, respectively). Alignment was carried out using bwa mem and Isaac4. Loss of coverage is observed where fragments are assigned a primary alignment other than chrM.
FIGURE 2. Coverage loss when aligning mitochondrial fragments containing alternative alleles. Each data point represents the alignment of the fragmented rCRS with one alternative allele (16,569 bp × 3 alternative alleles = 49,707). The % coverage loss at each mutated position is calculated as the coverage at the mutated position divided by the coverage at the same position when aligning with the reference allele. Variants which cause a loss greater than 0% using single-end fragments and greater than 3% for paired-end fragments (allowing for variation in the insert size) are plotted, allowing for up to three points per base pair position.
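The coverage comparison in this caption can be sketched as a short calculation. The caption describes a ratio of mutated to reference coverage; the sketch below assumes that the reported loss is the complement of that ratio, which is an interpretation rather than something the source states. The function name and example depths are hypothetical.

```python
def coverage_loss_percent(mutated_coverage: float, reference_coverage: float) -> float:
    """Percentage of coverage lost at a position when the alternative allele
    is present, relative to coverage with the reference allele.

    Assumption: loss = (1 - mutated / reference) * 100.
    """
    if reference_coverage <= 0:
        raise ValueError("reference coverage must be positive")
    retained = mutated_coverage / reference_coverage
    return (1.0 - retained) * 100.0


# All 100 single-end fragments align elsewhere: complete loss at the position.
complete_loss = coverage_loss_percent(0, 100)

# 75 of 100 fragments still align to chrM: a quarter of the coverage is lost.
partial_loss = coverage_loss_percent(75, 100)
```

Under this reading, a variant is plotted when the value exceeds 0 for single-end fragments or 3 for paired-end fragments.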
FIGURE 3. Coverage following alignment of the fragmented mitochondrial genome with the example mutation chrM:6023G > A (red) and the fragmented, unaltered rCRS mitochondrial genome (black). Alignments were carried out using the bwa mem alignment algorithm and single-end fragments of 100 bp, 150 bp, and 300 bp.
FIGURE 4. Observed minor allele frequency for mitochondrial variants modeled at 50% heteroplasmy (MAF = 0.5). Variants which cause a loss in coverage of >0% compared to the reference allele are plotted. Alignment to GRCh38 was carried out using bwa mem and variants were called using mitoCaller. The true minor allele frequency (MAF) of all variants is 0.5 although the observed MAF may be lower due to preferential alignment of fragments containing the alternative allele to nuclear loci.
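The direction of this bias can be illustrated with a simple read-counting model. It is an illustrative sketch only: it assumes the observed frequency is the plain fraction of chrM-aligned reads carrying the alternative allele, which is not how mitoCaller estimates allele frequencies, and the function and parameter names are hypothetical.

```python
def observed_maf(heteroplasmy: float, alt_retained: float, ref_retained: float = 1.0) -> float:
    """Expected fraction of chrM-aligned reads carrying the alternative allele.

    heteroplasmy: true fraction of molecules with the alternative allele.
    alt_retained: fraction of alternative-allele fragments still aligned to chrM.
    ref_retained: fraction of reference-allele fragments still aligned to chrM.
    """
    alt_reads = heteroplasmy * alt_retained
    ref_reads = (1.0 - heteroplasmy) * ref_retained
    total = alt_reads + ref_reads
    if total == 0:
        raise ValueError("no reads remain aligned at this position")
    return alt_reads / total


# True MAF of 0.5 with no preferential misalignment: no bias.
unbiased = observed_maf(0.5, alt_retained=1.0)

# Half of the alternative-allele fragments align to nuclear loci instead.
biased = observed_maf(0.5, alt_retained=0.5)

# All alternative-allele fragments are lost: the variant is undetectable.
undetectable = observed_maf(0.5, alt_retained=0.0)
```

In this model, losing half of the alternative-allele fragments lowers the observed frequency from 0.5 to one third, and losing all of them lowers it to zero, consistent with the bias toward the reference allele described in the caption.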