Zhiqiang Wu, Gus Waneka, Daniel B Sloan.
Abstract
The mechanisms of sequence divergence in angiosperm mitochondrial genomes have long been enigmatic. In particular, it is difficult to reconcile the rapid divergence of intergenic regions, which can make non-coding sequences almost unrecognizable even among close relatives, with the unusually high levels of sequence conservation found in genic regions. It has been hypothesized that different mutation and repair mechanisms act on genic and intergenic sequences or, alternatively, that mutational input is relatively constant but that selection has strikingly different effects on these respective regions. To test these alternative possibilities, we analyzed mtDNA divergence within Arabidopsis thaliana, including variants from the 1001 Genomes Project and changes accrued in published mutation accumulation (MA) lines. We found that base-substitution frequencies are relatively similar for intergenic regions and synonymous sites in coding regions, whereas indel and nonsynonymous substitution rates are greatly depressed in coding regions, supporting a conventional model in which mutation/repair mechanisms are consistent throughout the genome but differentially filtered by selection. Most types of sequence and structural changes were undetectable in 10-generation MA lines, but we found significant shifts in relative copy number across mtDNA regions for lines grown under stressed vs. benign conditions. We confirmed quantitative variation in copy number across the A. thaliana mitogenome using both whole-genome sequencing and droplet digital PCR, further undermining the classic but oversimplified model of a circular angiosperm mtDNA structure. Our results suggest that copy number variation is one of the most fluid features of angiosperm mitochondrial genomes.
Keywords: copy number variation; mutation accumulation line; mutation rate; recombination; single-nucleotide polymorphisms
Year: 2020; PMID: 31964685; PMCID: PMC7056966; DOI: 10.1534/g3.119.401023
Source DB: PubMed; Journal: G3 (Bethesda); ISSN: 2160-1836; Impact factor: 3.154
Variant statistics for 1001 Genomes dataset. SNPs: single nucleotide polymorphisms; MAF: minor allele frequency.
| Sequence Type | Sites | SNPs | SNPs per Site | SNP MAF | Indels | Indels per Site | Indel MAF |
|---|---|---|---|---|---|---|---|
| Protein Coding | 31264 | 41 | 0.0013 | 0.0206 | 0 | 0.0000 | NA |
| Nonsynonymous | 24323 | 22 | 0.0009 | 0.0244 | 0 | 0.0000 | NA |
| Synonymous | 6941 | 19 | 0.0027 | 0.0163 | 0 | 0.0000 | NA |
| rRNA | 5222 | 3 | 0.0006 | 0.0010 | 0 | 0.0000 | NA |
| tRNA | 1689 | 0 | 0.0000 | NA | 0 | 0.0000 | NA |
| Pseudogene | 1256 | 5 | 0.0040 | 0.0025 | 0 | 0.0000 | NA |
| Intron | 35335 | 72 | 0.0020 | 0.0218 | 18 | 0.0005 | 0.0116 |
| Intergenic | 293042 | 987 | 0.0034 | 0.0263 | 172 | 0.0006 | 0.0239 |
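
The per-site columns in the table are simple ratios of variant counts to the number of sites in each class. As a minimal sketch (not the authors' pipeline), the following Python snippet recomputes those densities from the counts shown above and compares synonymous, nonsynonymous, and intergenic SNP densities. The variable and function names are illustrative assumptions; only the counts come from the table.

```python
# Counts copied from the table above: (sites, SNPs, indels) per sequence type.
counts = {
    "Protein Coding": (31264, 41, 0),
    "Nonsynonymous": (24323, 22, 0),
    "Synonymous": (6941, 19, 0),
    "rRNA": (5222, 3, 0),
    "tRNA": (1689, 0, 0),
    "Pseudogene": (1256, 5, 0),
    "Intron": (35335, 72, 18),
    "Intergenic": (293042, 987, 172),
}


def per_site(events, sites):
    """Variant density: number of variants divided by number of sites."""
    return events / sites


# Hypothetical helper dictionaries (names are not from the source).
snp_density = {name: per_site(snps, sites) for name, (sites, snps, _) in counts.items()}
indel_density = {name: per_site(indels, sites) for name, (sites, _, indels) in counts.items()}

for name in counts:
    print(f"{name:<15} SNPs/site={snp_density[name]:.4f}  indels/site={indel_density[name]:.4f}")

# Relative SNP densities between site classes.
syn_vs_intergenic = snp_density["Synonymous"] / snp_density["Intergenic"]
nonsyn_vs_syn = snp_density["Nonsynonymous"] / snp_density["Synonymous"]
print(f"synonymous / intergenic SNP density:    {syn_vs_intergenic:.2f}")
print(f"nonsynonymous / synonymous SNP density: {nonsyn_vs_syn:.2f}")
```

With these counts, synonymous sites carry roughly 0.8 times the SNP density of intergenic regions, while nonsynonymous sites carry roughly one third the density of synonymous sites, which matches the abstract's description of similar base-substitution frequencies at synonymous and intergenic sites and depressed nonsynonymous rates.
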
Figure 1. Sequencing coverage variation across the mitogenome of Arabidopsis thaliana mutation accumulation lines. Each panel represents an average of three biological replicates. Red vertical lines at the bottom of the figure represent the two pairs of large, identical repeats in the A. thaliana mitogenome. When each Illumina read is mapped to these repeats, bowtie2 randomly assigns the read to one copy, so coverage estimates are not expected to be elevated in these regions. The blue dashed line indicates mean coverage.
Figure 2. Divergence in region-specific mitogenome copy number in salt-stressed vs. control mutation accumulation lines. Values are expressed as a ratio of the averages for all salt-stressed and all control lines. Windows that deviate significantly from a ratio of 1 after false-discovery-rate correction are highlighted in red. CPMM: counts per million mapped reads.
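
The quantity plotted in Figure 2 can be illustrated with a short Python sketch: per-window read counts are normalized to CPMM within each line, averaged across lines in each treatment, and then expressed as a salt-stressed/control ratio. This is an assumption-laden illustration, not the authors' code: the function names, window counts, and library sizes below are hypothetical, and the significance testing and false-discovery-rate correction mentioned in the caption are not covered.

```python
def cpmm(window_counts, total_mapped_reads):
    """Convert raw per-window read counts to counts per million mapped reads."""
    return [count * 1_000_000 / total_mapped_reads for count in window_counts]


def mean_by_window(samples):
    """Average each window across samples (samples: list of equal-length lists)."""
    n = len(samples)
    return [sum(values) / n for values in zip(*samples)]


# Hypothetical raw counts for three windows; these numbers are made up.
salt_raw = [[100, 200, 300], [300, 200, 100]]
salt_totals = [2_000_000, 2_000_000]
control_raw = [[200, 400, 100], [200, 400, 100]]
control_totals = [4_000_000, 4_000_000]

salt_cpmm = [cpmm(c, t) for c, t in zip(salt_raw, salt_totals)]
control_cpmm = [cpmm(c, t) for c, t in zip(control_raw, control_totals)]

salt_mean = mean_by_window(salt_cpmm)
control_mean = mean_by_window(control_cpmm)

# Ratio of treatment averages per window; 1.0 means no copy-number shift.
ratios = [s / c for s, c in zip(salt_mean, control_mean)]
print(ratios)
```

Normalizing to CPMM before averaging keeps lines with different sequencing depths comparable, so a ratio away from 1 reflects a shift in relative coverage of that window rather than a difference in library size.
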
Figure 3. Sequencing coverage variation across the mitogenome for three purified mtDNA samples from Arabidopsis thaliana. The windows chosen for development of ddPCR markers are shown in red and blue dots (high- and low-coverage regions, respectively). Red vertical lines at the bottom of the figure represent the two pairs of large, identical repeats in the A. thaliana mitogenome. When each Illumina read is mapped to these repeats, bowtie2 randomly assigns the read to one copy, so coverage estimates are not expected to be elevated in these regions. The blue dashed line indicates mean coverage.
Figure 4. ddPCR comparison of copy number for mitogenome regions identified as either high-copy or low-copy by sequencing analysis. Copy numbers are expressed per μl of ddPCR reaction volume. Input for the mtDNA samples was diluted 200-fold relative to the total-cellular sample.
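
Because the mtDNA samples in Figure 4 were diluted 200-fold relative to the total-cellular sample, their per-μl values are not on the same input scale unless the dilution is accounted for, whereas the high-copy/low-copy ratio within a single sample is unaffected by dilution. The Python sketch below illustrates that arithmetic. Only the 200-fold factor comes from the caption; the readings and function names are hypothetical.

```python
DILUTION_FACTOR = 200  # from the Figure 4 caption


def undiluted_equivalent(copies_per_ul, dilution_factor=DILUTION_FACTOR):
    """Scale a diluted-sample reading back to the undiluted input scale."""
    return copies_per_ul * dilution_factor


def high_to_low_ratio(high_copies_per_ul, low_copies_per_ul):
    """Within-sample ratio of a high-copy marker to a low-copy marker."""
    return high_copies_per_ul / low_copies_per_ul


# Hypothetical ddPCR readings (copies per ul) for one diluted mtDNA sample.
mtdna_high = 600.0
mtdna_low = 400.0

ratio_diluted = high_to_low_ratio(mtdna_high, mtdna_low)
ratio_rescaled = high_to_low_ratio(
    undiluted_equivalent(mtdna_high), undiluted_equivalent(mtdna_low)
)

print(undiluted_equivalent(mtdna_high))  # reading on the undiluted scale
print(ratio_diluted, ratio_rescaled)     # the ratio does not change with dilution
```
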