Shuqing Xu, Jessica Stapley, Saskia Gablenz, Justin Boyer, Klaus J Appenroth, K Sowjanya Sree, Jonathan Gershenzon, Alex Widmer, Meret Huber.
Abstract
Mutation rate and effective population size (Ne) jointly determine intraspecific genetic diversity, but the role of mutation rate is often ignored. Here we investigate genetic diversity, spontaneous mutation rate and Ne in the giant duckweed (Spirodela polyrhiza). Despite its large census population size, whole-genome sequencing of 68 globally sampled individuals reveals extremely low intraspecific genetic diversity. Assessed under natural conditions, the genome-wide spontaneous mutation rate is at least seven times lower than estimates made for other multicellular eukaryotes, whereas Ne is large. These results demonstrate that low genetic diversity can be associated with large-Ne species, where selection can reduce mutation rates to very low levels. This study also highlights that accurate estimates of mutation rate can help to explain seemingly unexpected patterns of genome-wide variation.
Year: 2019 PMID: 30886148 PMCID: PMC6423293 DOI: 10.1038/s41467-019-09235-5
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Nucleotide diversity, population structure and linkage disequilibrium in S. polyrhiza. a Geographic distribution of the 68 sequenced samples, colored according to population structure. The insert at the lower left corner shows the results from the STRUCTURE analysis using genome-wide polymorphisms. Each colored line refers to an individual and the Y-axis refers to the likelihood of membership to each cluster. Genome-wide πₛ refers to average pairwise nucleotide diversity at synonymous sites. SE: Southeast. b Principal coordinate analysis (PCA) based on genome-wide nucleotide diversity data. Average pairwise nucleotide diversity (π) calculated from all sites is shown for each population. c Decay of linkage disequilibrium (LD) with physical distance in four populations. The dashed line indicates an LD value of r² = 0.33. Data are deposited in figshare[58].
Summary of the sequencing data and detected mutations
| Sample ID | Treatment | # Mutations | Callable sites (Mb) |
|---|---|---|---|
| A | Indoor | 0 | 126.4 |
| E | Indoor | 0 | 125.7 |
| I | Indoor | 0 | 126.0 |
| J | Indoor | 0 | 126.4 |
| N | Indoor | 0 | 125.9 |
| B | Outdoor-noUV | 0 | 126.1 |
| G | Outdoor-noUV | 1 | 126.3 |
| K | Outdoor-noUV | 0 | 124.2 |
| O | Outdoor-noUV | 0 | 125.9 |
| P | Outdoor-noUV | 0 | 126.4 |
| C | Outdoor-UV | 1 | 125.6 |
| D | Outdoor-UV | 1 | 126.3 |
| L | Outdoor-UV | 1 | 126.3 |
| M | Outdoor-UV | 0 | 126.3 |
| Q | Outdoor-UV | 0 | 126.0 |
Each row shows the sample information and number of verified mutations. Effective (callable) sites are estimated as the total number of sites with sufficient coverage for finding de novo variants using our pipeline. The mutation rate is calculated as μ = (number of mutations / sum of effective sites) / number of generations. The average mutation rates (95% confidence interval) for samples grown under indoor, outdoor-noUV and outdoor-UV conditions are: <7.92 × 10⁻¹¹ (NA), 7.92 × 10⁻¹¹ (2.07 × 10⁻¹¹ to 3.98 × 10⁻¹⁰), and 2.38 × 10⁻¹⁰ (4.76 × 10⁻¹¹ to 7.30 × 10⁻¹⁰), respectively. The 95% confidence intervals were calculated based on the assumption that the number of mutations is Poisson distributed.
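The per-treatment calculation can be sketched in a few lines of Python, pooling the mutation counts and callable sites from the table for each treatment. This is an illustration only, not the authors' pipeline. The number of generations is not stated in this excerpt: the value of 20 used here is an assumption, chosen because it gives pooled rates close to the reported ones. The names `SAMPLES`, `N_GENERATIONS` and `mutation_rate` are hypothetical.

```python
# (verified mutations, callable sites in Mb) per sample, copied from the table.
SAMPLES = {
    "Indoor": [(0, 126.4), (0, 125.7), (0, 126.0), (0, 126.4), (0, 125.9)],
    "Outdoor-noUV": [(0, 126.1), (1, 126.3), (0, 124.2), (0, 125.9), (0, 126.4)],
    "Outdoor-UV": [(1, 125.6), (1, 126.3), (1, 126.3), (0, 126.3), (0, 126.0)],
}

# ASSUMPTION: the generation count is not given in this excerpt; 20 is used
# only because it yields rates close to those reported in the table note.
N_GENERATIONS = 20


def mutation_rate(samples, n_generations=N_GENERATIONS):
    """Return mu = (mutations / sum of callable sites) / generations."""
    total_mutations = sum(count for count, _ in samples)
    total_sites = sum(mb for _, mb in samples) * 1e6  # Mb -> bp
    return total_mutations / total_sites / n_generations


for treatment, samples in SAMPLES.items():
    print(f"{treatment}: {mutation_rate(samples):.2e} per site per generation")
```

With zero observed mutations the indoor point estimate is 0, which is why the table note reports that treatment only as an upper bound. The sketch does not attempt to reproduce the 95% confidence intervals, because the exact interval method is not described in this excerpt.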
Fig. 2 Estimated mutation rates in protein-coding regions among different organisms. The violin plots of log₁₀-transformed numbers of mutations per base pair of protein-coding genome sequences (CDS) per generation for eubacteria, unicellular eukaryotes and multicellular eukaryotes, respectively. The kernel probability density is shown. Each circle indicates the estimate for one species. The arrow highlights the mutation rate in S. polyrhiza. Except for the mutation rate in S. polyrhiza, the plotted data were extracted from previous studies (Supplementary Data 3).