Thomas J Y Kono, Fengli Fu, Mohsen Mohammadi, Paul J Hoffman, Chaochih Liu, Robert M Stupar, Kevin P Smith, Peter Tiffin, Justin C Fay, Peter L Morrell.
Abstract
Populations continually incur new mutations with fitness effects ranging from lethal to adaptive. While the distribution of fitness effects of new mutations is not directly observable, many mutations likely either have no effect on organismal fitness or are deleterious. Historically, it has been hypothesized that a population may carry many mildly deleterious variants as segregating variation, which reduces the mean absolute fitness of the population. Recent advances in sequencing technology and sequence conservation-based metrics for inferring the functional effect of a variant permit examination of the persistence of deleterious variants in populations. The issue of segregating deleterious variation is particularly important for crop improvement, because the demographic history of domestication and breeding allows deleterious variants to persist and reach moderate frequency, potentially reducing crop productivity. In this study, we use exome resequencing of 15 barley accessions and genome resequencing of 8 soybean accessions to investigate the prevalence of deleterious single nucleotide polymorphisms (SNPs) in the protein-coding regions of the genomes of two crops. We conclude that individual cultivars carry hundreds of deleterious SNPs on average, and that nonsense variants make up a minority of deleterious SNPs. Our approach identifies known phenotype-altering variants as deleterious more frequently than the genome-wide average, suggesting that putatively deleterious variants are likely to affect phenotypic variation. We also report the implementation of a SNP annotation tool, BAD_Mutations, that makes use of a likelihood ratio test based on alignment of all currently publicly available Angiosperm genomes.
Keywords: bioinformatics; crops; deleterious mutations; resequencing
Year: 2016 PMID: 27301592 PMCID: PMC4989107 DOI: 10.1093/molbev/msw102
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Mean Numbers of SNPs in Various Classes.
| Species | Diff. from Ref. | Noncoding | Syn. | Nonsyn. | Nonsense |
|---|---|---|---|---|---|
| Barley | 162,954 (51,231.34) | 115,456 (41,065.22) | 15,591 (5,691.81) | 12,351 (4,492.53) | 77 (33.13) |
| Soybean | 82,840 (56,780.03) | 44,704 (29,477.65) | 14,167 (8,161.21) | 18,695 (11,289.72) | 540 (345.05) |
Syn., synonymous; Nonsyn., nonsynonymous. Values are means, with standard deviations in parentheses.
Fig. 1. Derived allele (unfolded) frequency spectra for coding regions showing deleterious, tolerated, and synonymous SNPs for barley and soybean. Ancestral state was inferred as described in the Methods. A variant was called “Deleterious” if it was nonsynonymous and predicted to be deleterious by SIFT, PolyPhen2, and the LRT. (A) is based on 13 domesticated and 2 wild barley accessions, while (B) is based on 7 cultivated soybean accessions and 1 wild soybean accession.
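To make the idea of an unfolded spectrum concrete, the following minimal Python sketch tabulates derived allele counts once an ancestral state is available for each SNP. The function name `derived_allele_spectrum`, the input layout, and the toy genotypes are assumptions for illustration only; this is not the pipeline used in the study.

```python
from collections import Counter

def derived_allele_spectrum(snps, n_samples):
    """Count SNPs by number of accessions carrying the derived allele.

    snps: iterable of (ancestral_allele, calls) pairs, where calls is a
          list with one allele per accession and "N" marks missing data.
          This layout is a hypothetical simplification (one call per
          inbred accession, biallelic sites).
    Returns a list whose element k-1 is the number of SNPs with k
    derived copies, for k = 1 .. n_samples - 1.
    """
    spectrum = Counter()
    for ancestral, calls in snps:
        called = [allele for allele in calls if allele != "N"]
        # Skip sites with missing calls so every site has the same sample size.
        if len(called) != n_samples:
            continue
        # Any allele that differs from the ancestral state counts as derived.
        derived = sum(1 for allele in called if allele != ancestral)
        # Keep only sites that are polymorphic within the sample.
        if 0 < derived < n_samples:
            spectrum[derived] += 1
    return [spectrum.get(k, 0) for k in range(1, n_samples)]

# Toy data: five sites scored in four accessions.
toy_snps = [
    ("A", ["A", "A", "G", "A"]),  # 1 derived copy
    ("C", ["T", "T", "C", "T"]),  # 3 derived copies
    ("G", ["G", "N", "A", "G"]),  # missing call, skipped
    ("T", ["T", "T", "T", "T"]),  # monomorphic, skipped
    ("A", ["G", "A", "A", "A"]),  # 1 derived copy
]
sfs = derived_allele_spectrum(toy_snps, 4)
print(sfs)
```

Running the spectrum separately for deleterious, tolerated, and synonymous SNPs would give the three series compared in a figure of this kind.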
Mean Counts of Nonsynonymous Variants That Are Predicted to Be Deleterious by Three Prediction Methods.
| Species | SIFT | PPH | LRT | Intersect |
|---|---|---|---|---|
| Barley | 3,400 (0.192) | 3,295 (0.186) | 3,221 (0.183) | 1,006 (0.057) |
| Soybean | 1,972 (0.064) | 3,881 (0.126) | 3,135 (0.101) | 784 (0.025) |
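PPH, PolyPhen2; LRT, likelihood ratio test; Intersect, predicted to be deleterious by all three methods. Numbers in parentheses are proportions of all nonsynonymous variants in each sample that are predicted to be deleterious.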
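The relationship between the per-method counts and the intersect column can be sketched in a few lines of Python. The function name `summarize_predictions`, the SNP identifiers, and the toy totals below are hypothetical; the sketch only shows the set logic and the proportion calculation, not how SIFT, PolyPhen2, or the LRT produce their calls.

```python
def summarize_predictions(sift, pph, lrt, n_nonsyn):
    """Intersect three sets of deleterious calls and report proportions.

    sift, pph, lrt: collections of SNP identifiers called deleterious by
                    each method (identifiers here are hypothetical).
    n_nonsyn:       total number of nonsynonymous SNPs in the sample.
    """
    sift, pph, lrt = set(sift), set(pph), set(lrt)
    # A SNP is in the intersect only if all three methods agree.
    consensus = sift & pph & lrt
    proportions = {
        "SIFT": len(sift) / n_nonsyn,
        "PPH": len(pph) / n_nonsyn,
        "LRT": len(lrt) / n_nonsyn,
        "Intersect": len(consensus) / n_nonsyn,
    }
    return consensus, proportions

# Toy example: 10 nonsynonymous SNPs, three overlapping call sets.
consensus, proportions = summarize_predictions(
    sift={"snp1", "snp2", "snp3"},
    pph={"snp2", "snp3", "snp4"},
    lrt={"snp2", "snp3", "snp5"},
    n_nonsyn=10,
)
print(sorted(consensus), proportions)
```

Because the intersect requires agreement among all methods, its count can never exceed the smallest per-method count, which matches the pattern in the table.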
Fig. 2. Derived allele (unfolded) frequency spectra for SNPs in (A) barley and (B) soybean predicted to be deleterious by one, two, or three prediction approaches. SNPs predicted by only one approach are not as strongly skewed toward rare variants, suggesting that the intersection of multiple prediction approaches gives the most reliable prediction of deleterious variants.
Fig. 3. Comparison between recombination rate, CDS diversity, and proportion of nonsynonymous SNPs inferred to be deleterious in soybean on chromosome 1. The top panel shows the proportion of nonsynonymous SNPs that were inferred to be deleterious, in windows defined by genetic map distance (Lee et al. 2015). The bottom panel shows recombination rate in cM/Mb (black line) and average pairwise nucleotide sequence diversity per kilobase in CDS (blue line). Dashed vertical lines represent the boundaries of the annotated pericentromeric region, which has much lower recombination rates than the euchromatic regions.
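A windowed proportion like the one in the top panel can be computed along these lines. This is a minimal sketch: the function name `windowed_deleterious_proportion`, the fixed 5 cM window width, and the toy positions are assumptions, since the caption does not state how the windows were sized.

```python
from collections import defaultdict

def windowed_deleterious_proportion(snps, window_cm):
    """Proportion of nonsynonymous SNPs called deleterious per window.

    snps:      iterable of (genetic_position_cM, is_deleterious) pairs
               for nonsynonymous SNPs on one chromosome.
    window_cm: window width in centimorgans (an assumed parameter).
    Returns a dict mapping each window's start position (cM) to the
    deleterious proportion; windows without SNPs are omitted.
    """
    totals = defaultdict(int)
    deleterious = defaultdict(int)
    for position_cm, is_deleterious in snps:
        # Assign the SNP to a window by integer division of its map position.
        index = int(position_cm // window_cm)
        totals[index] += 1
        if is_deleterious:
            deleterious[index] += 1
    return {
        index * window_cm: deleterious[index] / totals[index]
        for index in sorted(totals)
    }

# Toy data: six nonsynonymous SNPs with genetic positions in cM.
toy_snps = [
    (0.5, True), (1.2, False), (4.9, False),
    (5.1, True), (7.0, True), (12.3, False),
]
props = windowed_deleterious_proportion(toy_snps, 5.0)
print(props)
```

Defining windows on the genetic map rather than on physical position means that low-recombination regions, such as the pericentromere, are covered by fewer windows that each span more physical sequence.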