Zackery E Plyler1, Aubrey E Hill2, Christopher W McAtee3, Xiangqin Cui4, Leah A Moseley3, Eric J Sorscher5.
Abstract
In this study, we show novel DNA motifs that promote single nucleotide polymorphism (SNP) formation and are conserved among exons, introns, and intergenic DNA from mice (Sanger Mouse Genomes Project), human genes (1000 Genomes), and tumor-specific somatic mutations (data from TCGA). We further characterize SNPs likely to be very recent in origin (i.e., formed in otherwise congenic mice) and show enrichment for both synonymous and parallel DNA variants occurring under circumstances not attributable to purifying selection. The findings provide insight regarding SNP contextual bias and eukaryotic codon usage as strategies that favor long-term exonic stability. The study also furnishes new information concerning rates of murine genomic evolution and features of DNA mutagenesis (at the time of SNP formation) that should be viewed as "adaptive."
Keywords: SNP formation bias; mutation; parallel evolution
Year: 2015 PMID: 26253317 PMCID: PMC4607513 DOI: 10.1093/gbe/evv150
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Figure. Base frequency bias of murine A⇔G SNPs. (A) Base location frequency bias (Bias %) immediately surrounding (±4 bp) 20,603 exonic A⇔G homozygous SNPs, compiled from coding regions of murine chromosomes 1–8. Bias % was calculated as described in Materials and Methods. (B) Genomic base frequency bias (±4 bp) relative to 20,000 randomly chosen (not SNP-associated) adenine (A) and guanine (G) nucleotides within murine exons (chromosomes 1–4) studied in a fashion otherwise identical to Panel (A). (C) Base location frequency bias immediately surrounding (±4 bp) 50,244 A⇔G SNPs compiled from intronic regions of murine chromosomes 1–3. (D) Genomic base frequency bias relative to 50,000 randomly chosen “A” or “G” sites within murine introns. (E) Base location frequency bias relative to 67,663 A⇔G SNPs compiled from intergenic regions on murine chromosomes 1–3. (F) Genomic base frequency bias relative to 50,000 randomly chosen “A” or “G” sites from intergenic regions of murine chromosome 2. In all cases, standard deviation (as judged by bootstrap analysis) was very low (on the order of ∼0.1–0.3%).
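The Materials and Methods that define Bias % are not reproduced in this record, so the following is only a minimal sketch of how positional base frequencies around a set of sites (±4 bp) might be tabulated. The function names, the window representation, and the bias definition (observed percentage minus a supplied background percentage) are assumptions made for illustration, not the authors' procedure.

```python
from collections import Counter

BASES = "ACGT"

def positional_percentages(windows, flank=4):
    """Percent of each base at every offset (-flank..+flank, excluding 0).

    `windows` is an iterable of strings of length 2*flank+1, each centered
    on the site of interest (hypothetical input format).
    """
    width = 2 * flank + 1
    counts = {off: Counter() for off in range(-flank, flank + 1) if off != 0}
    for window in windows:
        window = window.upper()
        if len(window) != width:
            raise ValueError(f"expected window of length {width}")
        for off in counts:
            base = window[flank + off]
            if base in BASES:  # skip ambiguous symbols such as N
                counts[off][base] += 1
    result = {}
    for off, counter in counts.items():
        total = sum(counter.values())
        result[off] = {
            b: (100.0 * counter[b] / total if total else 0.0) for b in BASES
        }
    return result

def assumed_bias(percentages, background):
    """ASSUMED bias: observed percent minus background percent per base."""
    return {
        off: {b: pct[b] - background[b] for b in BASES}
        for off, pct in percentages.items()
    }

# Toy windows, not data from the study.
toy = ["ACGTAACGT", "ACGTGACGT"]
pct = positional_percentages(toy)
bias = assumed_bias(pct, {b: 25.0 for b in BASES})
```

Running the same tabulation on windows centered on randomly chosen A or G sites would give the kind of control comparison described for panels (B), (D), and (F).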
Overrepresented (Permissive) 5′-XC|XX-3′ Quartets Surrounding A⇔G Transition SNPs
Note.—(A) Statistically overrepresented dinucleotide quartets surrounding 20,603 A⇔G coding (exonic) SNPs from murine chromosomes 1–8 (vertical line in each quartet indicates position of polymorphic base). Dinucleotide quartets were defined as two base pairs upstream and two downstream of a polymorphic site or other nucleotide position being evaluated. Rankings that strongly overlap between exonic murine SNPs and other murine and human SNP categories (quartets from among 256 possibilities significantly overrepresented in three of four murine and human DNA compartments) are indicated by yellow highlight; (B) same analysis for 50,244 intronic A⇔G SNPs from murine chromosomes 1–3; (C) findings for 67,663 intergenic A⇔G SNPs for murine chromosomes 1–3; (D) findings for 6,419 A⇔G SNPs from introns of 22 human genes. “—,” nonoverlapping in range shown. Note increased incidence of 5′ cytosine and 3′ thymine in quartet motifs predictive of SNP location.
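As an illustration of the quartet definition in the note above (two bases upstream and two downstream of the evaluated position, giving 4^4 = 256 possible contexts), a minimal sketch follows. The function names and the 0-based coordinate convention are assumptions; the statistical test used to call a quartet overrepresented is not shown here.

```python
from collections import Counter
from typing import Iterable, Optional

def quartet_at(seq: str, pos: int) -> Optional[str]:
    """Return the dinucleotide quartet around seq[pos] as 'XX|XX'.

    The '|' marks the polymorphic (or evaluated) position, which is itself
    excluded. Returns None when fewer than two flanking bases exist.
    """
    if pos < 2 or pos + 2 >= len(seq):
        return None
    upstream = seq[pos - 2:pos].upper()
    downstream = seq[pos + 1:pos + 3].upper()
    return upstream + "|" + downstream

def count_quartets(seq: str, positions: Iterable[int]) -> Counter:
    """Tally quartets over a list of 0-based positions (hypothetical input)."""
    counts = Counter()
    for p in positions:
        q = quartet_at(seq, p)
        if q is not None:
            counts[q] += 1
    return counts

# Toy sequence, not data from the study.
toy_seq = "TTCAGTACGGT"
toy_counts = count_quartets(toy_seq, [0, 3, 6])
```

Comparing such counts at SNP positions against counts at non-SNP A or G positions is one way to ask which contexts are enriched.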
Nonpermissive 5′-XC|XX-3′ Quartets Surrounding A⇔G Transition SNPs
Note.—Nonpermissive 5′-XC|XX-3′ quartets in the setting of (A) 20,603 exonic (chromosomes 1–8), (B) 50,244 intronic (chromosomes 1–4), and (C) 67,663 intergenic (chromosomes 1–4) A⇔G homozygous SNPs. Vertical line in each quartet indicates position of polymorphic base. Yellow highlight indicates nonpermissiveness for SNP formation in at least two of the three regions shown in (A)–(C). Numerous contexts with cytosine immediately 5′ do not predict the location of an A⇔G SNP. All quartet P values are greater than 0.05 (Compare with table 1).
Figure. SNP-permissive and shielding quartets in human ORFs. (A) Most frequent SNP-permissive (yellow) and SNP-shielding (red) A⇔G quartet contexts within the cDNA of CFTR. (B) Same analysis for cDNA of dystrophin. A statistical (chi square 2 × 2 contingency table) analysis indicated overrepresentation of shielding dinucleotide quartets in both genes (P < 0.00001), and underrepresentation of quartets permissive for SNP formation (P < 0.00001).
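For readers unfamiliar with the test named in the caption, a chi-square test on a 2 × 2 contingency table can be sketched as below. This is the textbook Pearson statistic without continuity correction; whether the authors applied a correction is not stated here, and the layout of the table (for example, shielding versus other quartets, observed versus expected) is an assumption. The counts in the example are invented.

```python
import math

def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (1 degree of freedom) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). No Yates continuity correction is applied.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be nonzero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 df, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Invented counts, purely to show the call.
stat, p = chi_square_2x2(10, 20, 30, 40)
```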
Comparison of Transition:Transversion and Nonsynonymous:Synonymous SNP Frequencies in Homozygous (Homosite) versus Heterozygous (Heterosite) Murine SNPs
| SNP Category | Percent Transition | Percent Transversion | Nonsynonymous (NS) Coding SNPs | Synonymous (S) Coding SNPs | NS:S (P value)ᵃ |
|---|---|---|---|---|---|
| Homozygous | 77.7 | 22.3 | 708 | 1,245 | 1:1.76 (1.65E-14) |
| Heterozygous | 66.3 | 33.7 | 272 | 425 | 1:1.56 (2.64E-11) |
Note.—Murine homosites from chromosomes 1–3 with at least four lines exhibiting the minor allele are shown.
ᵃ P values represent observed SNP frequencies versus those that would be expected if SNPs formed stochastically (Materials and Methods).
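The NS:S ratios in the table follow directly from the listed counts, and the transition/transversion split rests on the standard purine/pyrimidine classification. A minimal sketch of both calculations is given below; the helper names are hypothetical, and the P values (which depend on the stochastic expectation described in Materials and Methods) are not recomputed.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("ref and alt must be two different bases from ACGT")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES

def ns_to_s(nonsynonymous: int, synonymous: int) -> str:
    """Express NS:S normalized to one nonsynonymous SNP, as in the table."""
    return f"1:{synonymous / nonsynonymous:.2f}"

# Counts taken from the table above.
homozygous_ratio = ns_to_s(708, 1245)
heterozygous_ratio = ns_to_s(272, 425)
```

With the table's counts this reproduces 1:1.76 for homozygous and 1:1.56 for heterozygous SNPs.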