Juraj Bergman, Andrea J Betancourt, Claus Vogl.
Abstract
In many organisms, local deviations from Chargaff's second parity rule are observed around replication and transcription start sites and within intron sequences. Here, we use expression data as well as a whole-genome data set of nearly 200 haplotypes to investigate such compositional skews in Drosophila melanogaster genes. We find a positive correlation between compositional skew and gene expression, comparable in strength to similar correlations between expression levels and genome-wide sequence features. This correlation is stronger for germline than for somatic expression, consistent with the process of transcription-associated mutation bias. We also inferred mutation rates from alleles segregating at low frequencies in short introns, and show that, whereas the overall GC content of short introns does not conform to the equilibrium expectation, the level of the observed deviation from the second parity rule is generally consistent with the inferred rates.
Keywords: Chargaff’s second parity rule; base composition evolution; compositional skew; transcription-associated mutation bias
Year: 2018 PMID: 29036491 PMCID: PMC5786239 DOI: 10.1093/gbe/evx200
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 4.065
Fig. 1. Pearson’s coefficients (with 95% CIs) for the correlations between compositional skew and gene expression across different tissues and developmental stages. (A–C) Correlation of CG skew and gene expression. (D–F) Correlation of TA skew and gene expression. Although 0- to 2-h expression in embryos most likely reflects maternal transcription, which should not necessarily affect germline development, the correlation between maternal expression and later putative zygotic expression (2–4 h) is strong (Spearman’s ρ = 0.937, P < 0.001).
Skew Values with Their Corresponding 95% CIs (in Square Brackets) Calculated from Concatenating Introns in Genes with the 10% Highest and 10% Lowest Expression in Different Germline Tissues (Number of Genes in Each Category is n = 141)
| Ovary Expression, High | Ovary Expression, Low | Testes Expression, High | Testes Expression, Low |
|---|---|---|---|
| 4.19 [3.24, 5.17] | 0.67 [−0.04, 1.36] | 4.33 [3.18, 5.39] | 0.55 [−0.08, 1.19] |
| 2.31 [1.54, 3.06] | −0.45 [−0.99, 0.15] | 2.58 [1.75, 3.39] | −0.16 [−0.68, 0.35] |
Mutation Rates q from Nucleotide i to j with the Corresponding 95% CIs, Estimated from the Coding Strand of Autosomal Short Introns
| Mutation (i → j) | q (F/M) | 95% CI |
|---|---|---|
| A → C | 0.0048 (308/64,053) | 0.0043–0.0053 |
| A → G | 0.0120 (776/64,521) | 0.0112–0.0128 |
| A → T | 0.0110 (709/64,454) | 0.0102–0.0118 |
| C → A | 0.0181 (615/33,886) | 0.0167–0.0195 |
| C → G | 0.0080 (270/33,541) | 0.0071–0.0089 |
| C → T | 0.0293 (1,008/34,359) | 0.0276–0.0310 |
| G → A | 0.0324 (957/29,556) | 0.0304–0.0344 |
| G → C | 0.0089 (256/28,885) | 0.0079–0.0099 |
| G → T | 0.0175 (508/29,107) | 0.0161–0.0190 |
| T → A | 0.0100 (667/66,779) | 0.0093–0.0107 |
| T → C | 0.0113 (753/66,885) | 0.0105–0.0121 |
| T → G | 0.0047 (314/66,446) | 0.0042–0.0052 |
Note: Each rate is given as q = F/M, where F is the frequency of singletons of type j with major allele i, and M is the sum of the frequency of sites fixed for nucleotide i and F.
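As a rough illustration of how one row of the mutation-rate table can be recomputed, the sketch below divides F by M and attaches a normal-approximation (Wald) 95% interval. The record does not say how the published intervals were obtained, so the interval formula is an assumption, and `rate_with_ci` is a hypothetical helper name. For the A → C row it gives values close to the tabulated ones, but other rows may differ in the last digit.

```python
import math

def rate_with_ci(f, m, z=1.96):
    """Return (q, lower, upper) for q = f / m.

    Assumption: a normal-approximation (Wald) binomial interval;
    the source does not state which interval method was used.
    """
    q = f / m
    half_width = z * math.sqrt(q * (1.0 - q) / m)
    return q, q - half_width, q + half_width

# Counts taken from the A -> C row of the table: F = 308, M = 64,053.
q, lower, upper = rate_with_ci(308, 64053)
print(f"q = {q:.4f}, 95% CI = {lower:.4f} to {upper:.4f}")
```

An exact (for example Clopper-Pearson) interval would be a reasonable alternative for rows with small counts.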
Fig. 2. Distributions of skew estimates and nucleotide content obtained from 10,000 independent parameter search runs, conditional on the observed compositional skew in autosomal short introns and the 95% CIs of mutation rates in table 1. (A and B) Distributions of CG and TA skew, respectively; the red dashed line is the observed skew level. (C) The distribution of G (red) and C (black) content; the dashed lines are the observed values. (D) The distribution of A (red) and T (black) content; the dashed lines are the observed values.
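To make the idea of an equilibrium expectation concrete, the sketch below computes the stationary base composition implied by the point estimates in the mutation-rate table, and the skews that follow from it. This is not the parameter search described in the caption above, which the record does not specify. It is a simpler calculation under stated assumptions: the tabulated q values are treated as relative per-site mutation rates, selection and biased gene conversion are ignored, and skew is taken as (x − y)/(x + y) with C before G and T before A. The names `RATES`, `equilibrium`, and `skew` are hypothetical.

```python
BASES = "ACGT"

# Point estimates from the mutation-rate table: RATES[i][j] is the rate i -> j.
RATES = {
    "A": {"C": 0.0048, "G": 0.0120, "T": 0.0110},
    "C": {"A": 0.0181, "G": 0.0080, "T": 0.0293},
    "G": {"A": 0.0324, "C": 0.0089, "T": 0.0175},
    "T": {"A": 0.0100, "C": 0.0113, "G": 0.0047},
}

def equilibrium(rates, steps=20000):
    """Iterate pi <- pi (I + Q) from a uniform start until it settles.

    Q is the rate matrix built from `rates`; because every rate is small,
    I + Q is a valid transition matrix and the iteration converges to the
    stationary composition.
    """
    pi = {b: 0.25 for b in BASES}
    for _ in range(steps):
        new = {}
        for j in BASES:
            stay = pi[j] * (1.0 - sum(rates[j].values()))
            arrive = sum(pi[i] * rates[i][j] for i in BASES if i != j)
            new[j] = stay + arrive
        pi = new
    return pi

def skew(x, y):
    """Assumed sign convention: (x - y) / (x + y)."""
    return (x - y) / (x + y)

pi = equilibrium(RATES)
cg_skew = skew(pi["C"], pi["G"])
ta_skew = skew(pi["T"], pi["A"])
print({b: round(p, 3) for b, p in pi.items()})
print("CG skew:", round(cg_skew, 3), "TA skew:", round(ta_skew, 3))
```

Because the rates out of C and G exceed those out of A and T, the stationary composition is AT-rich. Comparing it with observed base frequencies is the kind of check the abstract alludes to.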