Jose L Campos, Kai Zeng, Darren J Parker, Brian Charlesworth, Penelope R Haddrill.
Abstract
Codon usage bias (CUB) in Drosophila is higher for X-linked genes than for autosomal genes. One possible explanation is that the higher effective recombination rate for genes on the X chromosome compared with the autosomes reduces their susceptibility to Hill-Robertson effects, and thus enhances the efficacy of selection on codon usage. The genome sequence of D. melanogaster was used to test this hypothesis. Contrary to expectation, it was found that, after correcting for the effective recombination rate, CUB remained higher on the X than on the autosomes. In contrast, an analysis of polymorphism data from a Rwandan population showed that mean nucleotide site diversity at 4-fold degenerate sites for genes on the X is approximately three-quarters of the autosomal value after correcting for the effective recombination rate, compared with approximate equality before correction. In addition, these data show that selection for preferred versus unpreferred synonymous variants is stronger on the X than the autosomes, which accounts for the higher CUB of genes on the X chromosome. This difference in the strength of selection does not appear to reflect the effects of dominance of mutations affecting codon usage, differences in gene expression levels between X and autosomes, or differences in mutational bias. Its cause therefore remains unexplained. The stronger selection on CUB on the X chromosome leads to a lower rate of synonymous site divergence compared with the autosomes; this will cause a stronger upward bias for X than A in estimates of the proportion of nonsynonymous mutations fixed by positive selection, for methods based on the McDonald-Kreitman test.
Year: 2012 PMID: 23204387 PMCID: PMC3603305 DOI: 10.1093/molbev/mss222
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Variables Analyzed for the Full and Overlap Region Data Sets.
| Variable | X | A | P |
|---|---|---|---|
| No. of genes | 1,545 | 7,679 | |
| Rec. | 2.08 (2.05–2.11) | 1.39 (1.37–1.40) | |
| | 0.551 (0.546–0.555) | 0.518 (0.516–0.520) | |
| | 0.688 (0.683–0.692) | 0.641 (0.639–0.643) | |
| | 0.393 (0.387–0.400) | 0.352 (0.349–0.355) | |
| | 0.00130 (0.00122–0.00137) | 0.00162 (0.00157–0.00166) | |
| π4 | 0.0152 (0.0147–0.0157) | 0.0159 (0.0156–0.0162) | 0.675 |
| π4 corrected | 0.0203 (0.0196–0.021) | 0.0159 (0.0156–0.0162) | |
| | 0.040 (0.037–0.042) | 0.038 (0.037–0.039) | 0.069 |
| | 0.240 (0.236–0.244) | 0.248 (0.246–0.250) | |
| Overall exp. | 9.90 (9.80–10.0) | 9.78 (9.73–9.83) | 0.206 |
| Female exp. | 9.09 (8.90–9.27) | 8.30 (8.21–8.39) | |
| Male exp. | 9.45 (9.33–9.58) | 9.50 (9.44–9.56) | 0.204 |
| CDS length | 538 (514–563) | 493 (484–502) | |
| | oX | oA | |
| No. of genes | 569 | 6,035 | |
| Rec. | 1.61 (1.58–1.63) | 1.61 (1.60–1.62) | 0.606 |
| | 0.558 (0.551–0.566) | 0.519 (0.516–0.521) | |
| | 0.698 (0.690–0.705) | 0.642 (0.640–0.644) | |
| | 0.418 (0.408–0.430) | 0.351 (0.348–0.354) | |
| | 0.00123 (0.0011–0.00136) | 0.00177 (0.00172–0.00182) | |
| π4 | 0.0129 (0.0121–0.0135) | 0.0181 (0.0178–0.0184) | |
| π4 corrected | 0.0171 (0.0163–0.0180) | 0.0181 (0.0178–0.0184) | 0.061 |
| | 0.041 (0.037–0.044) | 0.038 (0.037–0.039) | |
| | 0.238 (0.231–0.244) | 0.248 (0.246–0.250) | |
| Overall exp. | 9.88 (9.70–10.04) | 9.78 (9.72–9.84) | 0.508 |
| Female exp. | 9.14 (8.86–9.40) | 8.28 (8.19–8.39) | |
| Male exp. | 9.32 (9.09–9.52) | 9.48 (9.41–9.55) | |
| CDS length | 541 (503–575) | 498 (488–509) | |
Note.—For each variable, we report the mean with 95% CIs in parentheses. We examined four regions: X, A, oX, and oA (oX and oA are the X-linked and autosomal genes in the overlap region of effective recombination rates). P, adjusted P value of the Mann–Whitney U test for differences between X and A (italicized values show significant results, P < 0.05); π4 corrected, raw π4 values for the X multiplied by 4/3; Rec, effective recombination rate (cM/Mb multiplied by 2/3 for the X and by 1/2 for A); GC3, GC content of third codon positions; GCI, GC content of short introns (<80 bp); Exp., gene expression measured as log2(mean RPKM + 1); CDS length, coding sequence length in number of amino acids.
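The three transformations described in the note can be sketched in a few lines of Python. This is only an illustration of the stated arithmetic, not the authors' pipeline; the function names and the `"X"`/`"A"` chromosome codes are assumptions introduced here.

```python
from math import log2

# Scaling factors taken from the table note; the dictionary name is hypothetical.
REC_SCALE = {"X": 2 / 3, "A": 1 / 2}


def corrected_pi4(pi4, chrom):
    """Multiply X-linked 4-fold diversity by 4/3; leave autosomal values as they are."""
    return pi4 * 4 / 3 if chrom == "X" else pi4


def effective_rec(rate_cm_per_mb, chrom):
    """Convert a recombination rate (cM/Mb) into the effective rate for X or A."""
    return rate_cm_per_mb * REC_SCALE[chrom]


def expression_level(mean_rpkm):
    """Expression measure used in the tables: log2(mean RPKM + 1)."""
    return log2(mean_rpkm + 1)
```

Because the correction is a simple rescaling, applying `corrected_pi4` to the reported X mean of 0.0152 gives roughly 0.0203, which agrees with the corrected value listed for the full data set.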
Variables Analyzed for the Three Subsets of the Overlap Regions with Respect to Recombination Rate: Low (1.00–1.40 cM/Mb), Intermediate (1.40–1.75 cM/Mb), and High (1.75–2.10 cM/Mb).
| Variable | Low oX | Low oA | P |
|---|---|---|---|
| No. of genes | 167 | 1,089 | |
| Rec. | 1.21 (1.20–1.23) | 1.24 (1.23–1.24) | 0.133 |
| | 0.596 (0.584–0.608) | 0.508 (0.502–0.513) | |
| | 0.741 (0.731–0.753) | 0.629 (0.623–0.635) | |
| | 0.477 (0.459–0.494) | 0.345 (0.338–0.353) | |
| | 0.00118 (0.00092–0.00139) | 0.00173 (0.00161–0.00185) | |
| π4 | 0.0103 (0.0092–0.0114) | 0.0153 (0.0147–0.0159) | |
| π4 corrected | 0.0137 (0.0123–0.0151) | 0.0153 (0.0147–0.0159) | 0.115 |
| | 0.039 (0.033–0.045) | 0.039 (0.037–0.042) | 0.504 |
| | 0.226 (0.215–0.237) | 0.249 (0.244–0.254) | |
| Overall exp. | 10.19 (9.90–10.50) | 9.71 (9.57–9.86) | |
| Female exp. | 9.70 (9.22–10.17) | 7.94 (7.70–8.19) | |
| Male exp. | 9.73 (9.40–10.04) | 9.22 (9.06–9.39) | 0.184 |
| CDS length | 548 (463–621) | 504 (477–532) | 0.270 |
| | Intermediate oX | Intermediate oA | |
| No. of genes | 193 | 3,195 | |
| Rec. | 1.58 (1.56–1.59) | 1.59 (1.59–1.59) | 0.162 |
| | 0.564 (0.554–0.575) | 0.527 (0.523–0.530) | |
| | 0.708 (0.698–0.719) | 0.652 (0.648–0.655) | |
| | 0.431 (0.415–0.444) | 0.357 (0.352–0.361) | |
| | 0.00116 (0.00095–0.00134) | 0.00172 (0.00165–0.00179) | |
| π4 | 0.0127 (0.0115–0.0137) | 0.0179 (0.0175–0.0183) | |
| π4 corrected | 0.0169 (0.0154–0.0184) | 0.0179 (0.0175–0.0183) | 0.298 |
| | 0.041 (0.035–0.046) | 0.037 (0.036–0.038) | 0.097 |
| | 0.245 (0.234–0.258) | 0.244 (0.241–0.247) | 0.853 |
| Overall exp. | 9.62 (9.33–9.91) | 9.77 (9.68–9.85) | 0.399 |
| Female exp. | 8.83 (8.38–9.28) | 8.33 (8.18–8.46) | 0.188 |
| Male exp. | 8.98 (8.60–9.39) | 9.49 (9.39–9.59) | |
| CDS length | 503 (454–549) | 500 (485–514) | 0.130 |
| | High oX | High oA | |
| No. of genes | 209 | 1,751 | |
| Rec. | 1.95 (1.94–1.97) | 1.88 (1.88–1.89) | |
| | 0.523 (0.509–0.536) | 0.511 (0.507–0.515) | 0.133 |
| | 0.653 (0.642–0.665) | 0.633 (0.628–0.637) | |
| | 0.352 (0.335–0.369) | 0.345 (0.341–0.351) | 0.342 |
| | 0.00133 (0.00111–0.00155) | 0.00188 (0.00178–0.00198) | |
| π4 | 0.0151 (0.0138–0.0162) | 0.0203 (0.0198–0.0208) | |
| π4 corrected | 0.0201 (0.0184–0.0216) | 0.0203 (0.0197–0.0209) | 0.908 |
| | 0.042 (0.036–0.048) | 0.040 (0.038–0.042) | 0.417 |
| | 0.240 (0.227–0.252) | 0.254 (0.250–0.258) | |
| Overall exp. | 9.87 (9.59–10.2) | 9.86 (9.75–9.97) | 0.997 |
| Female exp. | 8.97 (8.49–9.46) | 8.42 (8.23–8.60) | 0.069 |
| Male exp. | 9.30 (8.96–9.63) | 9.61 (9.48–9.75) | 0.096 |
| CDS length | 570 (503–634) | 490 (470–511) | |
Note.—P, adjusted P value of the Mann–Whitney U test for differences between X and A (italicized values show significant results, P < 0.05).
Pairwise relationships between several genomic variables. The variables considered are CUB (Fop), effective recombination rate (Rec), CDS length, overall gene expression, and GC content in short introns (GC). The relationships between these variables are investigated in four different data sets: oA, autosomal genes in the overlap region; oX, X-linked genes in the overlap region; A, autosomal genes in the full data set, which spans the full range of effective recombination rates; and X, X-linked genes in the full data set. We plot the Loess regression lines for each data set and pairwise comparison. We show the Spearman’s rank correlation coefficients and their significance (***P < 0.001; **P < 0.01; *P < 0.05).
Relationships between Pairs of Variables Affecting CUB.
| Pair of Variables | X | A | oX | oA | Correlates |
|---|---|---|---|---|---|
| | −0.077 ( | −0.009 (0.568) | −0.315 ( | −0.022 (0.127) | Exp., |
| | (−0.140/−0.015) | (−0.037/0.017) | (−0.411/−0.222) | (−0.052/0.012) | |
| | −0.303 ( | −0.027 (0.120) | −0.500 ( | −0.026 (0.168) | None |
| | (−0.362/−0.247) | (−0.053/−0.002) | (−0.582/−0.427) | (−0.055/0.005) | |
| | 0.260 ( | 0.273 ( | 0.150 ( | 0.269 ( | |
| | (0.200/0.322) | (0.247/0.298) | (0.044/0.244) | (0.241/0.299) | |
| | −0.273 ( | −0.171 ( | −0.269 ( | −0.164 ( | |
| | (−0.337/−0.217) | (−0.198/−0.144) | (−0.369/−0.175) | (−0.199/−0.133) | |
| | 0.242 ( | 0.310 ( | 0.235 ( | 0.298 ( | |
| | (0.180/0.303) | (0.284/0.337) | (0.143/0.340) | (0.266/0.325) | |
| Exp.∼ | 0.013 (0.68) | 0.007 (0.59) | 0.032 (0.53) | 0.015 (0.34) | |
| | (−0.050/0.077) | (−0.022/0.034) | (−0.072/0.126) | (−0.019/0.048) | |
Note.—Correlations among CUB (Fop), effective recombination rate (Rec), gene expression (Exp.), divergence levels (K0 and K4), and GC content in introns (GC). The covariates whose effects were controlled for are shown in the last column. We examined four regions: X, A, oX, and oA. Each entry gives the Spearman’s rank partial correlation coefficient with its significance level in parentheses (italicized values show significant results, P < 0.05); 95% CIs for the correlations are shown on the line below, also in parentheses.
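For readers unfamiliar with rank partial correlations, the following is a minimal sketch of a first-order Spearman partial correlation, which measures the association between two variables after controlling for a single covariate. It uses the textbook first-order partial correlation formula applied to rank correlations; the source does not say how its coefficients or CIs were computed, and it may control for several covariates at once, so the function names and the single-covariate restriction are assumptions made here.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def partial_spearman(x, y, z):
    """Rank correlation of x and y after controlling for one covariate z."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

In a setting like the table above, `x` and `y` would be per-gene values of the pair of variables (for example Fop and Rec) and `z` a per-gene covariate such as expression level.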
Effective recombination rate versus 4-fold synonymous diversity (π4) for the autosomes and 4-fold synonymous diversity multiplied by 4/3 (π4 corrected) for the X chromosome. Bold lines represent Loess regression lines, in green for the autosomal genes and in red for the X chromosome genes. Dashed lines represent the CIs for the lines. The two vertical lines indicate the lower and upper ends of the overlap region.
Estimates of selection, mutation, and demographic parameters for the overlap region.
| Model | Parameter Estimates | | | | | | | ln L |
|---|---|---|---|---|---|---|---|---|
| | 1.70 | 1.53 | 0.0045 | 3.91 | 0.79 | — | — | −2,366,568.26 |
| | 1.53 | 1.36 | 0.0042 | 3.33 | 0.75 | 4.00 | 0.02 | −2,365,196.24 |
| | — | 1.50 | 0.0012 | 4.31 | 1.11 | 5.57 | 2.46 | −2,365,654.57 |
| | 1.39 | — | 0.0043 | 3.37 | 0.67 | 5.11 | 0.01 | −2,366,051.67 |
Note.—$\gamma_A = 4N_e s_A$ and $\gamma_X = 4\lambda N_e s_X$, where $N_e$ and $\lambda N_e$ are the effective population sizes for autosomal and X-linked loci, respectively; $s_A$ and $s_X$ are the corresponding heterozygous selection coefficients.
Frequency spectra at polymorphic synonymous sites for the overlap regions of the X chromosome (oX) and the autosomes (oA).