Péter Szövényi, Kristian K Ullrich, Stefan A Rensing, Daniel Lang, Nico van Gessel, Hans K Stenøien, Elena Conti, Ralf Reski.
Abstract
A long-term reduction in effective population size will lead to a major shift in genome evolution. In particular, when effective population size is small, genetic drift becomes dominant over natural selection. The onset of self-fertilization is one evolutionary event that considerably reduces the effective size of populations. Theory predicts that this reduction should be more dramatic in organisms capable of haploid selfing than in those capable of diploid selfing. Although theoretically well grounded, this assertion has received mixed experimental support. Here, we test this hypothesis by analyzing synonymous codon usage bias of genes in the model moss Physcomitrella patens, which frequently undergoes haploid selfing. In line with population genetic theory, we found that the effect of natural selection on synonymous codon usage bias is very weak. Our conclusion is supported by four independent lines of evidence: 1) very weak or nonsignificant correlation between gene expression and codon usage bias, 2) no increased codon usage bias in more broadly expressed genes, 3) no evidence that codon usage bias constrains synonymous and nonsynonymous divergence, and 4) a predominant role of genetic drift in synonymous codon usage predicted by a model-based analysis. These findings show striking similarity to those observed in AT-rich genomes with weak selection for optimal codon usage and GC content overall. Our finding is in contrast to a previous study reporting adaptive codon usage bias in the moss P. patens.
Keywords: codon usage; effective population size; genetic drift; inbreeding; moss; natural selection
Year: 2017 PMID: 28549175 PMCID: PMC5507605 DOI: 10.1093/gbe/evx098
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Optimal Codons per Synonymous Codon Families Identified by Correlation Analysis between Overall Codon Usage Bias of Genes (ENC and ENC′) and the Frequency of Codons per Each Gene
| Correlation analysis | Correspondence analysis | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Using ENC | Using ENC′ | Microarray | RNA-seq | ||||||||||||
| Number of Corresponding tRNA Genes in | RGF | RGF Wobble | Amino Acids | Codons | Codons | RSCU (highly expressed) | RSCU (lowly expressed) | ΔRSCU | RSCU (highly expressed) | RSCU (lowly expressed) | ΔRSCU | Codons | High-Bias | Low-Bias | ΔRSCU |
| 19 | 2.1111 | Ala | GCT | GCT | 1.3400 | 1.2070 | 0.1329 | 1.3623 | 1.1526 | 0.2097 | GCT | 0.7200 | 1.5200 | −0.8000 | |
| 0 | 0.0000 | 1.5833 | GCC | 1.0127 | 0.8038 | 0.2088 | 1.0113 | 0.8775 | 0.1338 | GCC* | 1.1600 | 0.5500 | 0.6100 | ||
| 9 | 1.0000 | 0.7500 | GCA | GCA | 0.9004 | 1.1673 | −0.2669 | 0.8923 | 1.1124 | −0.2201 | GCA | 0.6000 | 1.5700 | −0.9700 | |
| 8 | 0.8889 | 0.6667 | GCG | 0.7470 | 0.8218 | 0.7341 | 0.8575 | 1.5200 | 0.3700 | ||||||
| 8 | 1.2966 | Arg | CGT | CGT | 0.9886 | 0.7943 | 0.1943 | 1.0110 | 0.7264 | 0.2846 | CGT | 0.6000 | 0.7600 | −0.1600 | |
| 1 | 0.1621 | 1.2162 | CGC | 1.1197 | 0.9054 | 0.2143 | 1.0981 | 0.9755 | 0.1227 | 1.4300 | 0.3500 | 1.0800 | |||
| 7 | 1.1345 | 0.9459 | CGA | CGA | 0.8164 | 1.0329 | −0.2164 | 0.8310 | 1.0296 | −0.1985 | CGA | 0.6200 | 0.9400 | −0.3200 | |
| 5 | 0.8104 | 0.6757 | CGG | 0.8206 | 0.9338 | 0.8232 | 0.9386 | 1.4500 | 0.4600 | ||||||
| 6 | 0.9724 | 0.8108 | AGA | AGA | 0.8939 | 1.1666 | −0.2727 | 0.8915 | 1.1695 | −0.2779 | AGA | 0.4800 | 2.3300 | −1.8500 | |
| 10 | 1.6207 | 1.3514 | 1.3607 | 1.1670 | 0.1937 | 1.3451 | 1.1605 | 0.1846 | 1.4100 | 1.1600 | 0.2500 | ||||
| 1 | 0.1333 | Asn | AAT | AAT | 0.8467 | 1.0670 | −0.2203 | 0.8713 | 0.9727 | −0.1014 | AAT | 0.5200 | 1.3400 | −0.8200 | |
| 14 | 1.8667 | 2.0000 | 1.1533 | 0.9330 | 0.2203 | 1.1287 | 1.0273 | 0.1014 | 1.4800 | 0.6600 | 0.8200 | ||||
| 0 | 0.0000 | Asp | GAT | GAT | 1.0151 | 1.0854 | −0.0703 | 1.0033 | 1.0205 | −0.0172 | GAT | 0.6700 | 1.4100 | −0.7400 | |
| 11 | 2.0000 | 2.0000 | 0.9849 | 0.9146 | 0.0703 | 0.9967 | 0.9795 | 0.0172 | 1.3300 | 0.5900 | 0.7400 | ||||
| 0 | 0.0000 | Cys | TGT | TGT | 0.7898 | 0.9278 | −0.1380 | 0.6906 | 0.8315 | −0.1409 | TGT | 0.4300 | 1.2100 | −0.7800 | |
| 7 | 2.0000 | 2.0000 | 1.2102 | 1.0722 | 0.1380 | 1.3094 | 1.1685 | 0.1409 | 1.5700 | 0.7900 | 0.7800 | ||||
| 9 | 0.9474 | 0.9474 | Gln | CAA | CAA | 0.7657 | 0.9658 | −0.2001 | 0.7837 | 0.9190 | −0.1352 | CAA | 0.4500 | 1.3300 | −0.8800 |
| 10 | 1.0526 | 1.0526 | 1.2343 | 1.0342 | 0.2001 | 1.2163 | 1.0810 | 0.1352 | 1.5500 | 0.6700 | 0.8800 | ||||
| 7 | 0.6667 | 0.6667 | Glu | GAA | GAA | 0.7014 | 0.8470 | −0.1456 | 0.7219 | 0.8236 | −0.1017 | GAA | 0.4400 | 1.1800 | −0.7400 |
| 14 | 1.3333 | 1.3333 | 1.2986 | 1.1530 | 0.1456 | 1.2781 | 1.1764 | 0.1017 | 1.5600 | 0.8200 | 0.7400 | ||||
| 0 | 0.0000 | Gly | GGT | GGT | 1.1190 | 0.9676 | 0.1514 | 1.1101 | 0.9205 | 0.1896 | GGT | 0.6400 | 1.2100 | −0.5700 | |
| 15 | 1.7143 | 1.2864 | GGC | 0.9617 | 0.9708 | 0.9759 | 1.0439 | −0.0680 | 1.2400 | 0.7300 | |||||
| 9 | 1.0286 | 0.7719 | GGA | GGA | 1.2212 | 1.1394 | 0.0818 | 1.2122 | 1.1498 | 0.0623 | GGA | 0.9500 | 1.4600 | −0.5100 | |
| 11 | 1.2571 | 0.9434 | GGG | 0.6980 | 0.9222 | 0.7018 | 0.8858 | 1.1600 | 0.6000 | ||||||
| 0 | 0.0000 | His | CAT | CAT | 0.8417 | 1.0529 | −0.2112 | 0.8391 | 0.9619 | −0.1228 | CAT | 0.6100 | 1.4100 | −0.8000 | |
| 9 | 2.0000 | 2.0000 | 1.1583 | 0.9471 | 0.2112 | 1.1609 | 1.0381 | 0.1228 | 1.3900 | 0.5900 | 0.8000 | ||||
| 12 | 2.2514 | Ile | ATT | ATT | 1.3376 | 1.2370 | 0.1007 | 1.3250 | 1.2036 | 0.1214 | ATT | 0.8000 | 1.3000 | −0.5000 | |
| 0 | 0.0000 | 1.5000 | 1.3535 | 1.1231 | 0.2304 | 1.3549 | 1.2422 | 0.1128 | 1.9800 | 0.5500 | 1.4300 | ||||
| 4 | 0.7505 | 0.5000 | ATA | ATA | 0.3088 | 0.6399 | −0.3311 | 0.3201 | 0.5543 | −0.2342 | ATA | 0.2200 | 1.1500 | −0.9300 | |
| 2 | 0.4796 | 0.4000 | Leu | TTA | TTA | 0.3412 | 0.5837 | −0.2425 | 0.3684 | 0.5152 | −0.1468 | TTA | 0.1300 | 1.2000 | −1.0700 |
| 6 | 1.4388 | 1.2000 | TTG | TTG | 1.7998 | 1.4873 | 0.3125 | 1.8150 | 1.5153 | 0.2997 | TTG | 1.4400 | 2.1200 | −0.6800 | |
| 6 | 1.4388 | CTT | CTT | 1.1936 | 1.1819 | 0.0118 | 1.1359 | 1.0768 | 0.0591 | CTT | 0.5400 | 1.1900 | −0.6500 | ||
| 1 | 0.2398 | 1.4000 | CTC | 1.0424 | 0.9544 | 0.0880 | 1.0535 | 1.0195 | 0.0340 | 1.0700 | 0.4900 | 0.5800 | |||
| 5 | 1.1990 | 1.0000 | CTA | CTA | 0.3789 | 0.5042 | −0.1252 | 0.3761 | 0.4856 | −0.1095 | CTA | 0.2000 | 0.5700 | −0.3700 | |
| 5 | 1.1990 | 1.0000 | 1.2440 | 1.2886 | 1.2511 | 1.3876 | 2.6200 | 0.4300 | |||||||
| 8 | 0.6667 | 0.6667 | Lys | AAA | AAA | 0.5422 | 0.7975 | −0.2553 | 0.5538 | 0.7665 | −0.2127 | AAA | 0.3300 | 1.0400 | −0.7100 |
| 16 | 1.3333 | 1.3333 | 1.4578 | 1.2025 | 0.2553 | 1.4462 | 1.2335 | 0.2127 | 1.6700 | 0.9600 | 0.7100 | ||||
| 1 | 0.1333 | Phe | TTT | TTT | 0.8004 | 0.9593 | −0.1589 | 0.8164 | 0.8849 | −0.0686 | TTT | 0.4700 | 1.4000 | −0.9300 | |
| 14 | 1.8667 | 2.0000 | 1.1996 | 1.0407 | 0.1589 | 1.1836 | 1.1151 | 0.0686 | 1.5300 | 0.6000 | 0.9300 | ||||
| 13 | 2.0800 | Pro | CCT | CCT | 1.3234 | 1.2547 | 0.0687 | 1.3296 | 1.2101 | 0.1195 | CCT | 0.6800 | 1.4400 | −0.7600 | |
| 0 | 0.0000 | 1.5606 | CCC | 1.1467 | 0.8594 | 0.2873 | 1.1198 | 0.8554 | 0.2644 | 1.2300 | 0.4500 | 0.7800 | |||
| 8 | 1.2800 | 0.9604 | CCA | CCA | 0.9124 | 1.1249 | −0.2125 | 0.9516 | 1.0821 | −0.1305 | CCA | 0.5800 | 1.8100 | −1.2300 | |
| 4 | 0.6400 | 0.4802 | CCG | 0.6174 | 0.7609 | 0.5990 | 0.8524 | 1.5100 | 0.3000 | ||||||
| 13 | 2.2928 | Ser | TCT | TCT | 1.2234 | 1.1860 | 0.0374 | 1.2445 | 1.1178 | 0.1267 | TCT | 0.5800 | 1.5000 | −0.9200 | |
| 0 | 0.0000 | 1.5294 | TCC | 1.1210 | 0.9560 | 0.1650 | 1.1340 | 0.9735 | 0.1605 | 1.2400 | 0.5200 | 0.7200 | |||
| 5 | 0.8818 | 0.5882 | TCA | TCA | 0.8221 | 0.9374 | −0.1153 | 0.8223 | 0.9186 | −0.0962 | TCA | 0.4000 | 1.6400 | −1.2400 | |
| 7 | 1.2346 | 0.8235 | TCG | 0.9085 | 0.8898 | 0.0186 | 0.8819 | 0.9355 | −0.0536 | 1.7300 | 0.3500 | 1.3800 | |||
| 0 | 0.0000 | AGT | AGT | 0.8045 | 0.9291 | −0.1246 | 0.7460 | 0.8680 | −0.1220 | AGT | 0.4100 | 1.1800 | −0.7700 | ||
| 9 | 1.5873 | 1.0588 | 1.1205 | 1.1017 | 0.0188 | 1.1712 | 1.1867 | −0.0155 | 1.6400 | 0.8100 | 0.8300 | ||||
| 10 | 1.9048 | Thr | ACT | ACT | 1.2054 | 1.1451 | 0.0603 | 1.1815 | 1.0608 | 0.1208 | ACT | 0.5300 | 1.2900 | −0.7600 | |
| 0 | 0.0000 | 1.4286 | ACC | 1.1632 | 0.8676 | 0.2957 | 1.1404 | 0.9640 | 0.1764 | 1.2900 | 0.5600 | 0.7300 | |||
| 5 | 0.9524 | 0.7143 | ACA | ACA | 0.8646 | 1.1010 | −0.2364 | 0.8708 | 1.0371 | −0.1663 | ACA | 0.4500 | 1.8300 | −1.3800 | |
| 6 | 1.1429 | 0.8571 | ACG | 0.7668 | 0.8864 | 0.8073 | 0.9381 | 1.7200 | 0.3200 | ||||||
| 0 | 0.0000 | Tyr | TAT | TAT | 0.6841 | 0.9464 | −0.2622 | 0.7009 | 0.8084 | −0.1075 | TAT | 0.4600 | 1.3300 | −0.8700 | |
| 8 | 2.0000 | 2.0000 | 1.3159 | 1.0536 | 0.2622 | 1.2991 | 1.1916 | 0.1075 | 1.5400 | 0.6700 | 0.8700 | ||||
| 13 | 0.5000 | Val | GTT | GTT | 1.0106 | 1.0558 | −0.0452 | 1.0495 | 0.9711 | 0.0784 | GTT | 0.4600 | 1.4400 | −0.9800 | |
| 0 | 0.0000 | 1.5012 | GTC | 0.8330 | 0.7387 | 0.0942 | 0.8373 | 0.8203 | 0.0170 | 0.8100 | 0.5100 | 0.3000 | |||
| 2 | 0.0769 | 0.2309 | GTA | GTA | 0.4681 | 0.6084 | −0.1403 | 0.4552 | 0.5732 | −0.1180 | GTA | 0.2500 | 1.0000 | −0.7500 | |
| 11 | 0.4231 | 1.2702 | 1.6883 | 1.5970 | 0.0913 | 1.6580 | 1.6354 | 0.0226 | 2.4800 | 1.0600 | 1.4200 | ||||
Note.—RGF, relative gene frequency of tRNA genes in the genome (frequency of a codon per codon family/frequency assuming equal abundances of tRNA genes per codon family) not taking into account wobble rules; RGF wobble, relative gene frequency of tRNA genes taking into account revised wobble rules (codons were grouped according to wobble rules); RSCU (lowly expressed), average RSCU in the 5% least expressed genes; RSCU (highly expressed), average RSCU in the 5% most highly expressed genes; ΔRSCU, RSCU (highly expressed) − RSCU (lowly expressed). Optimal codons identified in a previous study using correspondence analysis on RSCU are also shown along with the average RSCU values in the gene sets with the highest and lowest codon usage bias (upper and lower 5%) (values are taken from Szövényi et al. 2015). Optimal codons are labeled with an asterisk and optimal codons supported by more than one analysis are in bold. ΔRSCU values contradicting the translational/transcriptional hypothesis are underlined and in italics.
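The quantities defined in the note share one form: an observed count divided by the count expected if every member of a codon family were used equally. The following is a minimal sketch of that calculation, not the authors' pipeline. The function names (`relative_usage`, `rscu`, `delta_rscu`) and the toy coding sequence are illustrative assumptions. The tRNA gene counts for alanine (19, 0, 9, 8) are taken from the table above.

```python
from collections import Counter

# Alanine codon family in the standard genetic code.
ALA = ["GCT", "GCC", "GCA", "GCG"]


def relative_usage(counts):
    """Observed count / count expected under equal usage within the family.

    Applied to codon counts this gives RSCU; applied to tRNA gene counts it
    gives RGF (without wobble grouping).
    """
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    expected = total / len(counts)
    return [c / expected for c in counts]


def rscu(cds, family):
    """RSCU of each codon of `family` in one in-frame coding sequence."""
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    tally = Counter(codons)
    values = relative_usage([tally[c] for c in family])
    return dict(zip(family, values))


def delta_rscu(high, low):
    """Per-codon difference: RSCU (highly expressed) - RSCU (lowly expressed)."""
    return {codon: high[codon] - low[codon] for codon in high}


# RGF for the four Ala codons from the tRNA gene counts in the table.
rgf_ala = [round(v, 4) for v in relative_usage([19, 0, 9, 8])]
print(rgf_ala)  # [2.1111, 0.0, 1.0, 0.8889]

# Hypothetical toy sequence, used only to show the RSCU call.
print(rscu("GCTGCTGCCGCA", ALA))
```

The RGF values printed for alanine match the RGF column of the table. The table's RSCU values are averages over gene sets (the top and bottom 5% by expression), so a real analysis would average per-gene RSCU over those sets before taking the difference.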
Fig. 1.—Correlation between the number of tRNA genes in the P. patens genome and the relative proportion of the corresponding amino acids in the proteome. Amino acids are labeled with their three-letter codes and their frequencies are weighted by the average expression of genes. For this figure, we used gene expression data from the microarray experiment.
Fig. 2.—Correlation between the frequency of optimal codons (Fop) and gene expression (maximum value of RNA-seq expression estimate [FPKM]).
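As a rough illustration of the Fop statistic plotted in this figure, the sketch below computes the fraction of optimal codons among all codons that have synonymous alternatives. It assumes the common definition of Fop (optimal codons divided by optimal plus non-optimal synonymous codons). The codon sets and the toy sequence are hypothetical and do not reproduce the optimal codon set used in the study.

```python
def fop(cds, optimal, synonymous):
    """Frequency of optimal codons in one in-frame coding sequence.

    `synonymous` is the set of codons that belong to families with more than
    one member; `optimal` is the subset regarded as optimal. Codons outside
    `synonymous` (for example ATG) are ignored.
    """
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    considered = [c for c in codons if c in synonymous]
    if not considered:
        return float("nan")
    return sum(c in optimal for c in considered) / len(considered)


# Hypothetical example restricted to the Ala and Lys families.
SYNONYMOUS = {"GCT", "GCC", "GCA", "GCG", "AAA", "AAG"}
OPTIMAL = {"GCC", "AAG"}                      # illustrative choice only

# Codons: ATG (ignored), GCC, GCT, AAG, AAA -> 2 optimal out of 4.
print(fop("ATGGCCGCTAAGAAA", OPTIMAL, SYNONYMOUS))  # 0.5
```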
Partial Spearman Rank Correlation of Codon Bias Statistics and Genomic Variables
| Gene Expression | Expression Breadth (τ) | Gene Length | Intron Length | GC_CDS | GC_thirdcodon | GC_intron | ||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Microarray | RNA-seq | Microarray | RNA-seq | |||||||||||||||||||||||||||||
| | | Rho | p | Rho | p | Rho | p | Rho | p | Rho | p | Rho | p | Rho | p | Rho | p | Rho | p | Rho | p | Rho | p | Rho | p | Rho | p | Rho | p | Rho | p | |
| Codon usage bias statistic | ||||||||||||||||||||||||||||||||
| ENC | vs. | −0.1208 | 2.5380E-18 | −0.0836 | 1.4840E-09 | 0.0233 | 8.6452E-02 | 0.0764 | 1.8114E-08 | −0.0991 | 1.0410E-11 | −0.0401 | 2.5855E-02 | 0.1700 | 2.7720E-32 | 0.1674 | 6.7550E-21 | 0.1216 | 4.8000E-17 | 0.0723 | 6.6000E-05 | −0.0569 | 4.9886E-05 | 0.0421 | 2.6759E-03 | 0.1029 | 3.0190E-14 | 0.1698 | 1.0606E-36 | 0.1400 | 2.9156E-25 | |
| vs. | 0.0647 | 3.9400E-06 | 0.0387 | 5.3886E-03 | −0.0050 | 7.1610E-01 | 0.0511 | 1.6971E-04 | 0.1240 | 8.1000E-18 | 0.0898 | 8.7120E-07 | −0.0790 | 8.8760E-08 | −0.0651 | 3.6670E-04 | −0.0656 | 7.6971E-06 | −0.0512 | 5.1389E-03 | −0.0456 | 1.2025E-03 | −0.0235 | 9.1829E-02 | 0.4273 | 8.2376E-264 | 0.9766 | 0.0000E+00 | 0.3879 | 1.1119E-209 | ||
| ENC′ | vs. | −0.0536 | 1.9512E-04 | −0.0180 | 2.2357E-01 | −0.0456 | 1.3748E-03 | −0.0405 | 2.8932E-03 | −0.1253 | 1.4040E-17 | −0.0446 | 1.5730E-02 | 0.1013 | 5.4600E-12 | 0.0634 | 6.8209E-04 | 0.0738 | 7.0500E-07 | 0.0105 | 6.0351E-01 | −0.0396 | 5.4170E-03 | 0.0925 | 3.6760E-11 | −0.1098 | 4.9028E-16 | −0.0342 | 1.1883E-02 | −0.0479 | 4.2922E-04 | |
Note.—Rank correlations presented are pairwise partial rank correlations in which all the other variables listed in the table were included as covariates. Rho, Spearman’s partial rank correlation statistic; Ks, number of synonymous substitutions per synonymous site between P. patens and C. purpureus GG1; Ka, number of nonsynonymous substitutions per nonsynonymous site between P. patens and C. purpureus GG1; Gene length, physical length of genes in nucleotides in the genome; Intron length, physical length of the introns in nucleotides in the genome; GC_CDS, GC content of coding regions; GC_thirdcodon, GC content of third codon positions; GC_intron, GC content of introns; p, Benjamini–Hochberg false discovery rate.
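A pairwise partial rank correlation with all remaining variables as covariates can be sketched as follows. This is one standard way to obtain it, assumed here rather than taken from the paper: rank every variable, build the matrix of correlations between the ranks, invert it, and read the partial correlation from the inverse. All function names and the toy data are illustrative, and the false discovery rate adjustment mentioned in the note is not included.

```python
import math


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (square, non-singular)."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_spearman(columns, i, j):
    """Partial rank correlation of columns i and j given all other columns."""
    rk = [ranks(c) for c in columns]
    n = len(rk)
    corr = [[1.0 if a == b else pearson(rk[a], rk[b]) for b in range(n)]
            for a in range(n)]
    prec = invert(corr)
    return -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])


# Hypothetical toy data: codon bias, expression, and one covariate.
bias = [1, 2, 3, 4, 5, 6]
expression = [2, 1, 4, 3, 6, 5]
gene_length = [1, 3, 2, 5, 4, 6]
print(partial_spearman([bias, expression, gene_length], 0, 1))
```

With only two columns the function reduces to the ordinary Spearman correlation, which is a convenient sanity check before adding covariates.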
Fig. 3.—Correlation between the frequency of optimal codons (Fop) and the number of synonymous substitutions per synonymous site (Ks, calculated in comparison with the orthologous C. purpureus proteins).
Fig. 4.—Correlation between gene expression breadth (Tau [τ]) and the frequency of optimal codons (Fop).
Fig. 5.—Histogram of SCU estimates (per gene selection intensity on codon usage).
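The excerpt does not define τ. Assuming it is the widely used tissue-specificity index of Yanai et al. (2005), it can be sketched as below: τ is 0 when a gene is expressed equally in all conditions and 1 when it is expressed in only one. The function name and the expression values are illustrative.

```python
def tau(expression):
    """Tissue-specificity index: sum(1 - x_i / x_max) / (n - 1).

    `expression` holds non-negative expression values of one gene across
    n >= 2 tissues or conditions, with at least one value above zero.
    """
    n = len(expression)
    x_max = max(expression)
    if n < 2 or x_max <= 0:
        raise ValueError("need at least two values and a positive maximum")
    return sum(1 - x / x_max for x in expression) / (n - 1)


print(tau([10, 10, 10, 10]))  # 0.0 (broadly expressed)
print(tau([10, 0, 0, 0]))     # 1.0 (expressed in a single condition)
```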