Marc Krasovec, Dmitry A Filatov.
Abstract
Codon usage bias (CUB), the preferential use of one of the synonymous codons, has been described in a wide range of organisms from bacteria to mammals, but it has not yet been studied in marine phytoplankton. CUB is thought to be caused by weak selection for translational accuracy and efficiency. Weak selection can overpower genetic drift only in species with large effective population sizes, such as Drosophila, which has relatively strong CUB, while organisms with smaller population sizes (e.g., mammals) have weak CUB. Marine plankton species tend to have extremely large populations, suggesting that CUB should be very strong. Here we test this prediction and describe the patterns of codon usage in a wide range of diatom species belonging to 35 genera from 4 classes. We report that most of the diatom species studied have surprisingly modest CUB (mean Effective Number of Codons, ENC = 56), with some exceptions showing stronger codon bias (ENC = 44). Modest codon bias in most studied diatom species may reflect an extreme disparity between astronomically large census size and modest effective population size (Ne), with fluctuations in population size and linked selection limiting long-term Ne and rendering selection for optimal codons less efficient. For example, genetic diversity (pi ~0.02 at silent sites) in Skeletonema marinoi corresponds to an Ne of about 10 million individuals, which is likely many orders of magnitude lower than its census size. Still, Ne ~10^7 should be large enough to make selection for optimal codons efficient. Thus, we propose that an alternative process, frequent changes of preferred codons, may be a more plausible reason for low CUB despite highly efficient selection for preferred codons in diatom populations. Shifts in the set of optimal codons should change the direction of selection on codon usage, so the actual codon usage never catches up with the moving target of the optimal set of codons and the species never develop strong CUB.
Indeed, we detected strong shifts in preferential codon usage within some diatom genera, with switches between preferentially GC-rich and AT-rich 3rd codon positions (GC3). For example, GC3 ranges from 0.6 to 1 in most Chaetoceros species, while for Chaetoceros dichaeta GC3 = 0.1. Variation in both selection intensity and mutation spectrum may drive such shifts in codon usage and limit the observed CUB. Our study represents the first genome-wide analysis of CUB in diatoms and the first such analysis for a major phytoplankton group.
Keywords: codon bias; codon usage; diatoms; effective population size; phytoplankton evolution
Year: 2019 PMID: 31698749 PMCID: PMC6896221 DOI: 10.3390/genes10110894
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
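The Ne value quoted in the abstract can be related to silent-site diversity through the standard neutral expectation for a diploid population, pi = 4 × Ne × mu. The sketch below simply rearranges that relation. The per-site mutation rate of 5 × 10^-10 is an assumption, chosen here only because it reproduces the abstract's Ne of about 10 million from pi ~0.02; it is not a value given in the text, and the function name and `ploidy` parameter are illustrative.

```python
def effective_population_size(pi, mu, ploidy=2):
    """Estimate Ne from neutral diversity, assuming pi = 2 * ploidy * Ne * mu.

    pi     : nucleotide diversity at silent sites (per site)
    mu     : mutation rate per site per generation (ASSUMED, not from the text)
    ploidy : 2 gives the usual diploid form pi = 4 * Ne * mu
    """
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (2 * ploidy * mu)


# pi ~0.02 is the Skeletonema marinoi value from the abstract;
# mu = 5e-10 is a hypothetical rate used for illustration only.
ne = effective_population_size(pi=0.02, mu=5e-10)
print(f"Ne = {ne:.2e}")
```

Because Ne scales inversely with the assumed mutation rate, a tenfold error in mu shifts the estimate by the same factor, so the result is best read as an order of magnitude.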
Figure 1. (a) Correlation between the effective number of codons and the frequency of optimal codons, considering all genes in each species. Pearson’s correlation: rho = −0.475, P value = 6E-6. (b) Correlation between the GC% of the optimal codons’ third position and the difference in GC% between the whole coding genome and GC3s only. Pearson’s correlation: rho = 0.695, P value = 3E-13.
Figure 2. Phylogenies of Chaetoceros (a); Pseudo-nitzschia (b); Skeletonema (c); and Thalassiosira (d) obtained with RAxML from the single orthologs identified by OrthoFinder. DensiTree plots for these genera are shown in Figure S3.
Figure 3. DensiTree of the 125 orthologs in the genus Skeletonema. The two species assignments come from the A11 analysis with posterior probabilities [40,41].
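For readers unfamiliar with the effective number of codons, the statistic runs from 20 (a single codon used per amino acid) to 61 (all synonymous codons used equally). What follows is a minimal sketch of Wright's estimator for one coding sequence. It assumes the standard genetic code and an in-frame sequence, uses a simplified treatment of sparse data, and is not the authors' pipeline; all names in it are illustrative.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in TCAG order (assumed to apply to the sequences analysed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Synonymous codon families, excluding stop codons.
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        FAMILIES[aa].append(codon)

# Number of amino acids in each degeneracy class (2-, 3-, 4-, 6-fold).
CLASS_SIZES = {2: 9, 3: 1, 4: 5, 6: 3}


def effective_number_of_codons(cds):
    """Return ENC for an in-frame coding sequence (string of A/C/G/T)."""
    cds = cds.upper()
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))

    homozygosity = defaultdict(list)  # degeneracy class -> list of F values
    for codons in FAMILIES.values():
        k = len(codons)
        n = sum(counts[c] for c in codons)
        if k == 1 or n < 2:
            continue  # Met/Trp carry no information; n < 2 is undefined
        sum_p2 = sum((counts[c] / n) ** 2 for c in codons)
        f = (n * sum_p2 - 1) / (n - 1)
        if f > 0:
            homozygosity[k].append(f)

    mean_f = {k: sum(v) / len(v) for k, v in homozygosity.items()}
    # If isoleucine is absent, approximate the 3-fold class from its neighbours.
    if 3 not in mean_f and 2 in mean_f and 4 in mean_f:
        mean_f[3] = (mean_f[2] + mean_f[4]) / 2

    enc = 2.0  # Met and Trp each contribute exactly one codon
    for k, n_amino_acids in CLASS_SIZES.items():
        if k not in mean_f:
            raise ValueError(f"no usable amino acids in the {k}-fold class")
        enc += n_amino_acids / mean_f[k]
    return min(enc, 61.0)  # sampling noise can push the estimate above 61
```

Short genes give noisy estimates, which is one reason values are usually averaged over many genes, as in the species means reported in the abstract.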
Strength of selection (S), calculated only with codons that did not show any change in preferred codon usage within a clade.
| Strain | S |
|---|---|
| MMETSP0327_ | 0.3224 *** |
| MMETSP0329_ | 0.5128 *** |
| MMETSP0853_ | 0.1069 *** |
| MMETSP1060_ | 0.1282 *** |
| MMETSP1061_ | 0.2265 *** |
| MMETSP1423_ | 0.0491 *** |
| MMETSP1432_ | 0.3986 *** |
| | |
| MMETSP1070_ | 0.4625 *** |
| MMETSP1322_ | 0.5208 *** |
| MMETSP1434_ | −0.1424 ns |
| | |
| MMETSP0321_ | 0.4990 *** |
| MMETSP0322_ | 0.5163 *** |
| MMETSP1362_ | 0.7312 *** |
| | |
| MMETSP0404_ | 0.5433 *** |
| MMETSP0494_ | 0.2919 *** |
| MMETSP0740_ | 0.2284 *** |
| MMETSP0881_ | 0.7675 *** |
| MMETSP0905_ | 0.7422 *** |
| MMETSP0913_ | 0.6406 *** |
| MMETSP0973_ | 0.4032 *** |
| MMETSP1059_ | 0.5631 *** |
| MMETSP1067_ | 0.5752 *** |
| MMETSP1071_ | 0.6420 *** |
| MMETSP1422_ | 0.6491 *** |
| | |
| MMETSP0013_ | 0.4699 *** |
| MMETSP0319_ | 0.5485 *** |
| MMETSP0320_ | 0.5096 *** |
| MMETSP0563_ | 0.4569 *** |
| MMETSP0578_ | 0.3390 *** |
| MMETSP0593_ | 0.3625 *** |
| MMETSP0604_ | 0.4057 *** |
| MMETSP0920_ | 0.4275 *** |
| MMETSP1039_ | 0.3182 *** |
| MMETSP1040_ | 0.2763 *** |
| MMETSP1428_ | 0.5165 *** |
*** indicates an S value significantly higher than that of the rest of the genome, according to the method of Sharp and colleagues [10]; ns, not significant.
Evolution of GC content at the 3rd position of synonymous codons in the 4 diatom genera with >7 strains analysed.
| Genus | Species | GC3 of Preferred Codons | To Preferred | To Unpreferred | Total | GC Ancestral | GC Current | 2 × 2 χ² |
|---|---|---|---|---|---|---|---|---|
| Chaetoceros | MMETSP0092_C.affinis_CCMP159 | 0.80 | 1524 | 2348 | 3872 | 36.39 | 52.22 | 0.0001 |
| Chaetoceros | MMETSP0150_C.debilis_MM31A | 0.64 | 1906 | 745 | 2651 | 26.10 | 68.73 | 0.0001 |
| Chaetoceros | MMETSP0200_C.GSL56 | 0.65 | 4609 | 2745 | 7354 | 27.89 | 55.22 | 0.0001 |
| Chaetoceros | MMETSP0754_C.neogracile_CCMP1317 | 0.15 | 2713 | 2257 | 4970 | 50.99 | 33.02 | 0.0001 |
| Chaetoceros | MMETSP1336_C.neogracile_RCC1993 | 0.61 | 4727 | 2302 | 7029 | 24.58 | 57.01 | 0.0001 |
| Chaetoceros | MMETSP1429_C.UNC1202 | 0.68 | 4671 | 1957 | 6628 | 21.23 | 70.70 | 0.0001 |
| Chaetoceros | MMETSP1435_C.brevis_CCMP164 | 0.70 | 1099 | 531 | 1630 | 22.09 | 75.28 | 0.0001 |
| Chaetoceros | MMETSP1447_C.dichaeta_CCMP1751 | 0.10 | 3795 | 3405 | 7200 | 46.01 | 48.17 | 0.0001 |
| Pseudo-nitzschia | MMETSP0142_P.australis_10249 | 1.00 | 58,578 | 74,556 | 133,134 | 59.37 | 47.40 | 0.0001 |
| Pseudo-nitzschia | MMETSP0327_P.delicatissima_B596 | 0.85 | 34,900 | 25,100 | 60,000 | 45.42 | 60.73 | 0.0001 |
| Pseudo-nitzschia | MMETSP0329_P.arenysensis_B593 | 0.80 | 46,274 | 50,697 | 96,971 | 55.36 | 48.87 | 0.0001 |
| Pseudo-nitzschia | MMETSP0853_P.fraudulenta_WWA7 | 1.00 | 46,621 | 29,852 | 76,473 | 44.89 | 62.85 | 0.0001 |
| Pseudo-nitzschia | MMETSP1060_P.pungens_cingulata | 1.00 | 58,723 | 16,794 | 75,517 | 25.02 | 83.10 | 0.0001 |
| Pseudo-nitzschia | MMETSP1061_P.pungens_pungens | 1.00 | 58,414 | 16,385 | 74,799 | 24.72 | 83.46 | 0.0001 |
| Pseudo-nitzschia | MMETSP1423_P.heimii_UNC1101 | 1.00 | 76,249 | 75,362 | 151,611 | 50.88 | 52.37 | 0.0013 |
| Pseudo-nitzschia | MMETSP1432_P.delicatissima_UNC1205 | 0.87 | 23,380 | 27,853 | 51,233 | 46.63 | 60.20 | 0.0001 |
| Skeletonema | MMETSP0563_S.dohrnii_SkelB | 0.64 | 493 | 463 | 956 | 44.56 | 49.27 | 0.0185 |
| Skeletonema | MMETSP0578_S.grethae_CCMP1804 | 0.67 | 1704 | 861 | 2565 | 26.82 | 65.50 | 0.0001 |
| Skeletonema | MMETSP0593_S.japonicum_CCMP2506 | 0.64 | 1466 | 688 | 2154 | 24.56 | 72.93 | 0.0001 |
| Skeletonema | MMETSP0604_S.menzellii_CCMP793 | 0.64 | 2935 | 2354 | 5289 | 41.96 | 51.35 | 0.0001 |
| Skeletonema | MMETSP1040_S.marinoi_FE60 | 0.64 | 1220 | 907 | 2127 | 38.81 | 61.64 | 0.0001 |
| Skeletonema | MMETSP1428_S.marinoi_UNC1201 | 0.64 | 2122 | 1728 | 3850 | 40.44 | 54.10 | 0.0001 |
| Thalassiosira | MMETSP0404_T.rotula_CCMP3096 | 1.00 | 4506 | 2778 | 7284 | 58.83 | 61.86 | 0.0001 |
| Thalassiosira | MMETSP0494_T.gravida_GMp14c1 | 1.00 | 3225 | 3036 | 6261 | 64.29 | 54.72 | 0.0008 |
| Thalassiosira | MMETSP0740_T.minuscula_CCMP1093 | 0.96 | 6697 | 7244 | 13,941 | 49.11 | 55.70 | 0.0001 |
| Thalassiosira | MMETSP0905_T.antarctica_CCMP982 | 0.97 | 5058 | 8047 | 13,105 | 70.67 | 38.99 | 0.0001 |
| Thalassiosira | MMETSP0973_T.oceanica_CCMP1005 | 1.00 | 17,637 | 11,133 | 28,770 | 50.84 | 63.00 | 0.0001 |
| Thalassiosira | MMETSP1067_T.punctigera_Tpunct2005C2 | 1.00 | 13,348 | 5128 | 18,476 | 39.15 | 73.28 | 0.0001 |
| Thalassiosira | MMETSP1071_T.NH16 | 1.00 | 10,106 | 7236 | 17,342 | 52.98 | 59.81 | 0.0001 |
| Thalassiosira | MMETSP0881_T.weissflogii_CCMP1336 | 1.00 | 10,543 | 10,658 | 21,201 | 55.92 | 51.45 | 0.2682 |
| Thalassiosira | MMETSP1059_T. FW | 0.68 | 11,369 | 10,174 | 21,543 | 65.27 | 38.67 | 0.0001 |
The numbers of substitutions are counted for the external branches of the phylogenies shown in Figure 3. To avoid having too few data because of short species branches (which had only a few hundred substitutions), we excluded some strains so as to keep only one individual per species, particularly in Skeletonema and Thalassiosira. Total is the total number of synonymous substitutions on the species branch; GC ancestral is the ancestral GC content of the third position, before the substitutions on the branch leading to the species; GC current is the current GC content of the third position at the sites counted in Total; the 2 × 2 χ² test gives the significance of the difference between the numbers of To preferred and To unpreferred substitutions, according to Table S2.
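GC3, the quantity tracked in the table above, is the proportion of G or C bases at third codon positions. The sketch below computes it for a single in-frame coding sequence. It counts every codon, whereas a GC3s-style measure would restrict the count to synonymous third positions (for example by dropping Met, Trp and stop codons); that refinement is left out here, and the function name is illustrative rather than taken from the study.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame sequence.

    Third positions holding anything other than A, C, G or T (e.g. N)
    are ignored, as are trailing bases that do not form a full codon.
    """
    cds = cds.upper()
    thirds = [cds[i + 2] for i in range(0, len(cds) - 2, 3)]
    called = [base for base in thirds if base in "ACGT"]
    if not called:
        raise ValueError("sequence contains no complete, unambiguous codons")
    return sum(base in "GC" for base in called) / len(called)


# Toy sequence (not from the study): codons ATG, GCC, AAA end in G, C, A.
print(round(gc3("ATGGCCAAA"), 3))
```

Applied per gene and averaged per strain, a function like this would give values on the same 0 to 1 scale as the GC3 of Preferred Codons column, although that column is restricted to the preferred codons identified for each strain.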