Jibin Liu, Dekang Zhu, Guangpeng Ma, Mafeng Liu, Mingshu Wang, Renyong Jia, Shun Chen, Kunfeng Sun, Qiao Yang, Ying Wu, Xiaoyue Chen, Anchun Cheng.
Abstract
Riemerella anatipestifer (RA) belongs to the family Flavobacteriaceae and can cause septicemic disease in poultry. Synonymous codon usage patterns in bacteria reflect a series of evolutionary changes that help them tolerate diverse environments. We characterized the codon usage patterns of RA isolates from the 12 available sequenced genomes using multiple codon usage indices and statistical analyses. Nucleotide composition and relative synonymous codon usage (RSCU) analyses revealed that A- or U-ending codons predominate in RA. Neutrality analysis found no significant correlation between GC12 and GC3 (p > 0.05). Correspondence analysis and ENc-plot results showed that natural selection dominated over mutation in shaping codon usage bias. The cluster tree based on RSCU was concordant with the dendrogram built from genomic BLAST by the neighbor-joining method. Comparative analysis identified about 50 highly expressed genes, orthologous across all 12 strains, among the top 5% of CAI values. Based on these CAI values, we infer that RA contains a number of predicted highly expressed coding sequences involved in transcriptional regulation and metabolism, reflecting their requirement for dealing with diverse environmental conditions. These results provide useful information on the mechanisms that contribute to codon usage bias and the evolution of RA.
Keywords: Riemerella anatipestifer; codon usage bias; highly expressed gene; natural selection
Year: 2016 PMID: 27517915 PMCID: PMC5000701 DOI: 10.3390/ijms17081304
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Comparison of RSCU among the 12 RA strains. The heat map was drawn with HemI using hierarchical clustering. Higher RSCU values, indicating more frequent codon usage, are shown in darker shades of red. In Riemerella anatipestifer (RA) genomes, codons ending in A or U have higher RSCU values than codons ending in G or C.
The RSCU analysis of the preferred codons, the optimal codons and the rare codons for RA.
| Amino Acid | Codon | RSCU 1 | Amino Acid | Codon | RSCU 1 |
|---|---|---|---|---|---|
| Ala | GCG | 0.49 | Pro | CCC | 0.31 |
| | GCC | 0.61 | | CCG | 0.33 |
| | GCU * | | | CCA * | |
| | GCA * | | | CCU * | |
| Cys | UGC | 0.57 | Arg | AGG | 0.37 |
| | UGU * | | | CGC | 0.71 |
| Asp | GAC | 0.55 | | CGG | 0.19 |
| | GAU * | | | AGA * | |
| Glu | GAG | 0.60 | | CGA | |
| | GAA | | | CGU * | |
| Phe | UUC | 0.47 | Ser | AGC | 0.68 |
| | UUU * | | | UCC | 0.46 |
| Gly | GGG | 0.66 | | UCG | 0.49 |
| | GGC | 0.56 | | AGU * | |
| | GGU * | | | UCA | 0.80 |
| | GGA * | | | UCU * | |
| His | CAC | 0.80 | Thr | ACC | 0.89 |
| | CAU * | | | ACG | 0.50 |
| Ile | AUC | 0.55 | | ACA * | |
| | AUU * | | | ACU * | |
| | AUA | | Val | GUG | 0.82 |
| Lys | AAG | 0.47 | | GUC | 0.18 |
| | AAA * | | | GUU * | |
| Leu | CUC | 0.51 | | GUA * | |
| | CUG | 0.39 | Tyr | UAC | 0.62 |
| | UUG | 0.68 | | UAU * | |
| | CUA * | | Gln | CAG | 0.57 |
| | CUU * | | | CAA * | |
| | UUA * | | Stop | UGA | 0.20 |
| Asn | AAC | 0.66 | | UAG | 0.61 |
| | AAU * | | | UAA * | |
| Met | AUG | 1.00 | | | |
1 Average value of RSCU in 12 RA genomes; * represents the optimal codons (p-value < 0.01). The preferred codons (RSCU > 1) are in bold.
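The RSCU values in the table follow the usual definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below is a minimal illustration of that definition, not the authors' pipeline; the function name `rscu` and the choice to treat the three stop codons as one synonymous family (as the table does) are assumptions made here.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(sequences):
    """Return {codon: RSCU} pooled over in-frame coding sequences.

    RSCU = observed count * family size / total count of the family.
    Amino acids that never occur in the input are left out of the result.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper().replace("U", "T")
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    # Group codons into synonymous families keyed by amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    result = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        for c in codons:
            result[c] = counts[c] * len(codons) / total
    return result


values = rscu(["GCTGCTGCTGCA"])  # four Ala codons: GCT x3, GCA x1
print(values["GCT"], values["GCA"], values["GCC"])
```

With this definition a value above 1 marks a codon used more often than expected under equal usage, which is the threshold the footnote applies to preferred codons.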
Characteristics of the codon bias indices of RA genes.
| RA Strain | GC% | GC3s% | ENc | CAI |
|---|---|---|---|---|
| ATCC11845 | 35.42 ± 3.63 | 26.50 ± 5.75 | 45.04 ± 5.13 | 0.616 ± 0.062 |
| RA-CH-1 | 35.51 ± 3.85 | 27.05 ± 6.28 | 45.47 ± 5.15 | 0.604 ± 0.064 |
| CH3 | 35.59 ± 3.89 | 27.07 ± 6.30 | 45.46 ± 5.17 | 0.582 ± 0.070 |
| RA-CH-2 | 35.39 ± 3.62 | 26.64 ± 5.98 | 45.19 ± 5.19 | 0.616 ± 0.064 |
| RA-GD | 35.33 ± 3.64 | 26.67 ± 5.95 | 45.16 ± 5.20 | 0.602 ± 0.069 |
| RA-SG | 35.40 ± 3.60 | 26.52 ± 5.17 | 45.10 ± 5.07 | 0.609 ± 0.063 |
| RA-YM | 35.50 ± 3.58 | 26.69 ± 5.80 | 45.15 ± 5.09 | 0.610 ± 0.063 |
| Yb2 | 35.34 ± 3.64 | 26.50 ± 5.74 | 45.12 ± 5.14 | 0.613 ± 0.062 |
| RA-JLLY | 35.45 ± 3.89 | 26.91 ± 6.25 | 45.34 ± 5.21 | 0.605 ± 0.067 |
| RA153 | 35.44 ± 3.61 | 26.57 ± 5.86 | 45.18 ± 5.18 | 0.604 ± 0.064 |
| RA17 | 35.52 ± 3.59 | 26.55 ± 5.77 | 45.14 ± 5.22 | 0.690 ± 0.064 |
| RCAD0122 | 35.40 ± 3.65 | 26.69 ± 5.86 | 45.21 ± 5.14 | 0.607 ± 0.064 |
Figure 2. ENc versus GC3s plots of RA genomes. (A) ATCC11845; (B) RA-CH-1; (C) CH3; (D) RA-CH-2; (E) RA-GD; (F) RA-SG; (G) RA-YM; (H) Yb2; (I) RA-JLLY; (J) RA153; (K) RA17; and (L) RCAD0122. The standard curve represents the ENc expected for a given GC3s. Most RA genes lie far from the standard curve, suggesting that their codon usage pattern is affected by factors other than nucleotide composition. Some genes with an ENc score of 61 display no bias and use all 61 sense codons.
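The standard curve in an ENc plot is commonly drawn from Wright's approximation, ENc = 2 + s + 29 / (s² + (1 − s)²), where s is GC3s expressed as a fraction. The caption does not state the formula, so the sketch below assumes that form; the function name `expected_enc` is introduced here for illustration.

```python
def expected_enc(gc3s):
    """Expected ENc when codon usage depends on GC3s alone.

    gc3s is a fraction between 0 and 1 (not a percentage).
    Assumes Wright's approximation for the standard curve.
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


# Curve points for plotting, from GC3s = 0 to 1 in steps of 0.01.
curve = [(s / 100, expected_enc(s / 100)) for s in range(101)]
print(expected_enc(0.5))  # peak of the curve
```

As a rough illustration, a GC3s of 0.265 gives an expected ENc of about 49.8 under this formula; a gene with a lower observed ENc at that GC3s would plot below the curve.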
Figure 3. Correspondence analysis (COA) of the genes in RA genomes. (A) ATCC11845; (B) RA-CH-1; (C) CH3; (D) RA-CH-2; (E) RA-GD; (F) RA-SG; (G) RA-YM; (H) Yb2; (I) RA-JLLY; (J) RA153; (K) RA17; and (L) RCAD0122. Each point represents a gene, plotted at its coordinates on the first and second axes of variation generated by the correspondence analysis.
Figure 4. Neutrality plots of RA genomes. (A) ATCC11845; (B) RA-CH-1; (C) CH3; (D) RA-CH-2; (E) RA-GD; (F) RA-SG; (G) RA-YM; (H) Yb2; (I) RA-JLLY; (J) RA153; (K) RA17; and (L) RCAD0122. Individual genes are plotted by the mean GC content of the first and second codon positions (P12) versus the GC content of the third codon position (P3). Regression lines and Spearman’s rank correlation coefficients (ρ) are shown, with the asterisk (*) denoting p-values < 0.01.
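The two coordinates of a neutrality plot can be computed per gene as sketched below. This is a minimal illustration rather than the authors' code: the helper names `positional_gc` and `regression_slope` are hypothetical, and an ordinary least-squares fit is assumed for the regression line. In the usual reading of such plots, a slope near 1 points to mutation pressure and a slope near 0 to selection.

```python
def positional_gc(cds):
    """Return (P12, P3) for one in-frame coding sequence.

    P12 is the mean GC fraction of codon positions 1 and 2;
    P3 is the GC fraction of codon position 3.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence is shorter than one codon")
    gc = [0, 0, 0]
    for i in range(usable):
        if cds[i] in "GC":
            gc[i % 3] += 1
    p1, p2, p3 = (g / n_codons for g in gc)
    return (p1 + p2) / 2, p3


def regression_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


p12, p3 = positional_gc("ATGGCCAAA")  # codons ATG, GCC, AAA
print(p12, p3)
```

To draw the plot for a genome, `positional_gc` would be applied to every coding sequence and `regression_slope` called with the P3 values as `xs` and the P12 values as `ys`.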
Figure 5. Comparison of the phylogenetic tree with RSCU-based clustering of RA strains. (A) The phylogenetic tree derived for the RA genomes using genomic BLAST by the neighbor-joining method; (B) Cluster analysis of the 12 RA strains based on RSCU values. The observed distances range from 1 to 25; the ratio of the rescaled distances within the dendrogram is the same as the ratio of the original squared Euclidean distances.
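The squared Euclidean distance named in the caption can be computed between per-strain RSCU vectors as in the sketch below. The function names and the toy strain labels and numbers are hypothetical and exist only to show the calculation; the rescaling to the 1 to 25 range and the dendrogram construction are done by the clustering software and are not reproduced here.

```python
from itertools import combinations


def squared_euclidean(u, v):
    """Sum of squared differences between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(u, v))


def pairwise_distances(profiles):
    """Map each unordered pair of strain names to its squared distance.

    profiles: {strain_name: [RSCU values in a fixed codon order]}
    """
    return {
        (a, b): squared_euclidean(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Toy three-codon profiles with made-up values, for illustration only.
toy = {
    "strain_a": [1.0, 2.0, 0.5],
    "strain_b": [1.0, 2.0, 1.5],
    "strain_c": [3.0, 0.0, 0.5],
}
distances = pairwise_distances(toy)
closest = min(distances, key=distances.get)
print(closest, distances[closest])
```

A hierarchical clustering method would start by merging the closest pair and then repeat on the reduced set of clusters.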
Figure 6. Frequency distribution of codon adaptation index (CAI) values for coding sequences (CDS) in the genomes of RA strains.
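CAI is conventionally the geometric mean of the relative adaptiveness of the codons in a gene, where each codon's weight is its frequency in a reference gene set divided by the frequency of the most used synonymous codon. The sketch below illustrates that definition under stated assumptions: the reference set used in the study is not given here, Met and Trp codons and stop codons are excluded, and unobserved codons in an observed family are counted as 0.5 to avoid a zero weight. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split an in-frame sequence into codons, dropping a partial tail."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def relative_adaptiveness(reference_seqs):
    """Return {codon: w} from a reference set of coding sequences."""
    counts = Counter()
    for seq in reference_seqs:
        counts.update(codons(seq))

    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa in "*MW":  # stop codons, Met and Trp carry no information
            continue
        families.setdefault(aa, []).append(codon)

    weights = {}
    for family in families.values():
        top = max(counts[c] for c in family)
        if top == 0:  # amino acid absent from the reference set
            continue
        for c in family:
            # Assumption: unobserved codons are counted as 0.5.
            weights[c] = max(counts[c], 0.5) / top
    return weights


def cai(seq, weights):
    """Geometric mean of the weights of the scored codons in seq."""
    logs = [math.log(weights[c]) for c in codons(seq) if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


w = relative_adaptiveness(["GCTGCTGCTGCA"])  # toy reference set
print(cai("GCTGCT", w), cai("GCTGCA", w))
```

A gene built only from the most frequent reference codons scores 1.0, and any use of less frequent synonymous codons lowers the score.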
Orthologous highly expressed genes found in the 12 RA strains.
| Category | Gene | Protein | Strains |
|---|---|---|---|
| Ribosome | | Large subunit ribosomal protein L1 | RA-CH-1, RA-GD, RA17 |
| | | Large subunit ribosomal protein L2 | + |
| | | Large subunit ribosomal protein L4 | + |
| | | Large subunit ribosomal protein L5 | + |
| | | Large subunit ribosomal protein L6 | + |
| | | Large subunit ribosomal protein L7/L12 | + |
| | | Large subunit ribosomal protein L9 | + |
| | | Large subunit ribosomal protein L10 | + |
| | | Large subunit ribosomal protein L11 | RA-CH-1 |
| | | Large subunit ribosomal protein L14 | + |
| | | Large subunit ribosomal protein L15 | + |
| | | Large subunit ribosomal protein L16 | RA-CH-2, CH3, ATCC11845 |
| | | Large subunit ribosomal protein L17 | + |
| | | Large subunit ribosomal protein L18 | + |
| | | Large subunit ribosomal protein L19 | + |
| | | Large subunit ribosomal protein L21 | + |
| | | Large subunit ribosomal protein L22 | + |
| | | Large subunit ribosomal protein L24 | + |
| | | Small subunit ribosomal protein S1 | + |
| | | Small subunit ribosomal protein S2 | + |
| | | Small subunit ribosomal protein S3 | + |
| | | Small subunit ribosomal protein S4 | + |
| | | Small subunit ribosomal protein S5 | RA-CH-1, CH3 |
| | | Small subunit ribosomal protein S7 | + |
| | | Small subunit ribosomal protein S8 | Except RA17, RA-GD |
| | | Small subunit ribosomal protein S9 | + |
| | | Small subunit ribosomal protein S11 | + |
| | | Small subunit ribosomal protein S15 | + |
| | | Small subunit ribosomal protein S16 | CH3 |
| | | Small subunit ribosomal protein S18 | + |
| Elongation factor | | Elongation factor Tu | + |
| | | Elongation factor G | + |
| | | Elongation factor Ts | + |
| Chaperone | | Molecular chaperone DnaK | Except CH3 |
| | | Chaperonin GroEL | + |
| | | Trigger factor | + |
| Enzymes | | Aconitate hydratase | + |
| | | Succinyl-CoA synthetase β subunit | + |
| | | Succinyl-CoA synthetase α subunit | + |
| | | Malate dehydrogenase | + |
| | | Glyceraldehyde 3-phosphate dehydrogenase | + |
| | | Cytochrome c oxidase cbb3-type subunit III | + |
| | | Cytochrome c peroxidase | + |
| | | F-type H+-transporting ATPase subunit α | RA-CH-1, RA-CH-2, RA17, ATCC11845 |
| | | F-type H+-transporting ATPase subunit b | Except CH3 |
| | | Nicotinamidase/pyrazinamidase | + |
| | | Nucleoside-diphosphate kinase | + |
| | | Alkyl hydroperoxide reductase/thiol specific antioxidant/mal allergen | + |
| | | Peptidyl-prolyl isomerase | + |
| | | Molybdopterin-containing oxidoreductase | RA-CH-1 |
| | | Catalase | RA153, ATCC11845, Yb2, RA-SG |
| | | Peroxiredoxin | Except RA153, RA17 |
| | | Succinate dehydrogenase/fumarate reductase | RA153, RA17, RA-SG, RA-YM |
| | | Enolase | + |
| | | Nitrite reductase | RA-CH-2, RA153, RA17, RA-GD |
| | | DNA adenine methylase | RA17, ATCC11845, RA-GD |
| | | TatD DNase family protein | RA-CH-1 |
| | | 4-Amino-4-deoxychorismate lyase | RA-JLLY |
| | | Alanine dehydrogenase | RA-JLLY |
| | | 3,4-Dihydroxy 2-butanone 4-phosphate synthase | RA-JLLY |
| | - | Peptidase s8 and s53 subtilisin kexin sedolisin | + |
| | - | Peptidase s46 | RA17 |
| | - | Putative FAD dependent oxidoreductase | RA17 |
| | - | Septum formation initiator | RA17 |
| | - | Serine protease | ATCC11845 |
| | - | Nodulation protein X acyltransferase 3 | ATCC11845 |
| Binding protein | - | Cyclic nucleotide-binding protein | Except RA17 |
| Transport protein | | Transcriptional regulator | RA17 |
| Apoptosis protein | | Cytochrome c | Except RA17, RA-JLLY |
| Structure protein | | Gliding motility protein gldl | RA17, ATCC11845 |
| | | Outer membrane protein | + |
| | | ompa/motb domain-containing protein | + |
| | | Ferritin | RA153, ATCC11845, Yb2, RCAD0122 |
| | | Histidine triad (HIT) family protein | CH3 |
| | - | Phosphate-selective porin o and p protein | RA-CH-1 |
+ indicates a gene found in all RA strains.
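The selection behind this table, genes that fall in the top 5% of CAI values and are shared across strains, can be sketched as a ranking step followed by a set intersection. This is only an illustration under assumptions: genes are keyed by a hypothetical ortholog-group identifier that is comparable across strains, and the function names and toy values are invented here.

```python
import math


def top_fraction(cai_by_gene, fraction=0.05):
    """Return the identifiers in the top `fraction` of CAI values."""
    if not cai_by_gene:
        return set()
    keep = max(1, math.ceil(len(cai_by_gene) * fraction))
    ranked = sorted(cai_by_gene, key=cai_by_gene.get, reverse=True)
    return set(ranked[:keep])


def shared_high_cai(cai_by_strain, fraction=0.05):
    """Identifiers that are in the top fraction in every strain.

    cai_by_strain: {strain_name: {ortholog_id: CAI}}
    """
    tops = [top_fraction(table, fraction) for table in cai_by_strain.values()]
    return set.intersection(*tops) if tops else set()


# Toy input with made-up CAI values for 20 hypothetical ortholog groups.
strain_x = {f"og{i:02d}": 0.50 + 0.01 * i for i in range(20)}
strain_y = dict(strain_x, og05=0.95)  # og05 outranks og19 in strain_y
print(shared_high_cai({"x": strain_x, "y": strain_y}, fraction=0.10))
```

Entries marked + in the table correspond to identifiers that survive the intersection over all 12 strains; the remaining rows list the strains in which a gene reached the top fraction.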
RA strains used in this study.
| Strain | Serotype | Geographic Location | Accession No. | CDS | CDS (>300 bp) |
|---|---|---|---|---|---|
| ATCC11845 | 6 | USA | CP003388 | 1941 | 1764 |
| RA-CH-1 | 1 | Sichuan | CP003787 | 2187 | 1953 |
| CH3 | 1 | Jiangsu | CP006649 | 2181 | 1916 |
| RA-YM | 1 | Hubei | AENH00000000 | 2010 | 1796 |
| RA-CH-2 | 2 | Sichuan | CP004020 | 2044 | 1844 |
| RA-GD | 2 | Guangdong | CP002562 | 1985 | 1815 |
| Yb2 | 2 | Jiangsu | CP007204 | 2021 | 1877 |
| RA153 | 2 | Fujian | CP007504 | 1919 | 1730 |
| RA17 | ND 1 | Fujian | CP007503 | 1656 | 1613 |
| RA-SG | ND 1 | Guangdong | ANGF00000000 | 2066 | 1838 |
| RA-JLLY | ND 1 | Hubei | LAVB01000000 | 2089 | 1858 |
| RCAD0122 | ND 1 | Guangdong | LUDU00000000 | 2149 | 1892 |
1 ND: Not determined.
Figure 7. Geographical locations of the RA strains analyzed in this study. The main regions of the duck industry in China are shown in dark gray. The provinces (regions) where the RA strains in this study originated are indicated in red.