Yue Gao, Yan Lu, Yang Song, Lan Jing.
Abstract
BACKGROUND: Codon usage bias occurs in many genomes and is shaped mainly by mutation and selection. Analyzing codon usage bias is a suitable strategy for identifying the principal evolutionary driving forces in different organisms. Sunflower (Helianthus annuus L.) is an annual crop cultivated worldwide as an ornamental, as a food plant, and for its valuable oil. WRKY family genes in plants play a central role in diverse regulatory processes and in responses to multiple stresses. Evolutionary analysis of the WRKY family genes of H. annuus can provide rich genetic information for developing hybridization resources in the genus Helianthus.
Keywords: Evolutionary forces; Helianthus annuus; Synonymous codon usage bias; WRKY transcription factors
Year: 2022 PMID: 35725374 PMCID: PMC9210703 DOI: 10.1186/s12863-022-01064-8
Source DB: PubMed Journal: BMC Genom Data ISSN: 2730-6844
Correlation coefficients of the indices influencing codon bias in HaWRKY genome
| Indices | T3s | C3s | A3s | G3s | GC3s | GC1 | GC2 | GC3 | GC | CAI | ENC |
|---|---|---|---|---|---|---|---|---|---|---|---|
| T3s | 1.000 | | | | | | | | | | |
| C3s | –0.601** | 1.000 | | | | | | | | | |
| A3s | 0.058 | –0.046 | 1.000 | | | | | | | | |
| G3s | –0.333** | –0.237* | –0.646** | 1.000 | | | | | | | |
| GC3s | –0.787** | 0.550** | –0.623** | 0.657** | 1.000 | | | | | | |
| GC1 | –0.132 | –0.058 | –0.267** | 0.175 | 0.196* | 1.000 | | | | | |
| GC2 | –0.569** | 0.327** | –0.436** | 0.248** | 0.594** | 0.316** | 1.000 | | | | |
| GC3 | –0.773** | 0.569** | –0.611** | 0.631** | 0.987** | 0.192* | 0.570** | 1.000 | | | |
| GC | –0.667** | 0.388** | –0.587** | 0.475** | 0.804** | 0.633** | 0.838** | 0.798** | 1.000 | | |
| CAI | 0.138 | 0.283** | –0.208* | –0.210* | 0.044 | 0.148 | 0.121 | 0.033 | 0.129 | 1.000 | |
| ENC | –0.350** | 0.328** | –0.070 | 0.119 | 0.332** | 0.276** | 0.207* | 0.331** | 0.358** | –0.032 | 1.000 |
Note: * P value < 0.05; ** P value < 0.01
Codon usage and high frequency used codons in HaWRKY genome
| Amino acid | Codon | Frequency | Number | RSCU |
|---|---|---|---|---|
| Ala (A) | GCC | 8.32 | 343 | 0.81 |
| | GCG | 7.03 | 290 | 0.69 |
| Cys (C) | UGC | 8.46 | 349 | 0.92 |
| Asp (D) | GAC | 12.17 | 502 | 0.56 |
| Glu (E) | GAG | 17.02 | 702 | 0.76 |
| Phe (F) | UUC | 13.29 | 548 | 0.82 |
| Gly (G) | GGC | 7.15 | 295 | 0.58 |
| | GGG | 10.67 | 440 | 0.86 |
| His (H) | CAC | 15.30 | 631 | 0.86 |
| Ile (I) | AUA | 15.76 | 650 | 0.91 |
| Lys (K) | AAG | 29.66 | 1223 | 0.90 |
| Leu (L) | CUA | 13.24 | 546 | 0.92 |
| | CUC | 12.27 | 506 | 0.85 |
| | CUG | 12.34 | 509 | 0.85 |
| | UUA | 14.02 | 578 | 0.97 |
| Met (M) | AUG | 30.22 | 1245 | 1 |
| Asn (N) | AAC | 24.91 | 1027 | 0.97 |
| Pro (P) | CCC | 8.20 | 338 | 0.55 |
| | CCU | 14.07 | 580 | 0.94 |
| Gln (Q) | CAG | 14.84 | 612 | 0.61 |
| Arg (R) | CGA | 8.68 | 358 | 0.82 |
| | CGC | 4.51 | 186 | 0.43 |
| | CGG | 9.56 | 394 | 0.90 |
| | CGU | 6.84 | 282 | 0.64 |
| Ser (S) | AGC | 12.47 | 514 | 0.78 |
| | UCC | 11.11 | 458 | 0.70 |
| | UCG | 12.15 | 501 | 0.76 |
| Thr (T) | ACG | 11.47 | 473 | 0.67 |
| | ACU | 15.28 | 630 | 0.89 |
| Val (V) | GUA | 10.02 | 413 | 0.73 |
| | GUC | 8.95 | 369 | 0.65 |
| Trp (W) | UGG | 13.56 | 559 | 1 |
| Tyr (Y) | UAC | 11.47 | 473 | 0.90 |
| Terminator | UAA | 5.53 | 228 | |
| | UAG | 5.09 | 210 | |
Note: Codons used at the highest frequency (RSCU > 1) are in bold. RSCU: relative synonymous codon usage
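The note defines RSCU only by name. A minimal sketch, assuming the usual definition (observed codon count divided by the mean count across the codons of the same synonymous family), is given here. The function name `rscu` is illustrative, and the example counts are the Asp counts listed for highly expressed genes in the high/low expression table of this record.

```python
def rscu(counts):
    """Return the RSCU of each codon in ONE synonymous family.

    counts: dict mapping codon -> observed count for a single amino acid.
    Assumed definition: RSCU = count / (family total / family size).
    """
    total = sum(counts.values())
    family_size = len(counts)
    if total == 0:
        # No observations for this amino acid: RSCU is undefined, report 0.0.
        return {codon: 0.0 for codon in counts}
    return {codon: family_size * n / total for codon, n in counts.items()}


# Asp counts in highly expressed genes, taken from the expression table.
asp = rscu({"GAU": 92, "GAC": 21})
print({codon: round(value, 2) for codon, value in asp.items()})
```

Under this definition the RSCU values of a family sum to the number of synonymous codons, so a value above 1 marks a codon used more often than expected under equal usage.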
The codons statistics with high and low expression genes of HaWRKY genome
| Amino acid | Codon | Highly expressed genes: Number | Highly expressed genes: RSCU | Lowly expressed genes: Number | Lowly expressed genes: RSCU | ΔRSCU |
|---|---|---|---|---|---|---|
| Ala (A) | GCU | 24 | 1.68 | 39 | 1.25 | 0.43 |
| | GCC | 11 | 0.77 | 36 | 1.15 | -0.38 |
| | GCG | 2 | 0.14 | 19 | 0.61 | -0.47 |
| Cys (C) | UGU | 24 | 1.37 | 21 | 1.2 | 0.17 |
| | UGC | 11 | 0.63 | 14 | 0.8 | -0.17 |
| Asp (D) | GAU | 92 | 1.63 | 83 | 1.18 | 0.45 |
| | GAC | 21 | 0.37 | 58 | 0.82 | -0.45 |
| Glu (E) | GAA | 77 | 1.34 | 82 | 1.13 | 0.21 |
| | GAG | 38 | 0.66 | 63 | 0.87 | -0.21 |
| Phe (F) | UUU | 40 | 1.29 | 43 | 1.19 | 0.1 |
| | UUC | 22 | 0.71 | 29 | 0.81 | -0.1 |
| Gly (G) | GGU | 32 | 1.66 | 46 | 1.45 | 0.21 |
| | GGC | 7 | 0.36 | 20 | 0.63 | -0.27 |
| | GGA | 24 | 1.25 | 37 | 1.17 | 0.08 |
| | GGG | 14 | 0.73 | 24 | 0.76 | -0.03 |
| His (H) | CAU | 36 | 1.6 | 43 | 1.13 | 0.47 |
| | CAC | 9 | 0.4 | 33 | 0.87 | -0.47 |
| Ile (I) | AUU | 29 | 1.21 | 33 | 1.09 | 0.12 |
| | AUC | 28 | 1.17 | 32 | 1.05 | 0.12 |
| | AUA | 15 | 0.63 | 26 | 0.86 | -0.23 |
| Lys (K) | AAA | 72 | 1.04 | 90 | 1.15 | -0.11 |
| | AAG | 67 | 0.96 | 66 | 0.85 | 0.11 |
| Leu (L) | UUA | 27 | 1.4 | 25 | 1.15 | 0.25 |
| | UUG | 23 | 1.19 | 30 | 1.38 | -0.19 |
| | CUU | 31 | 1.6 | 27 | 1.25 | 0.35 |
| | CUC | 10 | 0.52 | 19 | 0.88 | -0.36 |
| | CUA | 20 | 1.03 | 23 | 1.06 | -0.03 |
| | CUG | 5 | 0.26 | 6 | 0.28 | -0.02 |
| Met (M) | AUG | 46 | 1 | 53 | 1 | 0 |
| Asn (N) | AAU | 52 | 1.21 | 65 | 1.01 | 0.2 |
| | AAC | 34 | 0.79 | 64 | 0.99 | -0.2 |
| Pro (P) | CCU | 34 | 1.27 | 44 | 1.06 | 0.21 |
| | CCC | 14 | 0.52 | 27 | 0.65 | -0.13 |
| | CCA | 45 | 1.68 | 54 | 1.3 | 0.38 |
| | CCG | 14 | 0.52 | 41 | 0.99 | -0.47 |
| Gln (Q) | CAA | 69 | 1.57 | 88 | 1.49 | 0.08 |
| | CAG | 19 | 0.43 | 30 | 0.51 | -0.08 |
| Arg (R) | AGA | 52 | 4 | 26 | 1.33 | 2.67 |
| | AGG | 13 | 1 | 30 | 1.54 | -0.54 |
| | CGU | 3 | 0.23 | 15 | 0.77 | -0.54 |
| | CGC | 0 | 0 | 7 | 0.36 | -0.36 |
| | CGA | 6 | 0.46 | 21 | 1.08 | -0.62 |
| | CGG | 4 | 0.31 | 18 | 0.92 | -0.61 |
| Ser (S) | | | | | | |
| | AGC | 18 | 0.58 | 31 | 0.68 | -0.1 |
| | UCU | 39 | 1.25 | 68 | 1.49 | -0.24 |
| | UCC | 20 | 0.64 | 40 | 0.88 | -0.24 |
| | UCA | 39 | 1.25 | 68 | 1.49 | -0.24 |
| | UCG | 25 | 0.8 | 32 | 0.7 | 0.1 |
| Thr (T) | | | | | | |
| | ACC | 26 | 1.05 | 56 | 1.29 | -0.24 |
| | ACA | 34 | 1.37 | 44 | 1.02 | 0.35 |
| | ACG | 5 | 0.2 | 30 | 0.69 | -0.49 |
| Val (V) | GUU | 37 | 1.41 | 44 | 1.43 | -0.02 |
| | GUC | 14 | 0.53 | 22 | 0.72 | -0.19 |
| | GUA | 23 | 0.88 | 17 | 0.55 | 0.33 |
| | GUG | 31 | 1.18 | 40 | 1.3 | -0.12 |
| Trp (W) | UGG | 16 | 1 | 21 | 1 | 0 |
| Tyr (Y) | UAU | 34 | 1.24 | 39 | 1.07 | 0.17 |
| | UAC | 21 | 0.76 | 34 | 0.93 | -0.17 |
| Terminator | UAA | 5 | 2.5 | 3 | 1.29 | 1.21 |
| | UAG | 0 | 0 | 2 | 0.86 | -0.86 |
| | UGA | 1 | 0.5 | 2 | 0.86 | -0.36 |
Note: Optimal codons (ΔRSCU ≥ 0.3, with RSCU > 1 in high-bias genes, RSCU < 1 in low-bias genes) are in bold
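The ΔRSCU column and the criterion in the note can be expressed as a short sketch. It assumes ΔRSCU is the RSCU in the high group minus the RSCU in the low group (which matches the tabulated values) and reads the note's three conditions literally. The function names and the final pair of RSCU values are hypothetical, chosen only to illustrate the rule.

```python
def delta_rscu(rscu_high, rscu_low):
    """ΔRSCU, assumed to be RSCU(high group) minus RSCU(low group).

    Rounded to two decimals to match the table and avoid float noise.
    """
    return round(rscu_high - rscu_low, 2)


def is_optimal(rscu_high, rscu_low, min_delta=0.3):
    """Literal reading of the note: ΔRSCU >= 0.3, RSCU > 1 in the high
    group, and RSCU < 1 in the low group."""
    delta = delta_rscu(rscu_high, rscu_low)
    return delta >= min_delta and rscu_high > 1 and rscu_low < 1


# Ala GCU from the table: RSCU 1.68 (high) and 1.25 (low).
print(delta_rscu(1.68, 1.25))

# Hypothetical RSCU values, not taken from the table.
print(is_optimal(1.40, 0.90))
```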
Fig. 1 Neutrality plot analysis in HaWRKY genome. Note: GC12, the average G + C content at the first and second codon positions; GC3, the G + C content at the third codon positions
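The two axes of the neutrality plot follow directly from the definitions in the caption. A minimal sketch, assuming an in-frame coding sequence and that no codons are excluded (some tools drop Met, Trp, and stop codons, which this sketch does not do); the function name `gc_by_position` and the toy sequence are illustrative.

```python
def gc_by_position(cds):
    """Return GC1, GC2, GC3 and GC12 for an in-frame coding sequence.

    Accepts DNA or RNA letters; trailing bases that do not complete a
    codon are ignored.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("sequence must contain at least one full codon")
    fractions = []
    for offset in range(3):
        bases = seq[offset:usable:3]  # every base at this codon position
        fractions.append(sum(base in "GC" for base in bases) / len(bases))
    gc1, gc2, gc3 = fractions
    return {"GC1": gc1, "GC2": gc2, "GC3": gc3, "GC12": (gc1 + gc2) / 2}


# Toy sequence with codons ATG, GCC, TAA.
print(gc_by_position("ATGGCCTAA"))
```

Each gene contributes one point, with GC3 on the horizontal axis and GC12 on the vertical axis.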
Fig. 2 ENC-plot analysis in HaWRKY genome. The solid curve represents the expected positions of genes when codon usage is determined only by the GC3s composition. Note: ENC, effective number of codons; GC3s, the G + C content at the third position of synonymous codons
Fig. 3 Frequency distribution of (ENCexp - ENCobs)/ENCexp. ENCexp is the expected ENC value and ENCobs is the observed ENC value
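The expected curve of the ENC-plot and the ratio plotted in Fig. 3 can be sketched together. The captions do not give the expression for the expected ENC, so this assumes the commonly used Wright (1990) form, ENCexp = 2 + s + 29 / (s² + (1 − s)²), with s equal to GC3s. The function names are illustrative.

```python
def expected_enc(gc3s):
    """Expected ENC when codon usage depends only on GC3s.

    Assumes Wright's expression; gc3s is a fraction between 0 and 1.
    """
    s = gc3s
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)


def enc_deviation(enc_obs, gc3s):
    """(ENCexp - ENCobs) / ENCexp for one gene."""
    enc_exp = expected_enc(gc3s)
    return (enc_exp - enc_obs) / enc_exp


print(expected_enc(0.5))         # value of the curve at GC3s = 0.5
print(enc_deviation(48.4, 0.5))  # hypothetical observed ENC of 48.4
```

A deviation near zero places a gene on the expected curve, while larger positive values indicate an observed ENC below the expectation from GC3s alone.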
Fig. 4 Analysis of PR2-plot in HaWRKY genome. The mean value of A3/(A3 + T3) is 0.4678, and that of G3/(G3 + C3) is 0.5033. The lines mark the center value of 0.5 on each axis. Note: A3/(A3 + T3), the ratio of A against A + T at the third position of codons; G3/(G3 + C3), the ratio of G against G + C at the third position of codons
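The two PR2 ratios in the caption can be computed from third codon positions as follows. This is a sketch under the assumption that every codon of an in-frame sequence is counted; PR2 analyses are often restricted to fourfold degenerate codon families, and the caption does not say which convention was used. The function name `pr2_ratios` and the toy sequence are illustrative.

```python
def pr2_ratios(cds):
    """Return (A3/(A3+T3), G3/(G3+C3)) for an in-frame coding sequence.

    A ratio is None when its denominator is zero.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    third = seq[2:usable:3]  # bases at the third position of each codon
    a3, t3, g3, c3 = (third.count(base) for base in "ATGC")
    at_ratio = a3 / (a3 + t3) if (a3 + t3) else None
    gc_ratio = g3 / (g3 + c3) if (g3 + c3) else None
    return at_ratio, gc_ratio


# Toy sequence of four Ala codons: GCA, GCA, GCU, GCG.
print(pr2_ratios("GCAGCAGCUGCG"))
```

Values of 0.5 on both axes correspond to equal use of A and T and of G and C at the third position.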