Xianglong Yu, Jianxin Liu, Huizi Li, Boyang Liu, Bingqian Zhao, Zhangyong Ning.
Abstract
Atypical porcine pestivirus (APPV) is an emerging pestivirus that causes congenital tremor (CT) in piglets. Its worldwide distribution makes it a threat to global swine health. E2 is the major envelope glycoprotein of APPV and a crucial target for vaccine development. Given the genetic variability of APPV complete genomes and the E2 gene, and the lack of codon usage analyses for them, a comprehensive analysis of codon usage patterns was performed. Relative synonymous codon usage (RSCU) and effective number of codons (ENC) analyses showed relatively unstable codon usage and a slightly low codon usage bias (CUB) in APPV genomes. ENC-plot analysis and correlation analyses of nucleotide composition and ENC showed that both mutation pressure and natural selection affected the codon usage bias of APPV, and that natural selection had a more pronounced influence on the E2 gene than on complete genomes. Principal component analysis (PCA) and correlation analyses confirmed these results. Correlation analyses between Gravy and Aromaticity values and the codon bias showed that natural selection played an important role in shaping the synonymous codon bias. Furthermore, neutrality plot analysis showed that natural selection was the main force, and mutation pressure a minor force, influencing the codon usage pattern of the APPV E2 gene and complete genomes. These results illustrate the codon usage patterns of APPV genomes and provide valuable basic data for further fundamental research on the evolution of APPV.
Keywords: APPV; Codon usage bias; E2; Mutation pressure; Natural selection
Year: 2021 PMID: 33538926 PMCID: PMC7860996 DOI: 10.1007/s10528-021-10037-y
Source DB: PubMed Journal: Biochem Genet ISSN: 0006-2928 Impact factor: 1.890
Overall RSCU of collected sequences of the APPV E2 gene and complete genomes
| Amino acid | Codon | RSCU/E2 gene | RSCU/APPV complete genomes | RSCU/swine | Amino acid | Codon | RSCU/E2 gene | RSCU/APPV complete genomes | RSCU/swine |
|---|---|---|---|---|---|---|---|---|---|
| Phe | UUU | 0.98 | 0.95 | 0.7868 | Ser | AGC | 1.58 | 1.16 | |
| UUC | AGU | 0.78 | 1.13 | 0.7713 | |||||
| Leu | CUA | 1.19 | 0.3311 | UCA | 0.7226 | ||||
| CUC | 0.79 | 0.68 | 1.3475 | UCC | 0.67 | 1.05 | 1.5021 | ||
| CUG | 1.42 | UCG | 0.63 | 0.38 | 0.3897 | ||||
| CUU | 0.61 | 0.61 | 0.6506 | UCU | 0.72 | 0.96 | 0.9905 | ||
| UUA | 0.76 | 1.08 | 0.3195 | Cys | UGC | ||||
| UUG | 0.91 | 1.06 | 0.6738 | UGU | 0.55 | 0.94 | 0.7848 | ||
| Tyr | UAC | 0.98 | Pro | CCA | 0.9423 | ||||
| UAU | 0.75 | 0.7315 | CCC | 0.96 | 0.87 | ||||
| His | CAC | 0.86 | 0.93 | CCG | 0.76 | 0.60 | 0.5536 | ||
| CAU | 0.7054 | CCU | 0.72 | 1.05 | 1.0478 | ||||
| Gln | CAA | 0.81 | 0.441 | Arg | AGA | 1.1155 | |||
| CAG | 0.97 | AGG | 1.73 | 2.23 | 1.2347 | ||||
| Ile | AUA | 0.4208 | CGA | 0.45 | 0.41 | 0.6065 | |||
| AUC | 0.28 | 0.89 | CGC | 0.26 | 0.28 | ||||
| AUU | 0.52 | 0.71 | 0.9095 | CGG | 0.60 | 0.48 | 1.2888 | ||
| Asn | AAC | 0.62 | 0.97 | CGU | 0.46 | 0.32 | 0.444 | ||
| AAU | 0.7867 | Thr | ACA | 1.32 | 1.13 | 0.9145 | |||
| Lys | AAA | 0.7603 | ACC | 0.96 | |||||
| AAG | 0.76 | 0.84 | ACG | 0.25 | 0.42 | 0.5725 | |||
| Val | GUA | 0.89 | 1.05 | 0.3385 | ACU | 1.16 | 0.8327 | ||
| GUC | 0.59 | 1.16 | 1.0646 | Ala | GCA | 0.7386 | |||
| GUG | GCC | 1.12 | 1.33 | ||||||
| GUU | 0.39 | 0.45 | 0.5661 | GCG | 0.47 | 0.34 | 0.5057 | ||
| Asp | GAC | GCU | 0.71 | 0.97 | 0.9546 | ||||
| GAU | 0.48 | 0.79 | 0.8025 | Gly | GGA | 0.20 | 0.90 | 0.9117 | |
| Glu | GAA | 0.7256 | GGC | 1.58 | 0.91 | ||||
| GAG | 0.44 | 0.79 | GGG | 0.60 | 1.0541 | ||||
| GGU | 0.81 | 0.5698 | |||||||
RSCU values of the preferred codon for APPV complete genomes, E2 gene and swine are in bold italics
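The RSCU values tabulated above follow the usual definition: the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid. The following is a minimal sketch of that calculation, assuming the standard genetic code and excluding Met, Trp and stop codons; the function name `rscu` and the toy input are illustrative and are not taken from the paper.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (third position varies fastest). "*" marks stop codons.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(seq):
    """Return {codon: RSCU} for one in-frame coding sequence (DNA or RNA)."""
    seq = seq.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families, skipping stop codons.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for family in families.values():
        if len(family) < 2:
            continue  # Met and Trp have a single codon, so no synonymous choice
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            # observed count / (family total / family size)
            values[codon] = counts[codon] * len(family) / total
    return values


if __name__ == "__main__":
    # Toy sequence: two UUU and one UUC codon (both Phe).
    for codon, value in sorted(rscu("TTTTTTTTC").items()):
        print(codon, round(value, 2))
```

A value above 1 means the codon is used more often than expected under equal synonymous usage, and a value below 1 means it is used less often.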
Fig. 1 The relationship between the ENC values and GC3s. a ENC plot for the E2 gene. Most of the points lie below the standard curve, which indicates that mutation pressure and other factors both influenced the codon usage bias. b ENC plot for APPV complete genomes. All the points lie below the standard curve. The enlarged view is indicated by the arrow
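The standard curve in an ENC plot is usually the expected ENC when codon choice is constrained only by GC3s, commonly attributed to Wright (1990): ENC = 2 + s + 29 / (s² + (1 − s)²), with s = GC3s. This record does not state the formula, so the sketch below assumes that definition; the function names are illustrative.

```python
def expected_enc(gc3s: float) -> float:
    """Expected ENC if codon usage were shaped by GC3s composition alone."""
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


def is_below_curve(observed_enc: float, gc3s: float) -> bool:
    """True when a sequence plots below the standard curve."""
    return observed_enc < expected_enc(gc3s)


if __name__ == "__main__":
    # Hypothetical point, not a value from the paper.
    print(expected_enc(0.5))          # 60.5
    print(is_below_curve(50.0, 0.5))  # True
```

Points falling below the curve have a lower ENC than composition alone would predict, which is read as evidence that factors other than mutation pressure also contribute.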
Correlation analyses of nucleotide compositions and ENC
| | A3s | C3s | G3s | U3s | GC3s | ENC |
|---|---|---|---|---|---|---|
| APPV E2 gene | ||||||
| A | 0.6461** | 0.0714 | − 0.5759** | − 0.1172** | − 0.339** | − 0.2480** |
| C | − 0.0889** | 0.8625** | 0.0299** | − 0.7574** | 0.6941** | − 0.2337** |
| G | − 0.5992** | − 0.0304** | 0.7123** | − 0.0513** | 0.4425** | 0.3048** |
| U | 0.0810** | − 0.8248** | − 0.1977** | 0.8473** | − 0.7580** | 0.1465** |
| GC | − 0.5571** | 0.6176** | 0.5981** | − 0.6006** | 0.8749** | 0.0574** |
| APPV complete genomes | ||||||
| A | 0.9301** | − 0.5051** | − 0.8099** | 0.2740** | − 0.7504** | − 0.2900** |
| C | − 0.3246** | 0.7793** | 0.2978** | − 0.5237** | 0.5834** | − 0.1982** |
| G | − 0.7759** | 0.5740** | 0.9318** | − 0.5500** | 0.8696** | 0.1049** |
| U | 0.1622** | − 0.7779** | − 0.4608** | 0.7909** | − 0.6926** | 0.3658** |
| GC | − 0.6980** | 0.7904** | 0.7920** | − 0.6419** | 0.8930** | − 0.0258** |
The numbers represent correlation coefficient “r” values (**P < 0.01)
Fig. 2 Principal component analysis. a, b The distributions of the first 20 vectors by PCA for the APPV E2 gene and complete genomes, respectively. Columns represent the relative inertia and the curve represents the cumulative inertia. c, d PCA plots of the first axis against the second axis for the E2 gene and complete genomes, respectively. For the E2 gene, the first axis accounted for 23.23% of the data inertia and the second axis accounted for 17.86%. For APPV complete genomes, the first axis accounted for 34.67% and the second axis accounted for 19.69%. Different countries are represented by different colors
Summary of correlation between the first two axes and nucleotide constraints
| | APPV E2 gene, Axis 1 | APPV E2 gene, Axis 2 | APPV complete genomes, Axis 1 | APPV complete genomes, Axis 2 |
|---|---|---|---|---|
| A | − 0.0030* | − 0.0153* | − 0.3169* | − 0.6365* |
| C | − 0.3051 | 0.3043 | 0.1824 | 0.2489 |
| G | 0.0528* | − 0.0689* | 0.5853 | 0.4625 |
| U | 0.2357 | − 0.1947 | − 0.4854 | − 0.0467 |
| A3s | 0.4078** | 0.3526** | − 0.1074** | − 0.6649** |
| C3s | − 0.0234* | 0.3513* | 0.4701* | 0.1608* |
| G3s | − 0.3324* | − 0.1545* | 0.6694* | 0.3435* |
| U3s | − 0.0750* | − 0.4570* | − 0.7958* | 0.1549* |
| GC | − 0.1966** | 0.1720** | 0.4907** | 0.4449** |
| GC3s | − 0.2168** | 0.1853** | 0.6686** | 0.2729** |
| ENC | 0.7757** | − 0.1915** | 0.0179** | 0.3719** |
The numbers represent correlation coefficient “r” values (*P < 0.05, **P < 0.01)
Correlation analyses among AROMO, GRAVY, the first two axes, ENC, GC3s and GC
| | Axis 1 | Axis 2 | ENC | GC3s | GC |
|---|---|---|---|---|---|
| APPV E2 gene | |||||
| Gravy | 0.6174** | 0.1327** | 0.6185** | 0.1908** | 0.2003** |
| Aromo | 0.0551 | 0.4896 | − 0.1358** | − 0.1245** | − 0.2502** |
| APPV complete genomes | |||||
| Gravy | 0.2361 | − 0.0101 | 0.1027** | 0.1650** | − 0.0174** |
| Aromo | 0.3235 | 0.0396 | 0.0616** | 0.2976** | 0.1402** |
The numbers represent correlation coefficient “r” values (**P < 0.01)
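For readers unfamiliar with the two protein indices correlated above: GRAVY is conventionally the mean Kyte-Doolittle hydropathy of a protein, and aromaticity is the fraction of aromatic residues (Phe, Tyr, Trp). The sketch below assumes those conventional definitions, which this record does not spell out; the function names and the test peptides are illustrative.

```python
# Kyte-Doolittle hydropathy scale, one value per amino acid.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein: str) -> float:
    """Mean hydropathy over residues found in the scale."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper() if aa in KYTE_DOOLITTLE]
    if not scores:
        raise ValueError("no standard amino acids in sequence")
    return sum(scores) / len(scores)


def aromaticity(protein: str) -> float:
    """Fraction of residues that are Phe, Tyr or Trp."""
    residues = [aa for aa in protein.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("no standard amino acids in sequence")
    return sum(aa in "FYW" for aa in residues) / len(residues)


if __name__ == "__main__":
    # Hypothetical peptides, not sequences from the paper.
    print(round(gravy("AI"), 2))   # mean of 1.8 and 4.5
    print(aromaticity("FWYA"))     # three aromatic residues out of four
```

Both indices describe the encoded protein rather than the nucleotide sequence, which is why a significant correlation with codon usage measures is interpreted as a signature of selection.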
Fig. 3 Neutrality analysis with GC3s plotted against GC12s. a, b Neutrality analysis for the E2 gene and complete genomes, respectively. The straight line is the regression line, and the regression equation is shown on the plot
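In a neutrality plot, each sequence contributes one point (GC3, GC12), and GC12 is regressed on GC3. A slope near 1 suggests mutation pressure dominates, and a slope near 0 suggests natural selection dominates. The sketch below is a simplified illustration: it uses plain GC3 over all codons rather than GC3s (which excludes Met, Trp and stop codons), and the function names and data points are hypothetical.

```python
def gc_by_position(seq: str):
    """Return (GC12, GC3) for an in-frame coding sequence."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("sequence shorter than one codon")
    gc = [0, 0, 0]
    for i in range(usable):
        if seq[i] in "GC":
            gc[i % 3] += 1  # tally G/C by codon position
    n_codons = usable // 3
    gc12 = (gc[0] + gc[1]) / (2 * n_codons)
    gc3 = gc[2] / n_codons
    return gc12, gc3


def neutrality_regression(points):
    """Least-squares fit of GC12 (y) on GC3 (x); points are (gc12, gc3) pairs."""
    xs = [gc3 for _, gc3 in points]
    ys = [gc12 for gc12, _ in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("GC3 values are all identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical (GC12, GC3) points, not data from the paper.
    slope, intercept = neutrality_regression([(0.4, 0.2), (0.5, 0.4), (0.6, 0.6)])
    print(round(slope, 3), round(intercept, 3))
```

Under the usual reading, the slope approximates the relative contribution of mutation pressure and one minus the slope approximates the contribution of natural selection.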