Yicong Li1, Rui Wang2, Huihui Wang1, Feiyang Pu1, Xili Feng1, Li Jin1, Zhongren Ma1, Xiao-Xia Ma1.
Abstract
Synonymous codon usage bias is a universal characteristic of genomes across organisms. Autophagy-related gene 13 (atg13) is essential for autophagy initiation, yet the evolutionary trends of the atg13 gene in nucleotide usage and synonymous codon usage remain unexplored. Phylogenetic analyses of the atg13 gene from 226 eukaryotic organisms at the nucleotide and amino acid levels show that nucleotide usage carries more genetic information than amino acid usage. Specifically, the overall nucleotide usage bias, quantified by information entropy, indicated that the usage biases at the first and second codon positions were stronger than that at the third position of the atg13 genes. Furthermore, the usage bias of nucleotide 'G' is the highest, while that of nucleotide 'C' is the lowest in the atg13 genes. In addition, the genetic features represented by synonymous codon usage exhibit, to some extent, a species-specific pattern in the evolution of the atg13 genes. Interestingly, the codon usages of atg13 genes in the ancestral animals (Latimeria chalumnae, Petromyzon marinus, and Rhinatrema bivittatum) are strongly influenced by mutation pressure from nucleotide composition constraint. However, the distributions of nucleotide composition at different codon positions in the atg13 gene indicate that natural selection still dominates atg13 codon usage during the evolution of organisms.
Keywords: autophagy-related gene 13; nucleotide composition distribution; nucleotide usage; phylogenetic analyses; synonymous codon usage
Year: 2021 PMID: 34804999 PMCID: PMC8602353 DOI: 10.3389/fcimb.2021.771010
Source DB: PubMed Journal: Front Cell Infect Microbiol ISSN: 2235-2988 Impact factor: 5.293
Figure 1. Phylogenetic analysis of atg13 genes of 226 eukaryotic organisms. This phylogenetic tree was constructed using the Maximum Composite Likelihood model with the UPGMA statistical method at the nucleotide level. Branches leading to these groups of sequences were supported by 100% of bootstrap replications in all cases.
Figure 2. Phylogenetic analysis of ATG13 protein sequences of 226 eukaryotic organisms. This phylogenetic tree was constructed using the Maximum Composite Likelihood model with the UPGMA statistical method at the nucleotide level. Branches leading to these groups of sequences were supported by 100% of bootstrap replications in all cases.
Figure 3. Patterns of nucleotide usage bias for atg13 genes of 226 eukaryotic organisms, represented by information entropy. 'N' denotes the overall nucleotide usage bias across the complete gene; 'N1', 'N2', and 'N3' denote the nucleotide usage bias at the first, second, and third codon positions, respectively. The red dashed box highlights the outliers, which correspond to the atg13 genes of specific organisms. *** means p-value < 0.001.
Figure 4. Usage bias for each nucleotide (A, T, G, or C) in the atg13 genes of 226 eukaryotic organisms, represented by information entropy. The red dashed box highlights the outliers, which correspond to the atg13 genes of specific organisms. *** means p-value < 0.001.
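The record does not spell out how information entropy was computed for Figures 3 and 4. As a minimal sketch, assuming plain Shannon entropy over nucleotide frequencies (lower entropy meaning a stronger usage bias), the whole-gene and per-codon-position values could be obtained as follows. The function names and the toy sequence are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy in bits of the symbol frequencies in `symbols`."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def positional_entropy(cds):
    """Entropy of the whole coding sequence (N) and of codon positions 1-3.

    Assumption: the sequence is in frame; trailing bases that do not
    complete a codon are ignored.
    """
    cds = cds.upper()
    usable = cds[: len(cds) - len(cds) % 3]
    return {
        "N": shannon_entropy(usable),
        "N1": shannon_entropy(usable[0::3]),
        "N2": shannon_entropy(usable[1::3]),
        "N3": shannon_entropy(usable[2::3]),
    }


# Toy in-frame sequence (not real atg13 data): ATG GCT GCA GCG
result = positional_entropy("ATGGCTGCAGCG")
print(result)
```

In this toy sequence the third position uses three different nucleotides while the first and second use only two, so N3 has the highest entropy, which corresponds to the weakest bias under this assumption.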
The dispersion magnitude of synonymous codon usages for atg13 genes across all species in this study.
| Codon (amino acid) | Dispersion | Codon (amino acid) | Dispersion | Codon (amino acid) | Dispersion |
|---|---|---|---|---|---|
| TTT(F) | 0.158 | CCT(P) | 0.180 | AAA(K) | 0.237 |
| TTC(F) | 0.430 | CCC(P) | 0.234 | AAG(K) | 0.183 |
| TTA(L) | 0.480 | CCA(P) | 0.290 | GAT(D) | 0.227 |
| TTG(L) | 0.339 | CCG(P) | 0.903 | GAC(D) | 0.162 |
| CTT(L) | 0.265 | ACT(T) | 0.250 | GAA(E) | 0.220 |
| CTC(L) | 0.267 | ACC(T) | 0.223 | GAG(E) | 0.197 |
| CTA(L) | 0.718 | ACA(T) | 0.339 | TGT(C) | 0.249 |
| CTG(L) | 0.242 | ACG(T) | 0.574 | TGC(C) | 0.544 |
| ATT(I) | 0.275 | GCT(A) | 0.203 | CGT(R) | 0.300 |
| ATC(I) | 0.304 | GCC(A) | 0.245 | CGC(R) | 0.883 |
| ATA(I) | 0.368 | GCA(A) | 0.337 | CGA(R) | 0.496 |
| GTT(V) | 0.244 | GCG(A) | 1.093 | CGG(R) | 0.571 |
| GTC(V) | 0.274 | TAT(Y) | 0.306 | AGA(R) | 0.351 |
| GTA(V) | 0.496 | TAC(Y) | 0.222 | AGG(R) | 0.264 |
| GTG(V) | 0.196 | CAT(H) | 0.296 | GGT(G) | 0.370 |
| TCT(S) | 0.188 | CAC(H) | 0.287 | GGC(G) | 0.206 |
| TCC(S) | 0.215 | CAA(Q) | 0.473 | GGA(G) | 0.303 |
| TCA(S) | 0.293 | CAG(Q) | 0.159 | GGG(G) | 0.286 |
| TCG(S) | 0.966 | AAT(N) | 0.266 | | |
| AGT(S) | 0.428 | AAC(N) | 0.244 | | |
| AGC(S) | 0.206 | | | | |
Some synonymous codons are absent from atg13 genes from different species.
| Synonymous codon/Amino acid | Species |
|---|---|
| UUA(L) | |
| CUA(L) | |
| CUG(L) | |
| AUU(I) | |
| AUA(I) | |
| GUC(V) | |
| GUA(V) | |
| UCG(S) | |
| CCG(P) | |
| ACG(T) | |
| GCG(A) | |
| UAU(Y) | |
| UGU(C) | |
| UGC(C) | |
| CGC(R) | |
| CGA(R) | |
| CGG(R) | |
| AGA(R) | |
Figure 5. The relationship between ENC and GC content at the third synonymous codon position (GC3) in atg13 genes of 226 eukaryotic organisms. The continuous curve displays the expected codon usage if GC compositional constraints alone account for codon usage bias. The red dashed lines highlight the distribution of atg13 genes in this plot of ENC vs. GC3 content. In addition, the red dashed box highlights the outliers, which correspond to the atg13 genes of specific organisms.
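The expected curve in an ENC vs. GC3 plot is conventionally drawn with Wright's approximation, ENC = 2 + s + 29 / (s² + (1 − s)²), where s is the GC3 fraction. The sketch below assumes that convention applies here, which the record does not state explicitly. It also assumes that GC3 is measured at synonymous third positions by skipping Met (ATG), Trp (TGG), and stop codons. All names and the toy sequence are hypothetical.

```python
# Codons with no synonymous alternative, plus stop codons, are skipped
# when measuring GC content at synonymous third positions (assumption).
EXCLUDED = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(cds):
    """GC fraction at the third position of synonymous codons."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    third = [c[2] for c in codons if c not in EXCLUDED]
    if not third:
        return float("nan")
    return sum(base in "GC" for base in third) / len(third)


def expected_enc(s):
    """Expected ENC when GC3 composition alone shapes codon usage."""
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)


# Toy in-frame sequence (not real atg13 data): ATG GCT GCA GCG
s = gc3s("ATGGCTGCAGCG")
print(s, expected_enc(s))
```

A gene whose observed ENC lies well below `expected_enc(s)` for its GC3 value is usually read as being shaped by factors other than compositional constraint alone.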
Figure 6. Parity rule 2 bias plot for atg13 genes of 226 eukaryotic organisms.
Figure 7. Overall codon usage of atg13 genes of 226 eukaryotic organisms, visualized by PCA. Red dots denote mammals, blue dots Sauropsida, purple dots fishes, green dots amphibians, and black dots the remaining eukaryotic organisms in this study.
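A parity rule 2 (PR2) bias plot conventionally places each gene at G3/(G3+C3) on the x-axis and A3/(A3+T3) on the y-axis, counted at the third position of fourfold-degenerate codon families. The point (0.5, 0.5) indicates no bias between complementary nucleotides. The sketch below follows that convention as an assumption, since the record does not give the exact procedure. The names and the toy sequence are hypothetical.

```python
from collections import Counter

# First two bases of the fourfold-degenerate codon families in the
# standard genetic code: Ala, Arg (CGN), Gly, Leu (CTN), Pro, Ser (TCN),
# Thr, and Val.
FOURFOLD_PREFIXES = {"GC", "CG", "GG", "CT", "CC", "TC", "AC", "GT"}


def pr2_point(cds):
    """Return (G3/(G3+C3), A3/(A3+T3)) over fourfold-degenerate codons."""
    cds = cds.upper()
    counts = Counter()
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES:
            counts[codon[2]] += 1

    def ratio(a, b):
        total = counts[a] + counts[b]
        return counts[a] / total if total else float("nan")

    return ratio("G", "C"), ratio("A", "T")


# Toy in-frame sequence (not real atg13 data): GCT GCA GCG GGC CTT
x, y = pr2_point("GCTGCAGCGGGCCTT")
print(x, y)
```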
The correlations between the overall nucleotide skew and the nucleotide skew at the specific codon position at the gene level.
| GC1 skew | GC2 skew | GC3 skew |
|---|---|---|
| r=0.398*** | r=0.448*** | r=0.606*** |
| r=0.071 | r=0.587*** | r=0.888*** |
| r=0.862*** | r=0.638*** | r=0.939*** |
| r=0.859*** | r=0.681*** | r=-0.293*** |
| r=0.669*** | r=0.618*** | r=0.956*** |
| r=0.834*** | r=0.559*** | r=0.971*** |
*** p < 0.001; NS, not significant (p > 0.05).
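The correlations above relate a gene's overall nucleotide skew to its skew at each codon position. As a minimal sketch, assuming the common definition skew = (X − Y) / (X + Y) (for example, GC skew = (G − C) / (G + C)) and a plain Pearson correlation across genes, the calculation could look like this. The definitions are assumptions rather than the study's stated method, and the function names and toy sequences are hypothetical.

```python
import math


def skew(seq, x="G", y="C"):
    """Nucleotide skew (x - y) / (x + y); NaN if neither base occurs."""
    nx, ny = seq.count(x), seq.count(y)
    return (nx - ny) / (nx + ny) if nx + ny else float("nan")


def positional_skews(cds, x="G", y="C"):
    """Skew over the whole in-frame sequence and at codon positions 1-3."""
    cds = cds.upper()
    usable = cds[: len(cds) - len(cds) % 3]
    return {
        "overall": skew(usable, x, y),
        "pos1": skew(usable[0::3], x, y),
        "pos2": skew(usable[1::3], x, y),
        "pos3": skew(usable[2::3], x, y),
    }


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)


# Toy in-frame sequences (not real atg13 data).
toy_genes = ["GCGGCCGGG", "GCCGGCCCG", "GGGGCGCCC"]
skews = [positional_skews(g) for g in toy_genes]
r_pos3 = pearson_r([s["overall"] for s in skews], [s["pos3"] for s in skews])
print(skews[0], r_pos3)
```

Passing `x="A", y="T"` gives the corresponding AT skew with the same functions.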