Abstract
Overlapping genes represent an intriguing puzzle, as they encode two proteins whose ability to evolve is constrained by each other. Overlapping genes can undergo "symmetric evolution" (similar selection pressures on the two proteins) or "asymmetric evolution" (significantly different selection pressures on the two proteins). By sequence analysis of 75 pairs of homologous viral overlapping genes, I evaluated their accordance with one or the other model. Analysis of nucleotide and amino acid sequences revealed that half of the overlaps undergo asymmetric evolution, as the protein from one frame shows significantly more substitutions than the protein from the other frame. Interestingly, the most variable protein (often known to interact with host proteins) appeared to be encoded by the de novo frame in all cases examined. These findings suggest that overlapping genes, besides increasing the coding capacity of viruses, are also a source of selective protein adaptation.
Keywords: Ancestral frame; De novo frame; Homologs; Non-synonymous nucleotide substitution; Selection pressure; Synonymous nucleotide substitution; Virus adaptation
Year: 2019 PMID: 31004987 PMCID: PMC7125799 DOI: 10.1016/j.virol.2019.03.017
Source DB: PubMed Journal: Virology ISSN: 0042-6822 Impact factor: 3.616
Fig. 1. Orientation of overlapping genes, with the downstream frame shifted by one nucleotide 3′ with respect to the upstream frame. There are 3 types of codon position (cp): cp13 (bold character), in which the first position of the upstream frame overlaps the third position of the downstream frame; cp21 (underlined character), in which the second position of the upstream frame overlaps the first position of the downstream frame; cp32 (italic character), in which the third position of the upstream frame overlaps the second position of the downstream frame. Based on the genetic code, a nucleotide substitution at the first codon position causes an amino acid change in 95.4% of cases, at the second codon position in 100% of cases, and at the third codon position in 28.4% of cases. Thus, nucleotide substitutions at the codon positions “13” and “32” are usually non-synonymous in one frame and synonymous in the other. Nucleotide substitutions at the codon position “21” are almost all non-synonymous in both frames.
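The percentages quoted in the caption can be reproduced from the standard genetic code. The sketch below is illustrative only and rests on two assumptions that the caption does not state: only sense codons are used as starting points, and substitutions that create a stop codon are left out of the count. The function name `nonsynonymous_percent` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nonsynonymous_percent(position):
    """Percent of single-nucleotide substitutions at one codon position
    (0, 1 or 2) that change the encoded amino acid.

    Assumption: stop codons are skipped as starting codons, and
    substitutions that produce a stop codon are not counted.
    """
    changed = total = 0
    for codon, aa in CODE.items():
        if aa == "*":
            continue  # skip stop codons as starting points
        for base in BASES:
            if base == codon[position]:
                continue  # not a substitution
            mutant = codon[:position] + base + codon[position + 1:]
            if CODE[mutant] == "*":
                continue  # sense-to-stop changes excluded (assumption)
            total += 1
            changed += CODE[mutant] != aa
    return 100 * changed / total


for pos in range(3):
    print(f"codon position {pos + 1}: {nonsynonymous_percent(pos):.1f}%")
```

Under these assumptions the script prints 95.4%, 100.0% and 28.4% for the first, second and third codon positions, matching the figures in the caption.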
Fig. 2. Analysis of the amino acid diversity in the 75 pairs of homologous overlapping genes. Each pair of columns shows: i) the percent amino acid identity between the protein encoded by the upstream frame of the overlap and that encoded by the homolog (dark column); ii) the percent amino acid identity between the protein encoded by the downstream frame of the overlap (shifted by one nucleotide 3′ with respect to the upstream frame) and that encoded by the homolog (gray column). The horizontal line separates well-conserved homologous pairs (aa identity >50%) from not well-conserved homologous pairs (aa identity <50%). (A) Subset of the 37 overlapping genes under symmetric evolution. (B) Subset of the 38 overlapping genes under asymmetric evolution. The numbering of overlapping genes is in accordance with that given in Supplementary Table S1. The underlined numbers indicate the overlaps in which the pattern of symmetric evolution (4 cases out of 37) or that of asymmetric evolution (6 cases out of 38) was not confirmed by chi-square analysis of the nucleotide diversity.
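As an illustration of the quantity plotted in Fig. 2, percent amino acid identity between two aligned proteins might be computed as in the sketch below. It assumes the two sequences are already aligned to equal length and that gapped columns are ignored; neither detail is specified here. The function names and the toy sequences are hypothetical, and a pair at exactly 50% identity (which the caption does not classify) is placed in the lower group by choice.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of aligned, ungapped columns carrying the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where neither sequence has a gap (assumption).
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in pairs)
    return 100 * matches / len(pairs)


def conservation_class(identity, threshold=50.0):
    """Apply the 50% identity line drawn in Fig. 2."""
    return "well conserved" if identity > threshold else "not well conserved"


# Toy example with made-up sequences: 6 of 8 residues match.
identity = percent_identity("MKTAYIAK", "MKSAYLAK")
print(identity, conservation_class(identity))
```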
List of the 32 overlapping genes evolving in accordance with the asymmetric model.
| Genome ac. number (homolog) | Overlapping gene | Chi-square analysis of amino acid substitutions | Chi-square analysis of nucleotide substitutions | Most variable protein |
|---|---|---|---|---|
| NC_001366 (EU542581) | polyprotein/L* | 11.60 | 6.79 | L* |
| NC_004102 (JQ061474) | polyprotein/F (ARFP) | 36.27 | 25.11 | F |
| NC_002021 (CY109232) | RdRp (subunit PB1)/PB1-F2 | 32.36 | 20.61 | PB1-F2 |
| NC_002022 (KY614903) | RdRp (subunit PA)/PA-X | 6.83 | 8.06 | PA-X |
| NC_001498 (KM089831) | V/phosphoprotein (P) | 10.80 | 8.60 | phosphoprotein |
| NC_001552 (KF687311) | phosphoprotein (P)/C′ | 18.52 | 9.01 | phosphoprotein |
| NC_024473 (JX121105) | phosphoprotein (P)/C′ | 4.50 | 4.35 | C′ |
| NC_008311 (JQ658375) | capsid protein (VP1)/VF1 | 20.57 | 17.50 | VF1 |
| NC_003627 (JX286709) | capsid protein/p31 | 6.07 | 4.85 | p31 |
| NC_002568 (MSBMVCCG) | Px/polyprotein P2ab (protease domain) | 7.33 | 4.03 | Px |
| NC_001749 (KC310737) | 2B*/polyprotein | 8.85 | 11.72 | 2B* |
| NC_006008 (KP821839) | capsid protein (VP6)/NS4 | 20.49 | 15.07 | capsid protein |
| NC_001409 (NC_006946) | movement protein/capsid protein | 9.50 | 3.91 | movement protein |
| NC_001749 (EU553489) | movement protein (36 kd)/polyprotein (linker domain) | 113.83 | 69.34 | polyprotein (linker domain) |
| NC_005224 (NC_005227) | nucleocapsid protein/non-structural protein NSs | 5.72 | 3.85 | non-structural protein NSs |
| NC_001427 (NC_015396) | capsid protein (VP2)/apoptin (VP3) | 6.26 | 4.96 | apoptin |
| NC_004674 (KC795968) | replication associated protein (Rep, AC1)/AC4 | 12.88 | 11.82 | AC4 |
| NC_001412 (NC_015051) | movement protein (V3)/V2 | 4.56 | 4.57 | movement protein |
| NC_001401 (KP733795) | capsid protein (VP1)/AAP (Assembly Activating Protein) | 10.87 | 5.80 | AAP |
| NC_001401 (AY530620) | capsid protein (VP1)/X protein | 9.20 | 13.09 | X protein |
| NC_014126 (KU885997) | p130/replicase (p104) | 102.76 | 104.30 | p130 |
| NC_001554 (NC_007729) | p19/p22 | 6.60 | 6.75 | p19 |
| NC_003608 (DQ392986) | p28/p23 | 15.09 | 11.90 | p23 |
| NC_003608 (DQ392986) | capsid protein/p25 | 15.00 | 14.62 | p25 |
| NC_004366 (NC_027710) | movement protein (ORF3)/movement protein (ORF4) | 40.04 | 33.30 | ORF3 |
| NC_004063 (JQ001816) | movement protein (p69)/replicase | 76.32 | 60.61 | movement protein |
| NC_001915 (NC_030242) | VP5/polyprotein | 23.62 | 19.81 | VP5 |
| NC_011505 (JX416217) | phosphoprotein (NSP5)/NSP6 | 4.09 | 4.38 | NSP6 |
| NC_001841 (KU877879) | P1N-PISPO/polyprotein | 35.86 | 17.07 | P1N-PISPO |
| NC_001549 (JN662633) | vif protein/vpx protein | 6.46 | 5.78 | vif protein |
| NC_001607 (AF136236) | X protein/phosphoprotein (P) | 6.51 | 6.69 | X protein |
| NC_006497 (GU830910) | P6 (ORF2)/P7 (ORF1) | 31.16 | 27.78 | P6 |
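How the chi-square values in this table were computed is not described in this excerpt. One plausible reading, offered only as a sketch, is a 2×2 contingency test that compares substituted and conserved sites in the two frames. The function name, the layout of the contingency table and the example counts below are all assumptions; the critical value 3.841 is the standard 5% threshold for one degree of freedom.

```python
# Standard critical value of chi-square for p = 0.05 with 1 degree of freedom.
CRITICAL_005_DF1 = 3.841


def chi_square_2x2(sub_a, same_a, sub_b, same_b):
    """Pearson chi-square (no continuity correction) for the table

                   substituted   conserved
        frame A       sub_a        same_a
        frame B       sub_b        same_b
    """
    n = sub_a + same_a + sub_b + same_b
    numerator = n * (sub_a * same_b - same_a * sub_b) ** 2
    denominator = (
        (sub_a + same_a) * (sub_b + same_b) * (sub_a + sub_b) * (same_a + same_b)
    )
    if denominator == 0:
        raise ValueError("a row or column of the table sums to zero")
    return numerator / denominator


# Hypothetical counts: 40 of 100 sites substituted in one frame, 20 of 100 in the other.
chi2 = chi_square_2x2(40, 60, 20, 80)
print(round(chi2, 2), "asymmetric" if chi2 > CRITICAL_005_DF1 else "symmetric")
```

With these made-up counts the statistic is about 9.52, above the 3.841 threshold, so the pair would be called asymmetric under this reading.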
Fig. 3. Correlation between the normalized chi-square value (from analysis of amino acid substitutions) and the absolute value (Abs) of the difference between the percent frequency (%F) of nucleotide substitutions at the codon position “32” (%F.cp32) and that at the codon position “13” (%F.cp13). Empty circles indicate the 33 overlapping genes under symmetric evolution. Black circles indicate the 32 overlapping genes under asymmetric evolution.
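The quantity on one axis of Fig. 3, Abs(%F.cp32 − %F.cp13), could be derived from a pair of aligned nucleotide sequences roughly as follows. This sketch assumes that both sequences are gap-free, that they start at the first codon position of the upstream frame, and that "percent frequency" means the share of all observed substitutions falling at each codon position type. The function names and the toy sequences are hypothetical.

```python
def codon_position_frequencies(seq_a, seq_b):
    """Percent of substitutions falling at cp13, cp21 and cp32.

    Index 0 is taken as the first codon position of the upstream frame,
    so indices 0, 1, 2 (mod 3) map to cp13, cp21, cp32 respectively.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    labels = ("cp13", "cp21", "cp32")
    counts = dict.fromkeys(labels, 0)
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        if a != b:
            counts[labels[i % 3]] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequences are identical; no substitutions to count")
    return {label: 100 * n / total for label, n in counts.items()}


def cp_asymmetry(freqs):
    """Absolute difference between %F.cp32 and %F.cp13."""
    return abs(freqs["cp32"] - freqs["cp13"])


# Toy example: substitutions at indices 2, 5, 11 (cp32) and 6 (cp13).
freqs = codon_position_frequencies("ATGGCCAAATTT", "ATAGCTCAATTC")
print(freqs, cp_asymmetry(freqs))
```

A large difference means substitutions concentrate at positions that are mostly synonymous in one frame and non-synonymous in the other, which is the pattern expected when one protein varies more than its partner.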
List of the 23 overlapping genes with known genealogy and evolving in accordance with the asymmetric model.
| Overlapping gene (protein products) | Most variable protein | | Length of overlapping and non-overlapping part of the | |
|---|---|---|---|---|
| polyprotein (leader protein, 72 aa; capsid protein VP4, 71 aa; C-end of capsid protein VP2, 13 aa)/L* | L* | L* (suppressor of interferon response) | 156 aa; 0 aa | Phylogeny and codon usage |
| polyprotein (core protein, 151 aa)/F (ARFP) | F (ARFP) | F (ARFP) (suppressor of interferon response) | 151 aa; 0 aa | Codon usage |
| RNA-dependent RNA polymerase (subunit PB1)/PB1-F2 | PB1-F2 | PB1-F2 (suppressor of interferon response; apoptosis factor) | 87 aa; 0 aa | Phylogeny and codon usage |
| RNA-dependent RNA polymerase (subunit PA)/PA-X | PA-X | PA-X (degradation of host mRNA) | 61 aa; 0 aa | Codon usage |
| nucleocapsid protein/non-structural protein NSs | non-structural protein NSs | non-structural protein NSs (suppressor of interferon response) | 90 aa; 0 aa | Codon usage |
| VP5/polyprotein (N-end half of capsid protein VP2, 131 aa) | VP5 | VP5 (suppressor of interferon response) | 131 aa; 0 aa | Phylogeny and codon usage |
| X protein/phosphoprotein (P) | X protein | X protein (antagonist of interferon response) | 71 aa; 16 aa | Codon usage |
| P6 (ORF2)/P7 (ORF1) | P6 (ORF2) | P6 (ORF2) (antagonist of interferon response) | 183 aa; 51 aa | Codon usage |
| capsid protein (VP1)/VF1 (virulence factor 1) | VF1 (virulence factor 1) | VF1 (antagonist of interferon response; apoptosis factor) | 213 aa; 0 aa | Phylogeny and codon usage |
| movement protein/capsid protein | movement protein | movement protein (suppressor of RNA silencing) | 105 aa; 355 aa | Phylogeny |
| p19/p22 | p19 | p19 (suppressor of RNA silencing) | 172 aa; 0 aa | Phylogeny |
| movement protein (p69)/replicase (C-end region, 63 aa; methyltransferase domain, 156 aa; downstream region, 407 aa) | p69 | p69 (suppressor of RNA silencing) | 626 aa; 0 aa | Phylogeny and codon usage |
| replication associated protein AC1 (two-thirds C-end of DNA binding domain, 77 aa)/AC4 | AC4 | AC4 (suppressor of RNA silencing) | 77 aa; 0 aa | Phylogeny |
| capsid protein (VP2)/apoptin (VP3) | apoptin | apoptin (apoptosis factor) | 119 aa; 0 aa | Phylogeny |
| capsid protein/p31 | p31 | p31 | 149 aa; 69 aa | Phylogeny and codon usage |
| movement protein (ORF3)/movement protein (ORF4) | ORF3 | ORF3 | 220 aa; 17 aa | Phylogeny and codon usage |
| capsid protein/p25 | p25 | p25 | 224 aa; 0 aa | Phylogeny and codon usage |
| capsid protein (VP1)/AAP (Assembly Activating Protein) | AAP | AAP | 169 aa; 35 aa | Phylogeny and codon usage |
| capsid protein (VP1)/X protein | X protein | X protein | 155 aa; 0 aa | Codon usage |
| phosphoprotein (NSP5)/NSP6 | NSP6 | NSP6 | 92 aa; 0 aa | Codon usage |
| p28/p23 | p23 | p23 | 209 aa; 0 aa | Phylogeny and codon usage |
| movement protein (36 kd)/polyprotein (linker domain) | linker domain | linker domain | 320 aa; 0 aa | Phylogeny and codon usage |
| p130/replicase (p104) | p130 | p130 | 893 aa; 327 aa | Phylogeny |