
2A and 2A-like Sequences: Distribution in Different Virus Species and Applications in Biotechnology.

Juliana G S de Lima1,2, Daniel C F Lanza1,2.   

Abstract

2A is an oligopeptide sequence that mediates a ribosome "skipping" effect and can thereby drive co-translational cleavage of polyproteins. These sequences are widely distributed from insect to mammalian viruses and could act by accelerating adaptive capacity. They have been used in many heterologous co-expression systems because they are versatile tools for cleaving proteins of biotechnological interest. In this work, we review and update the occurrence of 2A/2A-like sequences in different groups of viruses by screening the sequences available in the National Center for Biotechnology Information database. Interestingly, we report the occurrence of 2A-like sequences for the first time in 69 sequences. Among these, 62 corresponded to positive-sense single-stranded RNA viruses, six to double-stranded RNA viruses, and one to a negative-sense single-stranded RNA virus. The importance of these sequences for viral evolution and their potential in biotechnological applications are also discussed.

Keywords:  2A peptide; Picornaviridae; Totiviridae; double-stranded RNA virus; positive-sense single-stranded RNA virus

Year:  2021        PMID: 34834965      PMCID: PMC8623073          DOI: 10.3390/v13112160

Source DB:  PubMed          Journal:  Viruses        ISSN: 1999-4915            Impact factor:   5.048


1. Introduction

2A and 2A-like sequences are oligopeptides of approximately 18–25 amino acids that can mediate a co-translational “cleavage” of polyproteins in eukaryotic cells. The “core” sequence at the C-terminus of 2A, together with the N-terminal proline of the downstream protein, contains the canonical motif (G/H)1-D2-(V/I)3-E4-X5-N6-P7-G8-P9 (numbers indicate residue positions within the motif), which is involved in a ribosome “skipping” effect during translation that separates two proteins without needing a proteinase [1,2]. The 2A cleavage occurs between G8, the last residue of the upstream protein (P1), and P9, the first residue of the downstream protein (P2). During translation, the 2A sequence can cause a structural modification at the ribosome peptidyl-transferase center (PTC), making the ribosome “skip” peptide bond formation at the proline codon. The glycine-proline peptide bond is not formed; instead, the peptidyl(2A)-tRNA^Gly ester linkage is hydrolyzed, releasing the upstream polypeptide from the translational complex [3,4]. In this way, the first amino acid of the downstream protein, proline, is specified by the third codon of the P7-G8-P9 sequence, and the C-terminal amino acid of the upstream protein is the glycine encoded by the second codon of that sequence [5,6]. This ribosome “skipping” effect is also referred to as “Stop-Carry On” or “StopGo” translation [6]. Thus, this ribosome activity depends not on structural elements within the mRNA but on a peptide sequence, which differentiates the mechanism from other forms of non-canonical mRNA processing. Because of this activity, the 2A and 2A-like sequences can be named CHYSELs (cis-acting hydrolase elements) [7]. Originally, the term “2A” was assigned to a specific region of the genome of the foot-and-mouth disease virus (FMDV), a positive-sense single-stranded RNA (pssRNA) virus and member of the Picornaviridae family [1,4,8,9,10].
Similar sequences discovered in other viruses were named “2A-like.” These sequences have been described in other Picornaviridae, such as Equine rhinitis A virus and Porcine teschovirus-1, in viruses of the Dicistroviridae and Iflaviridae families [2], and even in the infectious myonecrosis virus (IMNV), a double-stranded RNA (dsRNA) virus belonging to the Totiviridae family [11]. Since these first discoveries, the 2A and 2A-like proteolytic cleavage activities have been demonstrated in several eukaryotic systems in vitro and in vivo [2,12]. Because of their mechanism of action, some authors also refer to 2A and 2A-like peptides as cis-acting hydrolase elements [7,13]. In 2017, Yang et al. reviewed the 2A sequence structures and functions of Picornaviridae members [14]. The latest works analyzing 2A and 2A-like sequences, including viruses from other families, were conducted by Luke et al. in 2008, 2009, and 2014 and by Luke and Ryan in 2013 [2,15,16,17]. With advances in sequencing technology, the number of viral sequences added to the National Center for Biotechnology Information (NCBI) database has increased significantly in recent years. Therefore, the goal of this article was to present a new screening of 2A and 2A-like sequences in viral genomes available from the NCBI database, review the principal 2A and 2A-like sequences, describe their occurrence in different viral families, and discuss their potential applications in biotechnology.
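
The position of the 2A “cleavage” described in this section can be illustrated with a minimal sketch. It assumes an idealized model in which skipping always occurs, and it recognizes only the core motif (G/H)D(V/I)EXNPGP; the function name `split_at_2a` and the flanking residues of the toy polyprotein are hypothetical and serve only as illustration.

```python
import re

# Core motif from the text: (G/H) D (V/I) E X N P G | P, where "|" marks the
# skip site between G8 and P9. The lookahead keeps P9 out of the match.
CORE_2A = re.compile(r"[GH]D[VI]E.NPG(?=P)")


def split_at_2a(polyprotein: str) -> list[str]:
    """Split a polyprotein after G8 of each core 2A motif (idealized model)."""
    products = []
    start = 0
    for match in CORE_2A.finditer(polyprotein):
        # The upstream product keeps the 2A residues and ends with ...NPG.
        products.append(polyprotein[start:match.end()])
        # The downstream product begins with the proline (P9).
        start = match.end()
    products.append(polyprotein[start:])
    return products


# Toy polyprotein: hypothetical flanking residues around the core GDVESNPGP.
toy = "MAAAKLAGDVESNPGPMKKK"
print(split_at_2a(toy))
```

In this model the upstream product carries the 2A residues as a C-terminal extension, and the downstream product starts with proline, as described above.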

2. Materials and Methods

The sequences used in this study were obtained from the NCBI viral genome database (https://www.ncbi.nlm.nih.gov/genome/viruses/, accessed on 9 January 2021). To find 2A/2A-like sequences, the viral genomes were aligned against some of the classical 2A/2A-like motifs (GDVEENPGP; GDVESNPGP; HDIETNPGP; GDVELNPGP; GDIELNPGP; GDIESNPGP; HDVEMNPGP) using the BLASTp tool (https://blast.ncbi.nlm.nih.gov/Blast.cgi, accessed on 9 January 2021) and the non-redundant protein sequences database (nr), restricted to viruses (taxid:10239). Search parameters were set to return a maximum of 500 sequences for each query. Repeated viral sequences were excluded from the analysis. After the initial screening, the publication linked to each sequence annotation in the NCBI database was checked to determine whether the 2A/2A-like sequence found had already been reported in the literature. If no report was found, a further search was performed using the Google Scholar search tool, with the respective virus name plus the word “2A” as keywords. If no articles reported the presence of 2A/2A-like sequences in the query virus, we considered the finding novel.
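
As a rough illustration of the screening logic, the sketch below scans a small set of protein sequences for the query motifs and for close variants. It is not the BLASTp workflow used in this study: it replaces the alignment step with a regular expression, and the relaxed pattern, the function name `screen`, and the toy accessions and sequences are all assumptions made for the example.

```python
import re

# The seven query motifs listed in the Methods.
QUERY_MOTIFS = [
    "GDVEENPGP", "GDVESNPGP", "HDIETNPGP", "GDVELNPGP",
    "GDIELNPGP", "GDIESNPGP", "HDVEMNPGP",
]

# Relaxed pattern (an assumption of this sketch): any residue at position 5.
RELAXED = re.compile(r"[GH]D[VI]E.NPGP")


def screen(records: dict[str, str]) -> list[tuple[str, int, str, bool]]:
    """Return (accession, 1-based position, motif, is_query_motif) per hit.

    Records with identical sequences are collapsed so repeats count once.
    """
    seen = set()
    hits = []
    for accession, sequence in records.items():
        if sequence in seen:  # skip repeated sequences
            continue
        seen.add(sequence)
        for match in RELAXED.finditer(sequence):
            motif = match.group()
            hits.append((accession, match.start() + 1, motif, motif in QUERY_MOTIFS))
    return hits


# Hypothetical records; neither the accessions nor the sequences are real.
toy_records = {
    "toy1": "MSTGDVEENPGPAAA",
    "toy2": "MSTGDVEENPGPAAA",            # duplicate of toy1, ignored
    "toy3": "MKKHDVEKNPGPLLGDIESNPGPQ",   # two motifs in one sequence
    "toy4": "MAAAA",                      # no motif
}
for hit in screen(toy_records):
    print(hit)
```

A pattern match only flags candidates; as noted in the Conclusions, activity of a putative 2A-like sequence still has to be confirmed experimentally.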

3. Results and Discussion

3.1. 2A/2A-Like Distribution in Viruses

Table 1 shows the principal 2A or 2A-like motifs that had their self-cleavage efficiencies tested in vitro, confirming that these sequences are widely distributed among the pssRNA and dsRNA viruses, ranging from insect to mammalian viruses. Luke et al. were the first to report this wide distribution and identified motifs similar to those found in the FMDV [2].
Table 1

Principal 2A/2A-like motifs described in literature and their cleavage efficiency.

| Virus | Family | Motif | Cleavage Efficiency | References |
| Euprosterna elaeasa virus (EeV) | Alphatetraviridae | GDVEENPGP | ~99% | [2,18] |
| Providence virus (PrV) | Alphatetraviridae | GDVESNPGP | ~99% | [2] |
| Providence virus (PrV) | Alphatetraviridae | GDIEKNPGP | ~94% | [2] |
| Providence virus (PrV) | Alphatetraviridae | GDVEKNPGP | ~99% | [2] |
| Thosea asigna virus (TaV) | Alphatetraviridae | GDVEENPGP | ~99% | [1] |
| Acute bee paralysis virus (ABPV) | Dicistroviridae | GDVETNPGP | ~94% | [1,2] |
| Cricket paralysis virus (CrPV) | Dicistroviridae | GDVESNPGP | ~90% | [1,2] |
| Drosophila C virus (DCV) | Dicistroviridae | GDVETNPGP | ~95% | [1] |
| Ectropis obliqua picorna-like virus (EoPV) | Iflaviridae | GDVESNPGP | ~99% | [2,19] |
| Ectropis obliqua picorna-like virus (EoPV) | Iflaviridae | GDIESNPGP | ~99% | [2,19] |
| Infectious flacherie virus (IFV) | Iflaviridae | AGIESNPGP | ~99% | [1,2] |
| Perina nuda picorna-like virus (PnPV) | Iflaviridae | GDVESNPGP | ~99% | [2,20] |
| Perina nuda picorna-like virus (PnPV) | Iflaviridae | GDIESNPGP | ~99% | [2,20] |
| Encephalomyocarditis virus (EMCV) | Picornaviridae | HDIETNPGP | ~91% | [1,8] |
| Equine rhinitis A virus (ERAV) | Picornaviridae | GDVESNPGP | ~99% | [1,21] |
| Equine rhinitis B virus (ERBV-1) | Picornaviridae | GDVELNPGP | ~99% | [2,22] |
| Foot-and-mouth disease virus (FMDV) | Picornaviridae | GDVESNPGP | ~99% | [8,10] |
| Ljungan virus (LV) | Picornaviridae | GDVETNPGP | ~99% | [2,23] |
| Porcine teschovirus 1 (PTV-1) | Picornaviridae | GDVEENPGP | ~94% | [1,24] |
| Saffold virus (SAF-V) | Picornaviridae | HDVETNPGP | ~99% | [2,25] |
| Theiler’s murine encephalomyelitis virus (TMEV) | Picornaviridae | HDVEMNPGP | ~99% | [10] |
| Bombyx mori cypovirus 1 (BmCPV-1) | Reoviridae | GDIESNPGP | ~99% | [2,26] |
| Human rotavirus C (HurV-C) | Reoviridae | GDIELNPGP | ~82% | [2] |
| New adult diarrhea virus (ADRV-N) | Reoviridae | ECIESNPGP | ~97% | [2,27] |
| Operophtera brumata cypovirus 18 (OpbuCPV-18) | Reoviridae | GDVESNPGP | ~99% | [2] |
| Porcine rotavirus A (Porv-C) | Reoviridae | GDVELNPGP | ~89% | [1,2] |
| Infectious myonecrosis virus (IMNV) | Unassigned Totiviridae | GDVESNPGP | ~99% | [2,11] |
| Infectious myonecrosis virus (IMNV) | Unassigned Totiviridae | GDVEENPGP | ~99% | [2,11] |

The search for these motifs in the viral genomes available in the NCBI database revealed 69 sequences containing 2A-like motifs that had not been previously reported. Among these, 62 corresponded to pssRNA viruses, six to dsRNA viruses, and one to a negative-sense single-stranded RNA (nssRNA) virus. Additionally, the 2A-like motifs previously described in 102 sequences were confirmed. All 2A/2A-like motifs and their respective species resulting from the search are listed in Table 2 and Table 3.
Table 2

Positive-sense single-stranded RNA virus containing 2A-like motifs.

| Accession Number | Virus | 2A Motif | Taxon |
| YP_003620399.1 | Providence virus (2A1) | GDVEKNPGP | Carmotetraviridae |
| | Providence virus (2A2) | GDVESNPGP | |
| | Providence virus (2A3) | GDIEKNPGP | |
| NP_066241.1 | Acute bee paralysis virus | GDVETNPGP | Dicistroviridae |
| YP_009252204.1 | Anopheles C virus | GDVELNPGP | Dicistroviridae |
| NP_647481.1 | Cricket paralysis virus | GDVESNPGP | Dicistroviridae |
| NP_044945.1 | Drosophila C virus | GDVETNPGP | Dicistroviridae |
| AMO03208.1 | Empeyrat virus | GDVELNPGP | Dicistroviridae |
| YP_008888535.1 | Formica exsecta virus 1 | GDIESNPGP | Dicistroviridae |
| YP_009221981.1 | Goose dicistrovirus | GDVELNPGP | Dicistroviridae |
| ASS83246.1 | Israeli acute paralysis virus | GDVEENPGP | Dicistroviridae |
| NP_851403.1 | Kashmir bee virus | GDIELNPGP | Dicistroviridae |
| YP_009011065.1 | Fusarium graminearum hypovirus 1 | HDVEKNPGP | Hypoviridae |
| YP_009361829.1 | Diamond back moth iflavirus (2A1) | GDVESNPGP | Iflaviridae |
| | Diamond back moth iflavirus (2A2) | GDVESNPGP | |
| NP_919029.1 | Ectropis obliqua picorna-like virus (2A1) | GDVESNPGP | Iflaviridae |
| | Ectropis obliqua picorna-like virus (2A2) | GDIESNPGP | |
| NP_277061.1 | Perina nuda virus (2A1) | GDVESNPGP | Iflaviridae |
| | Perina nuda virus (2A2) | GDIESNPGP | |
| YP_009010984.1 | Spodoptera exigua iflavirus 2 | GDVESNPGP | Iflaviridae |
| NP_573542.1 | Euprosterna elaeasa virus | GDVEENPGP | Permutotetraviridae |
| AAC97195.1 | Thosea asigna virus | GDVEENPGP | Permutotetraviridae |
| AXF38648.1 | Avihepatovirus sp. (2A1) | GDVESNPGP | Picornaviridae |
| | Avihepatovirus sp. (2A2) | GDVESNPGP | |
| | Avihepatovirus sp. (2A3) | GDVEPNPGP | |
| | Avihepatovirus sp. (2A4) | GDVESNPGP | |
| AUX16868.1 | Avisivirus AVE052/AsV | GDIEENPGP | Picornaviridae |
| YP_009345900.1 | Bat crohivirus | GDIESNPGP | Picornaviridae |
| YP_006607894.1 | Bluegill picornavirus (2A1) | GDVESNPGP | Picornaviridae |
| | Bluegill picornavirus (2A2) | GDVEQNPGP | |
| YP_006792625.1 | Bovine hungarovirus 1 | GDVELNPGP | Picornaviridae |
| YP_009116874.1 | Bovine picornavirus | GDIESNPGP | Picornaviridae |
| AQX17368.1 | Bovine rhinitis B virus | GDIESNPGP | Picornaviridae |
| ANN02879.1 | Bovine rhinitis B virus | GDIETNPGP | Picornaviridae |
| YP_009352243.1 | Bovine rhinovirus 1 | GDVETNPGP | Picornaviridae |
| QEQ92497.1 | Burpengary virus | GDVEQNPGP | Picornaviridae |
| ACG61138.2 | Cardiovirus D | HDIETNPGP | Picornaviridae |
| AEJ86360.1 | Cardiovirus Hu/SIDS-347/DEU/2010 | HDIETNPGP | Picornaviridae |
| YP_008992026.1 | Carp picornavirus 1 (2A1) | GDVEQNPGP | Picornaviridae |
| | Carp picornavirus 1 (2A2) | GDVESNPGP | |
| QMI57967.1 | Chestnut teal aalivirus | GDVEENPGP | Picornaviridae |
| YP_002956074.1 | Cosavirus A | GDIESNPGP | Picornaviridae |
| YP_002956076.1 | Cosavirus D | GDIETNPGP | Picornaviridae |
| YP_009361830.1 | Cosavirus F | GDVEENPGP | Picornaviridae |
| YP_009104360.1 | Crohivirus | GDIESNPGP | Picornaviridae |
| YP_009345900.1 | Crohivirus B | GDIESNPGP | Picornaviridae |
| YP_009026377.1 | Duck picornavirus GL/12 (2A1) | GDVESNPGP | Picornaviridae |
| | Duck picornavirus GL/12 (2A2) | GDVEENPGP | |
| | Duck picornavirus GL/12 (2A3) | GDVEMNPGP | |
| | Duck picornavirus GL/12 (2A4) | GDIEQNPGP | |
| AAA43035.1 | Encephalomyocarditis virus | HDIETNPGP | Picornaviridae |
| AKE44318.1 | Encephalomyocarditis virus | HDVETNPGP | Picornaviridae |
| AGU38152.1 | Encephalomyocarditis virus | HDVELNPGP | Picornaviridae |
| AFO66759.1 | Encephalomyocarditis virus type 2 | HDVETNPGP | Picornaviridae |
| NP_653077.1 | Equine rhinitis B virus 1 | GDVELNPGP | Picornaviridae |
| ANJ20934.1 | Equine rhinitis B virus 2 | GDVESNPGP | Picornaviridae |
| ANJ20932.1 | Erbovirus A | GDVESNPGP | Picornaviridae |
| ANJ20933.1 | Erbovirus A | GDVELNPGP | Picornaviridae |
| YP_009423853.1 | Falcon picornavirus (2A1) | GDVEENPGP | Picornaviridae |
| | Falcon picornavirus (2A2) | GDVELNPGP | |
| AHL26986.1 | Fathead minnow picornavirus (2A1) | GDVEQNPGP | Picornaviridae |
| | Fathead minnow picornavirus (2A2) | GDVESNPGP | |
| AYJ71467.2 | Feline hunnivirus | GDVELNPGP | Picornaviridae |
| AAT01719.1 | Foot-and-mouth disease virus, type A | GDVESNPGP | Picornaviridae |
| AFM56034.1 | Foot-and-mouth disease virus, type O | GDVESNPGP | Picornaviridae |
| AAT01787.1 | Foot-and-mouth disease virus, type SAT 1 | GDVESNPGP | Picornaviridae |
| AFE84748.1 | Foot-and-mouth disease virus, type SAT 2 | GDVESNPGP | Picornaviridae |
| AAT01795.1 | Foot-and-mouth disease virus, type SAT 3 | GDVESNPGP | Picornaviridae |
| AIB06813.1 | Genet fecal theilovirus | HDVEMNPGP | Picornaviridae |
| YP_009026376.1 | Human cosavirus | GDIETNPGP | Picornaviridae |
| AFJ04537.1 | Human cosavirus A20 | GDIESNPGP | Picornaviridae |
| YP_002956075.1 | Human cosavirus B | HDIETNPGP | Picornaviridae |
| ADF28539.1 | Human TMEV-like cardiovirus | HDIETNPGP | Picornaviridae |
| AMT85188.1 | Hunnivirus | GDVEENPGP | Picornaviridae |
| YP_009118270.1 | Lesavirus 2 | GDIEPNPGP | Picornaviridae |
| ACJ48052.1 | Ljungan virus | GDVEENPGP | Picornaviridae |
| AVX29482.1 | Marmot mosavirus (2A1) | GDVETNPGP | Picornaviridae |
| | Marmot mosavirus (2A2) | GDVETNPGP | |
| ANX14418.1 | Mengo virus | HDVETNPGP | Picornaviridae |
| YP_009361319.1 | Miniopterus schreibersii picornavirus 1 | GDVEENPGP | Picornaviridae |
| AWC68493.1 | Mischivirus B | GDIEENPGP | Picornaviridae |
| YP_009026384.1 | Mosavirus A2 | GDVESNPGP | Picornaviridae |
| YP_009109563.1 | Norway rat hunnivirus | GDVELNPGP | Picornaviridae |
| ADO85550.2 | Ovine hungarovirus | GDVELNPGP | Picornaviridae |
| AIU94297.1 | Pasivirus A | GDVEQNPGP | Picornaviridae |
| SNQ28005.1 | Pasivirus A | GDIEQNPGP | Picornaviridae |
| APA29021.1 | Picornaviridae sp. rodent | GDVELNPGP | Picornaviridae |
| ADN52625.1 | Porcine encephalomyocarditis virus | HDIETNPGP | Picornaviridae |
| AAK12398.1 | Porcine teschovirus 1 | GDVEENPGP | Picornaviridae |
| AAK12413.1 | Porcine teschovirus 10 | GDVEENPGP | Picornaviridae |
| AAK12390.1 | Porcine teschovirus 11 | GDVEENPGP | Picornaviridae |
| AAK12381.1 | Porcine teschovirus 2 | GDVEENPGP | Picornaviridae |
| AAK12382.1 | Porcine teschovirus 3 | GDVEENPGP | Picornaviridae |
| AGB67759.1 | Porcine teschovirus 4 | GDVEENPGP | Picornaviridae |
| ACT66681.1 | Porcine teschovirus 5 | GDVEENPGP | Picornaviridae |
| AAK12409.1 | Porcine teschovirus 6 | GDVEENPGP | Picornaviridae |
| AAK12386.1 | Porcine teschovirus 7 | GDVEENPGP | Picornaviridae |
| AAK12388.1 | Porcine teschovirus 9 | GDVEENPGP | Picornaviridae |
| QHX40840.1 | Porcine teschovirus 22 | GDIEENPGP | Picornaviridae |
| ACD67870.1 | Rat theilovirus 1 | HDVETNPGP | Picornaviridae |
| AWK02689.1 | Rattus tanezumi hunnivirus | GDVEENPGP | Picornaviridae |
| AWK02688.1 | Rattus tanezumi parechovirus (2A1) | GDVEENPGP | Picornaviridae |
| | Rattus tanezumi parechovirus (2A2) | GDVEENPGP | |
| ACO92353.1 | Saffold virus | HDIETNPGP | Picornaviridae |
| YP_001210296.2 | Saffold virus | HDVETNPGP | Picornaviridae |
| APZ85840.1 | Senecavirus A | GDIETNPGP | Picornaviridae |
| AHW57724.1 | Sikhote-Alin virus | HDVEMNPGP | Picornaviridae |
| AUK47911.1 | Swine pasivirus SPaV1/US/17-50816IA60467-1/2001 | GDVEQNPGP | Picornaviridae |
| BAU71153.1 | Swine picornavirus | GDVEENPGP | Picornaviridae |
| NP_653143.1 | Teschovirus A | GDVEENPGP | Picornaviridae |
| ACG55799.1 | Theiler’s encephalomyelitis virus | HDVETNPGP | Picornaviridae |
| BAC58035.1 | Theiler’s-like virus of rats | HDVETNPGP | Picornaviridae |
| AIY68187.1 | Tortoise picornavirus | GDVEVNPGP | Picornaviridae |
| AIY68186.1 | Tortoise picornavirus | GDVEQNPGP | Picornaviridae |
| ACG55801.1 | Vilyuisk human encephalomyelitis virus | HDVEMNPGP | Picornaviridae |
| AVM87411.1 | Yili teratoscincus roborowskii picornavirus 2 | GDVEQNPGP | Picornaviridae |
| YP_009329817.1 | Bivalve RNA virus G1 | GDVETNPGP | Unassigned Dicistroviridae |
| QNL09596.1 | Clinch dicistro-like virus 2 (2A1) | GDVEMNPGP | Unassigned Dicistroviridae |
| | Clinch dicistro-like virus 2 (2A2) | GDVETNPGP | |
| QJI52079.1 | Dicistroviridae sp. | GDVEMNPGP | Unassigned Dicistroviridae |
| AYQ66681.1 | Drosophila kikkawai virus 1 | GDVELNPGP | Unassigned Dicistroviridae |
| YP_009336571.1 | Hubei diptera virus 1 | GDVELNPGP | Unassigned Dicistroviridae |
| YP_009336583.1 | Hubei picorna-like virus 16 | GDVELNPGP | Unassigned Dicistroviridae |
| YP_009336853.1 | Hubei picorna-like virus 17 | GDVELNPGP | Unassigned Dicistroviridae |
| QKF95572.1 | Leibnitzia anandria dicistrovirus | GDIEENPGP | Unassigned Dicistroviridae |
| AXA52579.1 | Linepithema humile virus 1 | GDIELNPGP | Unassigned Dicistroviridae |
| QIU80542.1 | Phenacoccus solenopsis virus | GDIEENPGP | Unassigned Dicistroviridae |
| YP_009336743.1 | Wenling crustacean virus 3 | GDVEENPGP | Unassigned Dicistroviridae |
| YP_009333180.1 | Wenling picorna-like virus 2 | GDIELNPGP | Unassigned Dicistroviridae |
| YP_009342327.1 | Wuhan insect virus 11 | GDIEANPGP | Unassigned Dicistroviridae |
| YP_009329857.1 | Beihai hepe-like virus 4 | GDIESNPGP | Unassigned Hepeviridae |
| QDY81493.1 | Bipolaris oryzae hypovirus 1 | GDVEANPGP | Unassigned Hypoviridae |
| YP_009337372.1 | Hubei picorna-like virus 43 | GDIESNPGP | Unassigned Iflaviridae |
| QKN89050.1 | Iflaviridae sp. (2A1) | GDVESNPGP | Unassigned Iflaviridae |
| | Iflaviridae sp. (2A2) | GDIESNPGP | |
| AWK77896.1 | Perth bee virus 3 | GDVETNPGP | Unassigned Iflaviridae |
| YP_009336821.1 | Wenzhou picorna-like virus 49 | HDVELNPGP | Unassigned Iflaviridae |
| AVM87450.1 | Guangdong spotted longbarbel catfish picornavirus (2A1) | GDVEENPGP | Unassigned Picornavirales |
| | Guangdong spotted longbarbel catfish picornavirus (2A2) | GDIESNPGP | |
| | Guangdong spotted longbarbel catfish picornavirus (2A3) | GDVERNPGP | |
| ASG92543.1 | Picornavirales Q_sR_OV_036 | GDVEANPGP | Unassigned Picornavirales |
| ASG92538.1 | Picornavirales Q_sR_OV_042 | GDIEENPGP | Unassigned Picornavirales |
| ATY47693.1 | Picornavirales sp. | GDVEENPGP | Unassigned Picornavirales |
| ATY47707.1 | Picornavirales sp. | GDVELNPGP | Unassigned Picornavirales |
| AWK02666.1 | Rhinolophus sinicus picornavirus | GDIEENPGP | Unassigned Picornavirales |
| QQP18688.1 | Soybean thrips picorna-like virus 7 | GDVETNPGP | Unassigned Picornavirales |
| AWK02669.1 | Suncus murinus picornavirus | GDVETNPGP | Unassigned Picornavirales |
| AWK77886.1 | Victoria bee virus 1 | GDVETNPGP | Unassigned Picornavirales |
| AWK77887.1 | Victoria bee virus 2 | GDIETNPGP | Unassigned Picornavirales |
| AVM87443.1 | Wenling thamnaconus septentrionalis picornavirus | GDIESNPGP | Unassigned Picornavirales |
| AVM87419.1 | Western African lungfish picornavirus | GDVEENPGP | Unassigned Picornavirales |
| AVM87438.1 | Wuhan carp picornavirus (2A1) | GDVESNPGP | Unassigned Picornavirales |
| | Wuhan carp picornavirus (2A2) | GDVESNPGP | |
| | Wuhan carp picornavirus (2A3) | GDVESNPGP | |
| ANN02882.1 | Bovine rhinitis B virus 5 | GDVETNPGP | Unassigned Picornaviridae |
| AQM40272.1 | Human cosavirus (Cosavirus-zj-1) | GDVEENPGP | Unassigned Picornaviridae |
| AWG94399.1 | Human cosavirus E/D | GDVEENPGP | Unassigned Picornaviridae |
| AVX29481.1 | Marmot cardiovirus | HDVETNPGP | Unassigned Picornaviridae |
| AWK02672.1 | Niviventer confucianus hunnivirus | GDVELNPGP | Unassigned Picornaviridae |
| AFV31450.1 | Parechovirus-like virus | GDVEQNPGP | Unassigned Picornaviridae |
| QBH68005.1 | Parechovirus sp. QAPp32 | GDVEENPGP | Unassigned Picornaviridae |
| QKE55061.1 | Picornaviridae sp. | GDIEENPGP | Unassigned Picornaviridae |
| QKE55028.1 | Picornaviridae sp. (2A1) | GDVESNPGP | Unassigned Picornaviridae |
| | Picornaviridae sp. (2A2) | GDVEQNPGP | |
| | Picornaviridae sp. (2A3) | GDVESNPGP | |
| QIM74091.1 | Picornaviridae sp. | HDVETNPGP | Unassigned Picornaviridae |
| YP_009336671.1 | Wenzhou picorna-like virus 48 (2A1) | GDIEENPGP | Unassigned Picornaviridae |
| | Wenzhou picorna-like virus 48 (2A2) | GDIESNPGP | |
| | Wenzhou picorna-like virus 48 (2A3) | GDIEENPGP | |
| AZT88626.1 | Aspergillus homomorphus yadokarivirus 1 | GDIEENPGP | Unassigned pssRNA |
| APG77930.1 | Beihai picorna-like virus 76 | GDVETNPGP | Unassigned pssRNA |
| YP_009333551.1 | Beihai picorna-like virus 85 | GDVETNPGP | Unassigned pssRNA |
| AYN75548.1 | Halhan virus 1 | GDVEQNPGP | Unassigned pssRNA |
| AZT88627.1 | Penicillium digitatum yadokarivirus 1 | GDVETNPGP | Unassigned pssRNA |
| QOI17269.1 | Picoa juniperi yado-kari virus 1 | GDIESNPGP | Unassigned pssRNA |
| QHD64758.1 | Plasmopara viticola lesion associated yadokari virus 1 | GDIEENPGP | Unassigned pssRNA |
| QIJ25855.1 | Warroolaba Creek virus 2 | GDVETNPGP | Unassigned pssRNA |
| AVD68673.2 | Yado-kari virus 2 | GDVEENPGP | Unassigned pssRNA |

Underlined names correspond to sequences that had no 2A sequence described before this study.

Table 3

Double-stranded RNA viruses identified in this study containing 2A-like motifs.

| Accession Number | Virus | 2A Motif | Taxon |
| AAU88188.1 | Adult diarrhea virus | ECIESNPGP | Reoviridae |
| BAB20437.1 | Bombyx mori cypovirus 1 | GDIESNPGP | Reoviridae |
| BAO73973.1 | Bovine rotavirus C | GDVELNPGP | Reoviridae |
| AAO32344.1 | Dendrolimus punctatus cypovirus 1 | GDVESNPGP | Reoviridae |
| BAU80889.1 | Human rotavirus C | GDIELNPGP | Reoviridae |
| AAK73524.1 | Lymantria dispar cypovirus 1 | GDVESNPGP | Reoviridae |
| ABB17215.1 | Operophtera brumata cypovirus 18 | GDVESNPGP | Reoviridae |
| BAV31546.1 | Porcine rotavirus C | GDVELNPGP | Reoviridae |
| QBJ02264.1 | Porcine rotavirus H | GDVELNPGP | Reoviridae |
| AQX34666.1 | Rotavirus I | GDIESNPGP | Reoviridae |
| CCD33025.1 | Aspergillus foetidus slow virus 2 | GDIEENPGP | Unassigned dsRNA |
| YP_009272910.1 | Fusarium poae mycovirus 2 | GDIEENPGP | Unassigned dsRNA |
| YP_009182156.1 | Penicillium aurantiogriseum asp-foetidus like virus 1 | GDIEENPGP | Unassigned dsRNA |
| YP_009342431.1 | Wuhan insect virus 31 (2A1) | GDVELNPGP | Unassigned dsRNA |
| | Wuhan insect virus 31 (2A2) | GDVERNPGP | |
| YP_003934933.1 | Armigeres subalbatus | GDVESNPGP | Unassigned Totiviridae |
| YP_009256208.1 | Golden shiner totivirus | GDIESNPGP | Unassigned Totiviridae |
| AIC34742.2 | Penaeid shrimp infectious myonecrosis virus (2A1) | GDVESNPGP | Unassigned Totiviridae |
| | Penaeid shrimp infectious myonecrosis virus (2A2) | GDVEENPGP | |
| YP_009337085.1 | Wenling toti-like virus 2 | GDIETNPGP | Unassigned Totiviridae |
| YP_009333269.1 | Wenzhou toti-like virus 1 | GDVEMNPGP | Unassigned Totiviridae |

Underlined names correspond to new findings.

3.2. pssRNA Viruses

Here, we recorded 62 new occurrences of 2A-like sequences in pssRNA viruses, as presented in Table 2 (underlined). The positions in each respective genome are shown in Figure 1.
Figure 1

Schematic representation of positive-sense single-stranded RNA virus sequences, showing the location of each 2A-like sequence (yellow rectangles). The nucleotide positions and the size of each predicted polypeptide are given by the numbers below and above the bars, respectively. Each viral sequence is annotated according to the NCBI. The nucleotide and protein accession numbers are shown in front of and above each scheme, respectively. Genome representations are not to scale. This figure is presented in four parts.

In most pssRNA viruses, 2A/2A-like segments are used in primary polypeptide processing. The pssRNA viruses commonly possess one 2A/2A-like sequence, but some viruses have two, three, or even four motifs (Table 2). Many of them are members of families in the order Picornavirales, such as Picornaviridae, Dicistroviridae, and Iflaviridae. Currently, the Picornaviridae family has 63 assigned genera [28], but 2A/2A-like sequences have been found in viruses assigned or tentatively assigned to 15 genera: Aphthovirus, Avihepatovirus, Cardiovirus, Cosavirus, Crohivirus, Erbovirus, Hunnivirus, Limnipivirus, Mischivirus, Mosavirus, Parechovirus, Pasivirus, Senecavirus, Teschovirus, and Torchivirus. In aphthoviruses and cardioviruses, the 2A-like region self-cleaves at its own C-terminus, meaning that the 2A-like polypeptide remains as a C-terminal extension of the upstream polyprotein (P1) until it is removed by secondary proteinase cleavage [8,9]. However, in parechoviruses, the 2A-like region has no protease or protease-like activity, and its apparent function is to alter host cell metabolism because it has high homology to the cellular protein H-rev107, which regulates cell proliferation (H-box 2A) [29]. In insect iflaviruses, the 2A-like sequence separates the capsid and replicative protein domains. The Dicistroviridae family is composed of the Aparavirus, Cripavirus, and Triatovirus genera, in which the 2A-like sequences occur at the N-terminal region of the replicative protein open reading frame (ORF) [2,14]. Members of the Permutotetraviridae and Carmotetraviridae families (previously Tetraviridae), Thosea asigna virus and Euprosterna elaeasa virus, encode a 2A-like sequence at the N-terminus of the structural ORF [1]. The Providence virus has three 2A-like sequences: 2A2 and 2A3, located in the capsid protein precursor (VCAP), and 2A1, located at the N-terminus of the p130 ORF, which encodes the viral replicase [30].

3.3. dsRNA Viruses

Among the dsRNA viruses, previously unreported 2A-like sequences were found in six species. The new 2A-like sequences are underlined in Table 3, and their locations within the genome are shown schematically in Figure 2.
Figure 2

Schematic representation of double-stranded RNA virus sequences, showing the location of each 2A-like sequence (yellow rectangles). The nucleotide positions and the size of each predicted polypeptide are given by the numbers below and above the bars, respectively. Each viral sequence is annotated according to the information available at the NCBI. The nucleotide and protein accession numbers are shown in front of and above each scheme, respectively. Genome representations are not to scale.

Among dsRNA viruses, 2A-like sequences are present in two families: Totiviridae and Reoviridae. In Totiviridae, 2A-like sequences are found in all representatives of the IMNV-like group [31]. These viruses predominantly infect arthropods, such as penaeid shrimp [32], mosquitoes [33,34], and the fruit fly Drosophila melanogaster [35]; the exception is the golden shiner totivirus, which infects the fish Notemigonus crysoleucas [36]. The genome of IMNV-like viruses is composed of two ORFs, and the 2A-like sequences separate an RNA-binding protein from other putative proteins in ORF1 [37]. In the Reoviridae family, 2A-like sequences are found in cypoviruses and rotaviruses, in one of the segments encoding a non-structural protein. In Operophtera brumata cypovirus 18 and Bombyx mori cypovirus 1, 2A-like sequences occur within segment 5. In type C rotaviruses, 2A-like sequences link the ssRNA-binding protein NSP3 to a dsRNA-binding protein (dsRBP). In porcine and human rotavirus C, the 2A-like sequences are present in segment 6, whereas in the adult diarrhea virus, the sequence appears in segment 5 [1,2]. All cypoviruses and rotaviruses possess only one 2A-like sequence (Table 3).

3.4. nssRNA Virus

Surprisingly, one 2A-like motif (GDIEQNPGP) was found in a virus tentatively assigned to the Bunyaviridae family (accession number: APG79245.1). This motif is located in the RNA-dependent RNA polymerase (RdRp) sequence (Figure 3). This is the first report of a 2A-like sequence in an nssRNA virus.
Figure 3

Schematic representation of a negative-sense single-stranded RNA virus sequence, showing the location of its 2A-like sequence (yellow rectangle). The nucleotide positions and the size of the predicted polypeptide are given by the numbers below and above the bars, respectively. The viral sequence is annotated according to the NCBI. The nucleotide and protein accession numbers are shown in front of and above the scheme, respectively. The genome representation is not to scale.

3.5. 2A/2A-Like Sequences and Viral Evolution

Previous studies of RNA viruses and 2A-like peptides have reported that these sequences emerged independently during the evolution of viral families [2,14]. However, in a previous study [31], we identified, in some RNA viruses, sequences very similar to functional 2A-like sequences that could be the precursors of 2A sequences. RNA viruses depend on the activity of RNA-dependent RNA polymerases. These enzymes have a significant error rate (10^-3 to 10^-5 mutations per inserted nucleotide) because they lack exonuclease proofreading activity [38]. This results in a high degree of genetic heterogeneity in populations of RNA viruses, which is believed to favor adaptability to different environments and hosts [39]. Considering this, the 2A/2A-like sequences could have emerged through successive mutation events that resulted in a cleavage function, providing the advantage of releasing more than one protein from the same ORF. This, in turn, could directly affect viral adaptation potential and infection mechanisms, favoring viral fitness in complex multicellular systems [31]. Yang et al. also suggested that picornaviruses with more complex infection mechanisms than other viruses of the same family have more than one 2A-like sequence in their genomes [14]. Taking this evidence into account, 2A/2A-like sequences may be a key element in viral genome evolution and, once acquired, their loss of function may impair virus effectiveness.
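
To give a sense of scale for the polymerase error rates quoted above, the following sketch converts a per-nucleotide error rate into an expected number of mutations per genome copy. The genome length of 10,000 nucleotides is a hypothetical round figure chosen for illustration, not a value from this study, and the calculation assumes that errors at different positions are independent.

```python
def expected_mutations(error_rate: float, genome_length_nt: int) -> float:
    """Expected number of misincorporations in one copy of the genome."""
    return error_rate * genome_length_nt


def prob_at_least_one(error_rate: float, genome_length_nt: int) -> float:
    """Probability that a copy carries at least one mutation,
    assuming independent errors at each position."""
    return 1.0 - (1.0 - error_rate) ** genome_length_nt


GENOME_LENGTH_NT = 10_000  # hypothetical genome length, for illustration only

for rate in (1e-3, 1e-4, 1e-5):  # range quoted in the text
    mean = expected_mutations(rate, GENOME_LENGTH_NT)
    p_any = prob_at_least_one(rate, GENOME_LENGTH_NT)
    print(f"rate={rate:.0e}  expected mutations per copy={mean:g}  P(>=1)={p_any:.3f}")
```

Under these assumptions, even the lowest quoted error rate leaves a meaningful fraction of genome copies carrying at least one change, which is consistent with the heterogeneity described above.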

3.6. Biotechnology Applications

Various approaches have been employed to co-express multiple proteins in cells, including the use of internal ribosomal entry site (IRES) elements [40,41], dual promoter systems [42,43], and transfection of multiple vectors [44]. Each of these has limitations, such as uneven or unreliable protein expression levels, silencing of some promoters [45,46], and increased toxicity to cells (with multiple transfections) [47]. Co-expression systems that include 2A/2A-like peptides can be an alternative strategy for expressing multiple genes under the control of a single promoter. These constructs could have the additional advantage of producing proteins at near-stoichiometric levels, unlike IRES-mediated polycistronic expression, in which ribosomes are recruited independently at distinct regions within the mRNA [1,4,48,49]. IRES-based systems therefore require optimization by testing several combinations of promoters and/or IRES elements and the order of genes within the expression cassette [46]. Furthermore, IRES activity can be affected by cell type, and variable expression can be observed in the downstream coding sequence [50]. 2A/2A-like sequences have been used in a range of heterologous expression systems because of their cleavage capacity. These systems include viruses [51], yeasts [52,53], fungi [54,55,56], insect cells [57,58], plants [59], human HTK-143 cells [9], rabbit reticulocytes [60], HeLa cells [61], CHO cells [62], HEK293 cells [63], algae [64], and other animals [65,66,67]. In yeasts, more than two 2A sequences have been used to co-express proteins from the same vector. As seen in [68] and [69], three proteins were produced using this strategy in S. cerevisiae. Surprisingly, up to nine proteins have been linked and successfully co-translated and separated with 2A sequences in the yeast Pichia pastoris [70]. Researchers have also attempted to use 2A for multi-gene transformation in staple crops [71,72].
2A sequences can also be used for gene fusion, as seen in tomatoes, potatoes, and other plants [73,74]. To construct the co-expression vectors, the 2A/2A-like sequences are usually incorporated into an adenovirus [75], adeno-associated virus (AAV) [12], retrovirus [76], lentivirus [77,78], or plasmid vector [79,80]. Many other biotechnological applications that depend on the co-expression of multiple genes use 2A/2A-like sequences, e.g., the production of antibodies and antigens that can be used in vaccine production [80,81,82,83,84,85], observation of chromatin dynamics and genome (DNA and RNA) editing in the application of cell/gene therapies [78,79,86,87,88,89,90], and development of optogenetic tools [91,92,93]. More examples of viral 2A applications can be found in [94].
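
The single-promoter strategy described in this section can be sketched at the protein level. The example below joins several coding sequences with 2A linkers and lists the products expected if skipping were complete. The linker sequences are the commonly cited P2A (from Porcine teschovirus 1) and T2A (from Thosea asigna virus) peptides, which are not given in this article; the function name `predict_products` and the toy ORFs are hypothetical. Real constructs show less than complete separation (see the efficiencies in Table 1).

```python
# Commonly cited 2A peptide sequences (not taken from this article).
P2A = "ATNFSLLKQAGDVEENPGP"  # Porcine teschovirus 1
T2A = "EGRGSLLTCGDVEENPGP"   # Thosea asigna virus


def predict_products(orfs: list[str], linkers: list[str]) -> tuple[str, list[str]]:
    """Build a polycistronic cassette and list the expected protein products.

    Idealized model: every 2A linker skips with 100% efficiency.
    """
    if len(linkers) != len(orfs) - 1:
        raise ValueError("need exactly one linker between each pair of ORFs")
    for linker in linkers:
        if not linker.endswith("NPGP"):
            raise ValueError(f"linker does not end with NPGP: {linker}")

    # Single open reading frame: ORF1 + 2A + ORF2 + 2A + ORF3 ...
    cassette = orfs[0]
    for linker, orf in zip(linkers, orfs[1:]):
        cassette += linker + orf

    products = []
    for i, orf in enumerate(orfs):
        # Downstream proteins gain the N-terminal proline (P9) of the motif.
        prefix = "P" if i > 0 else ""
        # Upstream proteins keep the 2A residues up to and including G8.
        suffix = linkers[i][:-1] if i < len(linkers) else ""
        products.append(prefix + orf + suffix)
    return cassette, products


# Toy ORFs (hypothetical placeholder sequences).
cassette, products = predict_products(["MAAA", "MKKK", "MLLL"], [P2A, T2A])
print(cassette)
for product in products:
    print(product)
```

Because every upstream product retains most of the 2A peptide at its C-terminus, the order of genes in such a cassette determines which proteins carry that extension.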

4. Conclusions

In this article, we reviewed the distribution of 2A/2A-like sequences in viruses and described the occurrence of these motifs in viral species where they had not been previously reported. These findings need to be confirmed through in vitro tests to verify that they are active 2A-like sequences. Because of their cleavage function, the 2A/2A-like sequences appear to directly affect the complexity of the viral genome, which plays a decisive role in viral evolution. Additionally, they are excellent alternatives for developing new biotechnological tools that depend on the expression of multiple products, such as vaccines, transgenic approaches, cell/gene therapy, and optogenetic tools.

1.  The 2A proteins of three diverse picornaviruses are related to each other and to the H-rev107 family of proteins involved in the control of cell proliferation.

Authors:  P J Hughes; G Stanway
Journal:  J Gen Virol       Date:  2000-01       Impact factor: 3.891

2.  New insights about ORF1 coding regions support the proposition of a new genus comprising arthropod viruses in the family Totiviridae.

Authors:  Márcia Danielle A Dantas; Gustavo Henrique O Cavalcante; Raffael A C Oliveira; Daniel C F Lanza
Journal:  Virus Res       Date:  2015-10-20       Impact factor: 3.303

3.  A picornaviral 2A-like sequence-based tricistronic vector allowing for high-level therapeutic gene expression coupled to a dual-reporter system.

Authors:  Mark J Osborn; Angela Panoskaltsis-Mortari; Ron T McElmurry; Scott K Bell; Dario A A Vignali; Martin D Ryan; Andrew C Wilber; R Scott McIvor; Jakub Tolar; Bruce R Blazar
Journal:  Mol Ther       Date:  2005-09       Impact factor: 11.454

4.  Comprehensive Analysis of Genomic Safe Harbors as Target Sites for Stable Expression of the Heterologous Gene in HEK293 Cells.

Authors:  Seunghyeon Shin; Su Hyun Kim; Sung Wook Shin; Lise Marie Grav; Lasse Ebdrup Pedersen; Jae Seong Lee; Gyun Min Lee
Journal:  ACS Synth Biol       Date:  2020-06-02       Impact factor: 5.110

5.  Analysis of the aphthovirus 2A/2B polyprotein 'cleavage' mechanism indicates not a proteolytic reaction, but a novel translational effect: a putative ribosomal 'skip'.

Authors:  Michelle L L Donnelly; Garry Luke; Amit Mehrotra; Xuejun Li; Lorraine E Hughes; David Gani; Martin D Ryan
Journal:  J Gen Virol       Date:  2001-05       Impact factor: 3.891

6.  Reversible optogenetic control of kinase activity during differentiation and embryonic development.

Authors:  Vishnu V Krishnamurthy; John S Khamo; Wenyan Mei; Aurora J Turgeon; Humza M Ashraf; Payel Mondal; Dil B Patel; Noah Risner; Ellen E Cho; Jing Yang; Kai Zhang
Journal:  Development       Date:  2016-10-03       Impact factor: 6.868

7.  Bicistronic expression and differential localization of proteins in insect cells and Drosophila suzukii using picornaviral 2A peptides.

Authors:  Jonas Schwirz; Ying Yan; Zdenek Franta; Marc F Schetelig
Journal:  Insect Biochem Mol Biol       Date:  2020-01-21       Impact factor: 4.714

8.  The palm subdomain-based active site is internally permuted in viral RNA-dependent RNA polymerases of an ancient lineage.

Authors:  Alexander E Gorbalenya; Fiona M Pringle; Jean-Louis Zeddam; Brian T Luke; Craig E Cameron; James Kalmakoff; Terry N Hanzlik; Karl H J Gordon; Vernon K Ward
Journal:  J Mol Biol       Date:  2002-11-15       Impact factor: 5.469

9.  Occurrence, function and evolutionary origins of '2A-like' sequences in virus genomes.

Authors:  Garry A Luke; Pablo de Felipe; Alexander Lukashev; Susanna E Kallioinen; Elizabeth A Bruno; Martin D Ryan
Journal:  J Gen Virol       Date:  2008-04       Impact factor: 3.891

10.  Improvement of reporter activity by IRES-mediated polycistronic reporter system.

Authors:  Hicham Bouabe; Reinhard Fässler; Jürgen Heesemann
Journal:  Nucleic Acids Res       Date:  2008-02-11       Impact factor: 16.971
