Bradley J Blitvich, Andrew E Firth.
Abstract
There has been a dramatic increase in the number of insect-specific flaviviruses (ISFs) discovered in the last decade. Historically, these viruses have generated limited interest due to their inability to infect vertebrate cells. This viewpoint has changed in recent years because some ISFs have been shown to enhance or suppress the replication of medically important flaviviruses in co-infected mosquito cells. Additionally, comparative studies between ISFs and medically important flaviviruses can provide a unique perspective as to why some flaviviruses possess the ability to infect and cause devastating disease in humans while others do not. ISFs have been isolated exclusively from mosquitoes in nature but the detection of ISF-like sequences in sandflies and chironomids indicates that they may also infect other dipterans. ISFs can be divided into two distinct phylogenetic groups. The first group currently consists of approximately 12 viruses and includes cell fusing agent virus, Kamiti River virus and Culex flavivirus. These viruses are phylogenetically distinct from all other known flaviviruses. The second group, which is apparently not monophyletic, currently consists of nine viruses and includes Chaoyang virus, Nounané virus and Lammi virus. These viruses phylogenetically affiliate with mosquito/vertebrate flaviviruses despite their apparent insect-restricted phenotype. This article provides a review of the discovery, host range, mode of transmission, superinfection exclusion ability and genomic organization of ISFs. This article also attempts to clarify the ISF nomenclature because some of these viruses have been assigned more than one name due to their simultaneous discoveries by independent research groups.
Year: 2015 PMID: 25866904 PMCID: PMC4411683 DOI: 10.3390/v7041927
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.048
Figure 1. Phylogenetic tree for genus Flavivirus. Complete polyprotein amino acid sequences were aligned using MUSCLE [14]. Regions of ambiguous alignment were excised using Gblocks [15] with default parameters, after which 1774 amino acid positions were retained. A maximum likelihood phylogenetic tree was estimated using the Bayesian Markov chain Monte Carlo method implemented in MrBayes version 3.2.3 [16] sampling across the default set of fixed amino acid rate matrices, with 10 million generations, discarding the first 25% as burn-in. The figure was produced using FigTree (http://tree.bio.ed.ac.uk/software/figtree/). The tree is midpoint-rooted, nodes are labelled with posterior probability values, and branches are also highlighted with alternative colors. Species names are color-coded as follows: cISFs, blue; dISFs, green; NKV flaviviruses, red; mosquito/vertebrate flaviviruses, purple; tick/vertebrate flaviviruses, black.
Table 1. Geographic distribution and natural host range of classical insect-specific flaviviruses.
| a Virus | Isolate Available | Geographic Distribution | Natural Host Range | References |
|---|---|---|---|---|
| Aedes flavivirus (AEFV) | Yes | Japan (2003), Italy (2008), USA (2011), b Thailand (2012) | | |
| Aedes galloisi flavivirus (AGFV) | Yes | Japan (2003) | | |
| Calbertado virus (CLBOV) | Yes | Canada (2003), USA (2006) | | |
| Cell fusing agent virus (CFAV) | Yes | Laboratory (1975), Puerto Rico (2002), Indonesia (2004), Mexico (2007), Thailand (2008), b United States (2012) | | |
| Culex flavivirus (CxFV) | Yes | Japan (2003), Indonesia (2004), China (2006), Guatemala (2006), USA (2006), Mexico (2007), Trinidad (2008), Uganda (2008), Argentina (2009) | | |
| c Culex theileri flavivirus (CTFV) | Yes | Spain (2006), Portugal (2009–2010), Greece (2010), Thailand (date not specified) | | |
| d Hanko virus (HANKV) | Yes | Finland (2005), Spain (2006), Italy (ca. 2007), Portugal (ca. 2007) | | |
| Kamiti River virus (KRV) | Yes | Kenya (1999) | | |
| Nakiwogo virus (NAKV) | Yes | Uganda (2008) | | |
| e Nienokoue virus (NIEV) | f Yes | Côte d'Ivoire (2004) | | (Genbank Accession No. NC_024299) |
| Palm Creek virus (PCV) | Yes | Australia (2010) | | |
| Quang Binh virus (QBV) | Yes | Vietnam (2002), China (2009) | | |
a This table is restricted to viruses for which more than 300 nt of sequence data are available; b AEFV and CFAV were isolated from laboratory colonies established in 2012 from mosquitoes collected in Thailand and the USA, respectively; c Also known as Spanish Culex flavivirus (SCxFV) and Wang Thong virus (WTV); d Also known as Ochlerotatus flavivirus (OcFV), Spanish Ochlerotatus flavivirus (SOcFV) and Ochlerotatus caspius flavivirus from Portugal (OCFVPT); e An acronym has not been assigned to Nienokoue virus; for the purpose of this review, NIEV is used; f Information provided in the Genbank database implies that an isolate is available for NIEV.
Table 2. Summary of sequence data available for classical insect-specific flaviviruses.
| Virus | Sequence Data Available | Length of Genome (nt) | Length of 5’ UTR (nt) | Length of 3’ UTR (nt) | a Genbank Accession No. |
|---|---|---|---|---|---|
| Aedes flavivirus | Genome | 11,064 | 96 | 945 | NC_012932 |
| Aedes galloisi flavivirus | Partial NS5 | b - | - | - | AB639347 |
| Calbertado virus | Partial NS5 | - | - | - | EU569288 |
| Cell fusing agent virus | Genome | 10,695 | 113 | 556 | NC_001564 |
| Culex flavivirus | Genome | 10,834 | 91 | 657 | NC_008604 |
| Culex theileri flavivirus | ORF | - | - | - | HE574574 |
| Hanko virus | ORF | - | - | - | JQ268258 |
| Kamiti River virus | Genome | 11,375 | 96 | 1205 | NC_005064 |
| Nakiwogo virus | ORF | - | - | - | GQ165809 |
| Nienokoue virus | ORF | - | - | - | NC_024299 |
| Palm Creek virus | ORF | - | - | - | KC505248 |
| Quang Binh virus | Genome | 10,865 | 112 | 673 | NC_012671 |
a If multiple sequences have been deposited into the Genbank database, usually the Genbank accession number corresponding to the prototype isolate or longest sequence is shown; b Data not available.
Figure 2. Phylogenetic tree for selected cISF partial NS5 sequences. A 795-nt region of NS5 corresponding to nt 8916-9710 of M91671.1 (CFAV) was used in order to include CLBOV, for which only partial NS5 sequences are available. The corresponding amino acid sequences were aligned with MUSCLE [14] and this amino acid alignment was used to guide a nucleotide sequence alignment. A maximum likelihood phylogenetic tree was estimated using the Bayesian Markov chain Monte Carlo method implemented in MrBayes version 3.2.3 [16] using the general time reversible (GTR) substitution model with gamma-distributed rate variation across sites and a proportion of invariable sites. Chains were run for 10 million generations, with the first 25% discarded as burn-in. The figure was produced using FigTree (http://tree.bio.ed.ac.uk/software/figtree/). Based on the full-genus tree (Figure 1), HANKV was selected as an outgroup to root the tree. Nodes are labelled with posterior probability values and poorly supported branches are also highlighted with alternative colors. Tips are labelled with isolate names as provided in original publications or, if unpublished, in sequence records. Species (as defined in this review) are grouped (vertical black bars) and annotated at right.
Figure 3. Predicted -1 frameshift sites in ISFs. (A) Apparently all cISFs contain a -1 PRF site just downstream of the predicted junction between the regions encoding NS1 and NS2A. Frameshifting results in translation of a long overlapping ORF, termed fifo. The ‘slippery’ heptanucleotide sequence at which the -1 nt shift occurs is highlighted in orange, with nucleotide variations highlighted in pink. Ribosomes that shift -1 nt read the last nucleotide of the heptanucleotide twice. Predicted frameshift stimulatory elements (an RNA pseudoknot structure in the CFAV clade and an RNA stem-loop structure in the HANKV and CxFV clades) are annotated: nucleotides predicted to be involved in base-pairing interactions are colored and underlined, and predicted base-pairings are indicated with “()”s and “[]”s (see also Figure 4). Conserved positions are indicated with “*”s. The length (in codons) of the fifo ORF in each sequence is indicated at right; (B) There is strong comparative genomic evidence that members of the dISF clade encompassing ILOV, CHAOV, LAMV and DONV contain a functionally utilized -1 PRF site towards the 3' end of the region encoding NS2B.
Figure 4. Predicted frameshift-stimulatory RNA structures in ISFs. (A) Frameshifting in cISFs is predicted to be stimulated by an RNA pseudoknot structure in the CFAV clade, and an RNA stem-loop structure in the CxFV and HANKV clades; (B) Frameshifting in the CHAOV clade of dISFs is predicted to be stimulated by an RNA stem-loop structure.
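The Figure 3 caption notes that a ribosome shifting -1 nt reads the last nucleotide of the slippery heptanucleotide twice. The sketch below illustrates that bookkeeping on a toy RNA. The function names, the toy sequence and the heptanucleotide `UUUAAAC` are illustrative assumptions and are not taken from any ISF genome; the only assumption about position is that the heptanucleotide ends on a zero-frame codon boundary.

```python
# Standard genetic code, built from the conventional U/C/A/G ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(rna, start=0):
    """Translate from `start` until a stop codon or the end of the sequence."""
    protein = []
    for i in range(start, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def translate_with_minus1_shift(rna, slip_start):
    """Translate in the zero frame through the 7-nt slippery site, then step
    back 1 nt so that the last nucleotide of the site is read a second time."""
    slip_end = slip_start + 7
    if slip_end % 3 != 0:
        raise ValueError("heptanucleotide must end on a zero-frame codon boundary")
    zero_frame_part = translate(rna[:slip_end])       # assumes no stop upstream
    shifted_part = translate(rna, start=slip_end - 1)  # re-read the last nt
    return zero_frame_part + shifted_part


# Toy example (hypothetical sequence): heptanucleotide UUUAAAC at index 5.
toy_rna = "AUGGCUUUAAACGGGUAAGCCUGA"
print(translate(toy_rna))                       # zero-frame product
print(translate_with_minus1_shift(toy_rna, 5))  # frameshift product
```

In the toy sequence the zero-frame product ends at the first in-frame stop codon, whereas the shifted ribosome bypasses that stop and continues in the overlapping frame, which is the general behaviour the caption describes for the fifo ORF.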
Table 3. Predicted cleavage sites in the polyproteins of cISFs.
| Junction | AEFV | CFAV | CxFV | CTFV | HANKV | KRV | NAKV | NIEV | PCV | QBV | Dual-Host Flaviviruses |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Virion C/Anch | b LEAQR↓SHSPV | c LESRR↓TTGNP | d LEAKR↓SAKNA | LEVRR↓SANNP | LEKER↓SHPRK | e LEKQR↓SGPNL | LEKRR↓GVWSP | LEQRR↓GAQRG | LEKKR↓DGRAA | LENRR↓SANPL | After dibasic residues |
| C/prM | b GLALS↓ETLRY | j VLCGC↓VVIDM | n MMVLG↓AVVID | VLCGC↓VIIDM | IVVTG↓LSIEL | e GLCYG↓EMLRY | VGIFS↓LNVVD | MVTFA↓AVVDV | FGVMG↓VVVID | TLCGT↓MVIDM | Signalase-like cleavage |
| pr/M | b PRKRR↓SSPQR | KREKR↓SREPP | d KRERR↓VASTN | KRVKR↓APETP | ERETR↓QKVDD | e VRRRR↓APQPQ | NRKQR↓SVKDE | RPVRR↓DVTPA | TRAKR↓VAPDG | KRVKR↓ATEQP | Furin |
| prM/E | b NVVRA↓TSIEP | j TTVKG↓EFVEP | d TTVKG↓EFVEP | TTVKG↓EFVEP | NVVKG↓EFVEP | e NVVKA↓SSIEP | TTVRG↓EFMEP | TTVSG↓EYLEP | TTVRG↓EYMEP | STVKG↓EFVEP | Signalase-like cleavage |
| E/NS1 | f RRVAG↓DIGCG | c YYVRA↓DLGCG | d VYTKA↓DVGCG | YFARA↓DVGCG | VYVKA↓DVGCG | e RSVSA↓DVGCG | YTVRA↓DFGCG | YYVRA↓DVGCG | YFVRA↓DFGCG | YYTRA↓DVGCG | Signalase-like cleavage |
| NS1/NS2A | b GKADA↓TADFH | c GKANA↓QSDFR | o PPVEG↓SYPDF | PGTGA↓FPDFQ | YRVPS↓TNAED | e GKAHA↓CSDFR | PPSGA↓EKLQQ | GGAEA↓TQSFF | PMGET↓AKIQN | PGAEA↓LLQDF | Signalase-like cleavage |
| NS2A/NS2B | g KSSYR↓TSGRS | k RNGYR↓DSGAN | p RSGLR↓ASRRS | KSGLR↓ASKSS | RSGYR↓ALCSS | s KNGYR↓DYGAS | ASGLR↓KPRPH | KSGLR↓SITSW | GDGLR↓APRPH | KSGLR↓ASKRS | After dibasic residues |
| NS2B/NS3 | b NEHCR↓SDDLL | c TASNR↓SDDLL | q VSVFR↓SNEVN | STAYR↓AGVND | TNAFR↓SDELI | e SEQNR↓SDDLL | EFAQR↓SSSEL | STAQR↓SDLLL | AMSQR↓ANSEL | TSNRR↓SGVND | After dibasic residues |
| NS3/NS4A | h YINTR↓SSASL | l YMNCR↓GGPTL | r YLKQR↓SNFNF | FLKQR↓SGANF | YMGTR↓SFLSV | t YLNCR↓SSQTF | FLKQR↓SVLPF | FLKQR↓SLFID | FLKQR↓SLYFD | FLKQR↓SVLNF | After dibasic residues |
| NS4A/2K | AAGNR↓SYLDS | SIGNR↓SYMDS | NNVHR↓AYTTD | NNVHR↓AYTGD | SAGQR↓SYVDI | AIGNR↓SYMDS | GGSQR↓GILDS | ANSQR↓GFAEN | GGSQR↓GVLDS | TNVHR↓AYTGD | After dibasic residues |
| 2K/NS4B | b CSVLA↓WEMRL | c CGVLA↓WEMRM | d MGVVA↓WEMDL | MGVVA↓WELNL | IGVIC↓WELRL | e CGVLA↓WEMRL | IGIAA↓WELQL | SAVVA↓WELNL | IGVTA↓WELEL | MGIVA↓WELEL | Signalase-like cleavage |
| NS4B/NS5 | i FSKFR↓ALEKS | m FNQFR↓ALEKS | d RMALR↓SLVKT | RGGLR↓SLVKT | NITTR↓SLEKS | u FNQFR↓ALEKS | RLSVR↓SLVKS | LDMRR↓SLMKT | RLGVR↓SLVKS | RLATR↓SLVKT | After dibasic residues |
a Genbank Accession numbers for the sequences used in this analysis are listed in Table 2; b Consistent with the AEFV polyprotein cleavage sites proposed by [19]; c Consistent with the CFAV polyprotein cleavage sites proposed by [18]; d Consistent with the CxFV polyprotein cleavage sites proposed by [31]; e Consistent with the KRV polyprotein cleavage sites proposed by [44]; f 1 residue downstream of the AEFV E/NS1 cleavage site proposed by [19]; g 25 residues upstream of the AEFV NS2A/NS2B cleavage site proposed by [19]; h 5 residues downstream of the AEFV NS3/NS4A cleavage site proposed by [19]; i 1 residue upstream of the AEFV NS4B/NS5 cleavage site proposed by [19]; j Cleavage at this junction was experimentally verified by amino-terminal sequencing [18]; k 25 residues upstream of the CFAV NS2A/NS2B cleavage site proposed by [18]; l 10 residues downstream of the CFAV NS3/NS4A cleavage site proposed by [18]; m 1 residue upstream of the CFAV NS4B/NS5 cleavage site proposed by [18]; n 1 residue upstream of the CxFV C/prM cleavage site proposed by [31]; o 24 residues downstream of the CxFV NS1/NS2A cleavage site proposed by [31] but consistent with [87]; p 4 residues upstream of the CxFV NS2A/NS2B cleavage site proposed by [31]; q 15 residues downstream of the CxFV NS2B/NS3 cleavage site proposed by [31]; r 10 residues downstream of the CxFV NS3/NS4A cleavage site proposed by [31]; s 25 residues upstream of the KRV NS2A/NS2B cleavage site proposed by [44]; t 10 residues downstream of the KRV NS3/NS4A cleavage site proposed by [44]; u 1 residue upstream of the KRV NS4B/NS5 cleavage site proposed by [44].
Table 4. Geographic distribution and natural host range of dual-host affiliated insect-specific flaviviruses.
| Virus | Isolate Available | Geographic Distribution | Natural Host Range | References |
|---|---|---|---|---|
| Barkedji virus (BJV) | No | Senegal (date not reported), Israel (2011) | | |
| Chaoyang virus (CHAOV) | Yes | China (2008), South Korea (2003) | | |
| Donggang virus (DONV) | a Yes | China (2009) | | (Genbank Accession No. NC_016997) |
| Ilomantsi virus (ILOV) | Yes | Finland (2007) | Most likely | |
| Lammi virus (LAMV) | Yes | Finland (2004) | | |
| Marisma mosquito virus (MMV) | Yes | Spain (2003), Italy (2011) | | |
| Nanay virus (NANV) | Yes | Peru (2009) | | |
| Nhumirim virus (NHUV) | Yes | Brazil (2010) | | |
| Nounané virus (NOUV) | Yes | Côte d'Ivoire (2004) | | |
a Information provided in the Genbank database implies that an isolate is available for DONV.
Table 5. Summary of sequence data available for dual-host affiliated insect-specific flaviviruses.
| Virus | Sequence Data Available | Length of Genome (nt) | Length of 5’ UTR (nt) | Length of 3’ UTR (nt) | a Genbank Accession No. |
|---|---|---|---|---|---|
| Barkedji virus | Almost entire ORF | b - | - | - | KC496020 |
| Chaoyang virus | Genome | 10,733 | 99 | 326 | NC_017086 |
| Donggang virus | Genome | 10,791 | 113 | 343 | NC_016997 |
| Ilomantsi virus | ORF | - | - | - | NC_024805 |
| Lammi virus | ORF | - | - | - | KC692068 |
| Marisma mosquito virus | Partial NS5 | - | - | - | JN603190 |
| Nanay virus | Partial E and NS5 | - | - | - | JX627335 |
| Nhumirim virus | Genome | 10,891 | 102 | 451 | NC_024017 |
| Nounané virus | ORF | - | - | - | EU159426 |
a If multiple sequences have been deposited into the Genbank database, the accession number corresponding to the prototype isolate and/or longest sequence is shown in most instances; b Data not available.
Figure 5. Relative UpA and CpG frequencies in different flavivirus species. UpA and CpG frequencies were calculated in two different ways. (A) In each sequence, the numbers of UpA and CpG dinucleotides, and A, C, G and U mononucleotides, were counted. Dinucleotide frequencies, fXpY, were expressed relative to their expected frequencies, fX × fY, in the absence of selection; (B) Since codon usage reflects dinucleotide bias but can also be subject to other selective pressures (e.g., for translational speed or accuracy) that, due to co-evolution of dinucleotide and codon preferences in the host, may lead to the same dinucleotide biases, we also calculated dinucleotide biases independent of codon (and amino acid) usage. To factor out codon and amino acid usage, 1000 shuffled ORF sequences were generated for each virus sequence. In each shuffled sequence, the original amino acid sequence and the original total numbers of each of the 61 codons were maintained, but synonymous codons were randomly shuffled between the different sites where the corresponding amino acid is used in the original sequence. Then the UpA and CpG frequencies in the original sequence were expressed relative to their mean frequencies in the codon-shuffled sequences. Because codon usage is factored out, the UpA and CpG relative frequencies tend to be less extreme in (B) compared to (A). Since many sequences lack complete UTRs, for consistency, both analyses of all species were restricted to the polyprotein ORF. Each point represents a single flavivirus sequence. Points and selected species names are color-coded as follows: cISFs, blue; dISFs, green; NKV flaviviruses, red; mosquito/vertebrate flaviviruses, purple; tick/vertebrate flaviviruses, black. GenBank accession numbers are the same as those used in Figure 1.
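The two calculations described in the Figure 5 caption can be sketched as follows. This is a minimal illustration rather than the authors' actual pipeline: the function names, the fixed random seed and the toy ORF are assumptions, and the code expects sequences containing only A, C, G and U (or T).

```python
import random
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional U/C/A/G ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def normalize(seq):
    return seq.upper().replace("T", "U")


def dinuc_fraction(seq, dinuc):
    """Fraction of adjacent nucleotide pairs equal to `dinuc` (e.g. 'UA')."""
    seq = normalize(seq)
    hits = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)
    return hits / (len(seq) - 1)


def ratio_vs_mononucleotides(seq, dinuc):
    """Method (A): observed fXpY divided by the expected fX * fY.
    Assumes both bases of `dinuc` occur in the sequence."""
    seq = normalize(seq)
    mono = Counter(seq)
    expected = (mono[dinuc[0]] / len(seq)) * (mono[dinuc[1]] / len(seq))
    return dinuc_fraction(seq, dinuc) / expected


def shuffle_synonymous_codons(orf, rng):
    """Permute codons among the sites that encode the same amino acid, so the
    protein sequence and the total count of every codon are unchanged."""
    orf = normalize(orf)
    codons = [orf[i:i + 3] for i in range(0, len(orf) - len(orf) % 3, 3)]
    sites = defaultdict(list)
    for idx, codon in enumerate(codons):
        sites[CODON_TABLE[codon]].append(idx)
    shuffled = list(codons)
    for idxs in sites.values():
        pool = [codons[i] for i in idxs]
        rng.shuffle(pool)
        for i, codon in zip(idxs, pool):
            shuffled[i] = codon
    return "".join(shuffled)


def ratio_vs_codon_shuffles(orf, dinuc, n_shuffles=1000, seed=0):
    """Method (B): observed frequency divided by the mean frequency over
    codon-shuffled versions of the same ORF."""
    rng = random.Random(seed)
    total = sum(
        dinuc_fraction(shuffle_synonymous_codons(orf, rng), dinuc)
        for _ in range(n_shuffles)
    )
    return dinuc_fraction(orf, dinuc) / (total / n_shuffles)


# Hypothetical toy ORF, far shorter than a real polyprotein ORF.
toy_orf = "AUGGCUGCCGCAGCGUUAUUGCUU"
print(ratio_vs_mononucleotides(toy_orf, "UA"))
print(ratio_vs_codon_shuffles(toy_orf, "UA", n_shuffles=100))
```

A value below 1 under either method indicates that the dinucleotide is under-represented relative to the corresponding null model. On a real sequence the input would be the polyprotein ORF only, consistent with the restriction stated in the caption.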
Table 6. Predicted cleavage sites in the polyproteins of dISFs.
| Junction | BJV | CHAOV | DONV | ILOV | LAMV | NHUV | NOUV | Dual-Host Flaviviruses |
|---|---|---|---|---|---|---|---|---|
| Virion C/Anch | b KTSKR↓GLQQS | RKAKR↓SVTTP | RPNRR↓SAGSN | QKTRR↓SVDTV | KNGKR↓SKTEI | c RRARR↓GMGIP | d VSKRR↓GSASL | After dibasic residues |
| C/prM | b TMAAC↓ATLGM | CMAYG↓ATRFT | GTAMA↓ATSMT | VAVIA↓TTVTT | GTAMA↓ASMFT | b TMVAC↓VTVGT | d GVASA↓VTFTT | Signalase-like cleavage |
| pr/M | b RRSKR↓SVAIA | RRSRR↓SVALA | RRSRR↓SIMIP | RRSRR↓SIALA | RRGKR↓SVALA | b RRSRR↓SVALS | d QRSRR↓SVGIS | Furin |
| prM/E | b APAYS↓LHCSR | GPAYS↓LQCID | APVYG↓SQCSG | APVYG↓HHCSG | GPAYS↓LQCVD | b APAYS↓THCVR | d IPAYS↓MKCIG | Signalase-like cleavage |
| E/NS1 | b TTVAG↓DVGCN | TVGVS↓EIGCS | TNAVS↓EVGCS | SAAAS↓EVGCS | TVALS↓EVGCS | b TSAHA↓EVGCS | c TSVSA↓ELGCS | Signalase-like cleavage |
| NS1/NS2A | b SWTTA↓GNATG | SKVSA↓GTFQG | ARVSA↓GAVHG | ARVSA↓GLVAG | SKVSA↓GTFQG | b SWVTA↓GQMTG | e SWVSA↓GEPMV | Signalase-like cleavage |
| NS2A/NS2B | b GSGKR↓SVSMG | SSGKR↓SWPAG | KHGKR↓SWPAG | RNGRR↓SWPAG | TSGKR↓SWPAG | b KSGKR↓SVSMG | d KTTKR↓SVPQS | After dibasic residues |
| NS2B/NS3 | b KGTQK↓AGAMW | KSGRR↓GTVLW | KHDRR↓GGVLW | RTAKR↓GGVLW | KSGRR↓GTVLW | b SATQR↓AGAMW | d ENRKR↓SNDTP | After dibasic residues |
| NS3/NS4A | b AEGRR↓GASDI | AEGRR↓SYVPI | AEGRR↓SYMPI | AEGKR↓SAVQL | AEGRR↓SYVPL | b AEGRR↓GAMDL | d AGGKR↓SAVDL | After dibasic residues |
| NS4A/2K | AEKQR↓SAIDN | PGSQR↓SVQDN | PGNQR↓SIQDN | AGGQR↓SIADN | PGSQR↓SVQDN | AEKQR↓SALDN | d EGKQR↓SMVDN | After dibasic residues |
| 2K/NS4B | b LAVTA↓NEKGL | ALIAA↓NETGL | GGIAA↓NEMGM | SLIAA↓NETGL | ALIAA↓NETGL | b LMIAA↓NEKGL | d GAVAA↓NEYGM | Signalase-like cleavage |
| NS4B/NS5 | b KSARK↓GTPGG | GVPRR↓GVTIS | QPSRR↓GKKVE | TTPRR↓GRRVN | GVPRR↓GMTIC | b KSARR↓GTPGG | f VVTRK↓GTAGG | After dibasic residues |
a Genbank Accession numbers for the sequences used in this analysis are listed in Table 5; b Consistent with the BJV and NHUV polyprotein cleavage sites proposed by [79]; c 3 residues downstream of the NHUV virion C/anch C cleavage site proposed by [79]; d Consistent with the NOUV polyprotein cleavage sites proposed by [100]; e 15 residues upstream of the NOUV NS1/NS2A cleavage site proposed by [100]; f 30 residues upstream of the NOUV NS4B/NS5 cleavage site proposed by [100].
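The cleavage-site tables write each site as five residues on either side of the scissile bond, separated by "↓". As a small illustration of how such entries can be read programmatically, the sketch below splits a table cell into its upstream and downstream residues and checks whether the two residues immediately before the bond are both basic (K or R). The helper names are hypothetical, and the check is a literal reading of "dibasic", not a prediction method; the example cells are copied from the NS2A/NS2B row of the dISF table above.

```python
BASIC_RESIDUES = set("KR")


def parse_site(cell):
    """Drop any leading footnote letter (e.g. 'b GSGKR↓SVSMG') and split the
    site into the residues before and after the cleavage point."""
    site = cell.split()[-1]
    upstream, downstream = site.split("↓")
    return upstream, downstream


def is_after_dibasic(cell):
    """True if the two residues immediately upstream of the bond are K or R."""
    upstream, _ = parse_site(cell)
    return len(upstream) >= 2 and all(r in BASIC_RESIDUES for r in upstream[-2:])


# NS2A/NS2B row of the dISF cleavage-site table, including footnote markers.
ns2a_ns2b = {
    "BJV": "b GSGKR↓SVSMG",
    "CHAOV": "SSGKR↓SWPAG",
    "DONV": "KHGKR↓SWPAG",
    "ILOV": "RNGRR↓SWPAG",
    "LAMV": "TSGKR↓SWPAG",
    "NHUV": "b KSGKR↓SVSMG",
    "NOUV": "d KTTKR↓SVPQS",
}

for virus, cell in ns2a_ns2b.items():
    print(virus, parse_site(cell), is_after_dibasic(cell))
```

The same helpers can be applied to any other row of either cleavage-site table by substituting the relevant cells.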