Y Hashimoto (1), P O Lawrence. (1) Department of Entomology and Nematology, University of Florida, Gainesville, FL 32611-0620, USA.
Abstract
The Diachasmimorpha longicaudata entomopoxvirus (DlEPV) is the first symbiotic EPV described from a parasitic wasp. DlEPV is introduced into the tephritid fruit fly larval host along with the wasp egg at oviposition. We sequenced a shotgun genomic library of DlEPV DNA and compared the predicted protein sequences of eight ORFs with those of selected poxviruses and other organisms. BlastP searches showed that five of these ORFs are homologous to putative poxvirus proteins: metalloprotease, a putative membrane protein, late transcription factor-3, virion surface protein, and the poly(A) polymerase (PAP) regulatory small subunit. The other three are similar to proteins of other organisms: the gamma-glutamyltransferase (GGT) of Arabidopsis thaliana, eukaryotic initiation factor 4A (eIF4A) of Caenorhabditis briggsae, and lambda phage integrase (lambda-Int) of Enterococcus faecium. Transcription motifs for early (TGA(A/T)XXXXA) or late (TAAATG, TAAT, or TAAAT) gene expression that are conserved in poxviruses were identified in association with those ORFs. Phylogenetic analysis of multiple alignments of five ORFs and 20 poxvirus homologous sequences, and of a concatenate of the multiple alignments, suggested that DlEPV probably diverged from the ancestral node between fowlpox virus and the genus B (lepidopteran and orthopteran) EPVs, to which Amsacta moorei EPV and Melanoplus sanguinipes EPV, respectively, belong. The DlEPV putative GGT, eIF4A, and lambda-Int contained many conserved domains that typify these proteins. These homologues may be involved in either viral pathogenicity or enhancing parasitism via the gamma-glutamyl cycle and compensation of eIF4A levels in the parasitized fly, or via the integration of a portion of the viral genome into the wasp and/or parasitized fly.
The Poxviridae has an extremely wide host range in both vertebrates and invertebrates and possesses a relatively large dsDNA genome (130–300 kb) (Moss, 2001). The family consists of two subfamilies, Chordopoxvirinae (ChPV) or vertebrate poxviruses and Entomopoxvirinae (EPV) or insect poxviruses. Eight genera of ChPV and three genera of EPV (EPV A, B, and C, hereafter group A, B, and C) have been established based on virus structure and host range (Esposito et al., 2000). Group A [type species, Melolontha melolontha EPV (MmEPV)] has an ovoid virion and infects coleopteran insects. Group B [type species, Amsacta moorei EPV (AmEPV)] also has an ovoid virion but infects lepidopteran and orthopteran insects. Group C [type species, Chironomus luridus EPV (ClEPV)] has a brick-shaped virion and infects dipteran insects (Esposito et al., 2000). The genomes of both Melanoplus sanguinipes EPV (MsEPV) and AmEPV have been completely sequenced (Afonso et al., 1999; Bawden et al., 2000), and analysis of their genome organization and gene composition demonstrated a lack of significant co-linearity between them, suggesting an inadequate classification of group B entomopoxviruses (Afonso et al., 1999; Bawden et al., 2000). The sequence and phylogenetic relationships among single genes of several EPVs in groups A and B have been reported. 
These include spheroidin, a major component of the large crystalline occlusion body that is characteristic of EPVs and absent from ChPVs (Hall and Moyer, 1991; Sanz et al., 1994; Li et al., 1997; Sriskantha et al., 1997; Mitsuhashi et al., 1998; Hernandez-Crespo et al., 2000); fusolin, a spindle protein found primarily in lepidopteran EPVs (Dall et al., 1993; Hayakawa et al., 1996; Mitsuhashi et al., 1997); DNA polymerase (Mustafa and Yuen, 1991; Gauthier et al., 1995); poly(A) polymerase (PAP) regulatory small subunit, 30k virion protein and a 17k ORF (Sriskantha et al., 1997; Crnov and Dall, 1999); thymidine kinase (Lytvyn et al., 1992); nucleoside triphosphate phosphohydrolase I (NPH I) (Yuen et al., 1991; Sriskantha et al., 1997; Li et al., 1998); and the rifampicin resistance protein (Osborne et al., 1996). No genes have been sequenced from group C EPVs except for those reported previously by our laboratory for Diachasmimorpha longicaudata EPV (DlEPV). DlEPV homologues of poxvirus genes that encode putative proteins include DNA helicase, a Drosophila-related oligoribonuclease, transcription factor, and other DlEPV-specific sequences (Mwaengo and Lawrence, 2003), DNA-dependent RNA polymerase (Lawrence, 2002), and rifampicin-resistance protein ().
DlEPV was originally found in the poison gland apparatus of the female braconid wasp, Diachasmimorpha longicaudata, which parasitizes larvae of the Caribbean fruit fly, Anastrepha suspensa, and other tephritid fruit flies (Lawrence and Akin, 1990). DlEPV is injected into larval hosts along with wasp eggs. The virus then invades and replicates in the hemocyte cytoplasm, from which progeny bud into the hemolymph to infect other hemocytes (Lawrence, 2002). Infected hemocytes lose their adhesive property, undergo apoptosis, and are unable to encapsulate the wasp eggs (Lawrence, 2005). 
These pathologies do not occur within the wasp, where the virus is confined to the cuticle-lined poison gland (Lawrence and Akin, 1990; Lawrence, 2002). Analogous to other endoparasitoid viruses, DlEPV is considered to be a symbiotic agent of the parasitic wasp because it disrupts normal hemocyte function in the host, resulting in host immunosuppression and parasite survival (Lawrence, 2005). DlEPV is unique among EPVs in that it does not produce an occlusion body in infected host hemocytes (Lawrence, 2002). Therefore we consider DlEPV to belong to group C of the subfamily Entomopoxvirinae or to a new group that reflects its unique biological property as a symbiotic EPV. Studies are in progress to determine how DlEPV progeny is acquired by neonate wasps or is maintained throughout wasp development.
Many endoparasitic wasps studied to date, particularly those attacking lepidopterans, possess virus(es) that cause immunosuppression in their hosts (Lavine and Beckage, 1996). The best-characterized viral symbionts of parasitic wasps are the polydnaviruses (PDVs) that possess segmented DNA genomes and are transmitted vertically to wasp offspring through proviral integration in the wasp genome (Fleming and Summers, 1991; see also a review by Kroemer and Webb, 2004). Unlike DlEPV, PDVs do not replicate in their hosts (Webb, 1998; Whitfield and Asgari, 2003). However, some PDV messages are expressed in various host tissues (Le et al., 2003). Several putative genes have been characterized and proposed as part of the PDV immunosuppression mechanism. These include the cys-motif, rep, vinnexin, and vankyrin genes in the Ichnovirinae subfamily and egf and ptp in the Brachovirinae subfamily (Kroemer and Webb, 2004).
An increasing number of symbiotic non-poly DNA viruses of parasitic wasps have been described to date (see other papers in this issue). 
These include the DNA viruses Diadromus pulchellus ascovirus-4 (DpAV4) (Bigot et al., 1997) and a putative baculovirus from wasps of coleopterans (Krell and Stoltz, 1979; Stoltz, 1981). RNA viruses including reoviruses (Bigot et al., 1995; Stoltz and Makkay, 2000), a cypovirus of D. pulchellus (RV-2) (Renault et al., 2003), and a picorna-like virus have also been discovered from wasps of lepidopterans (see papers in this issue). DlEPV and DlRhV, a rhabdovirus from D. longicaudata (Lawrence and Akin, 1990), coronavirus-like and reovirus-like particles from Psyttalia (Opius) concolor (Jacas et al., 1997), and virus-like particles (VLPs) from several Leptopilina sp. parasites of Drosophila sp. (Rizki and Rizki, 1990; Dupas et al., 1996; Morales et al., 2005) have been described in dipterans. Unlike PDVs, these latter viruses apparently replicate in their dipteran hosts, but very few of their genes have been sequenced and their methods of transgenerational transmission are unknown.
The goal of this report is to analyze shotgun sequence data of DlEPV DNA in order to identify DlEPV homologues of poxvirus genes and to determine their phylogenetic relationship to those of EPVs and other organisms. We found five poxvirus conserved genes, including metalloprotease (homologue of vaccinia virus [VV] G1L), a putative membrane protein (VV F9L homologue), late transcription factor-3 (VV A2L homologue), virion surface protein (VV A28L homologue), and poly(A) polymerase (PAP) regulatory small subunit (VV J3R homologue), and three putative genes not found in poxviruses: gamma-glutamyltransferase (GGT), putative eukaryotic initiation factor 4A (eIF4A), and lambda phage integrase (λ-Int).
GGT has been found in many organisms other than viruses and functions in amino acid transport and the metabolism of glutathione, an antioxidant and detoxifying agent (Ikeda et al., 1995a). eIF4A is a component of eIF4F, which mediates the binding of mRNA to ribosomes (Svitkin et al., 2001). 
No eIF4A homologue has been found in viruses. λ-Int induces the insertion of bacteriophage DNA into host bacterial chromosomal DNA through site-specific recombination (Campbell, 1992; Nunes-Düby et al., 1998).
To examine the gene structure and relative phylogenetic position of DlEPV within the Poxviridae, we performed multiple alignments of the protein sequences and constructed phylogenetic trees of each of five DlEPV ORFs with their homologues from 20 selected poxviruses whose genomes have been completely sequenced. Conserved domain (CD) analysis was performed with three DlEPV ORFs not found in poxviruses, to evaluate a possible role of their putative gene products within the context of the DlEPV-wasp-dipteran host interaction. The data indicate that DlEPV is undoubtedly a member of the Poxviridae, but this virus also contains non-poxvirus genes, suggesting its possible acquisition of genes from its dipteran host and/or host-related microflora, which probably reflects an earlier evolutionary relationship with dipterans. The phylogenetic relationships identified between DlEPV and other EPVs suggest that DlEPV is different from the group B EPVs and should be classified as a group C EPV since it is pathogenic to its dipteran host and not to its hymenopteran vector.
Materials and methods
Insects and source of female accessory gland
A laboratory colony of D. longicaudata was maintained as previously described (Lawrence et al., 1976; Lawrence, 1988). The accessory glands (gland reservoir and filaments) from 5- to 10-day-old female wasps were collected into TE buffer (10 mM Tris–HCl, 1 mM EDTA, pH 8.0), then frozen at −80 °C until used as previously described (Lawrence, 2002).
Shotgun sequencing of DlEPV DNA
DlEPV genomic DNA (250–300 kb) was isolated from wasp glands by pulsed-field gel electrophoresis as previously described (Lawrence, 2002), and gel plugs containing DNA were processed for construction of a shotgun genomic library at the Interdisciplinary Center for Biotechnology Research (ICBR) at the University of Florida (UF) according to their established protocol. Briefly, the gel plugs were melted and about 2.5–5 μg of the released DNA was sheared by sonication; the resulting fragments were size fractionated by agarose gel electrophoresis and gel filtration to yield fragments of 1–3 kb. Fragments were blunt ended, then cloned into pUC19, and the ligated plasmid DNA was used to transform DH10B competent cells. The transformants were screened on LB agar plates containing ampicillin (25 μg/ml), oxacillin (25 μg/ml), and 800 mg/plate each of IPTG and X-gal (2%). White colonies were isolated and grown overnight in Terrific broth medium containing ampicillin. Two hundred to 500 ng of plasmid DNA were prepared from the culture by the BioRobot 9600 (QIAgen) and used as template for labeling with the ABI Prism BigDye Terminator Cycle Sequencing Ready Reaction kit [Perkin-Elmer Applied Biosystems (ABI), Foster City, CA]. Cycle sequencing was carried out using a PTC-200 DNA Engine (MJ Research, Watertown, MA) (25 cycles; 1 °C/s to 96 °C; 96 °C for 10 s; 1 °C/s to 60 °C; 60 °C for 4 min). Dye terminator was removed with a Sephadex G-50 superfine column (Sigma, St. Louis, MO) in water. The reactions were electrophoresed on an ABI 377 sequencer, and the chromatograms were edited with Analysis version 1.2.1 (ABI) and assembled into consensus contiguous DNA sequences (contigs) using the Phred/Phrap/Consed software package (Ewing and Green, 1998; Ewing et al., 1998; Gordon et al., 1998).
Analysis of DlEPV DNA sequences
Contigs larger than 1 kb were analyzed for ORFs using web-based software (ExPASy Molecular Biology Server, http://us.expasy.org/). The DNA sequences were subjected to BlastX analysis, and homologues of the resulting translated protein sequences were identified in the NCBI (http://www.ncbi.nlm.nih.gov/) protein and poxvirus genome database (http://www.poxvirus.org/) with a threshold E-value <0.01.
Protein sequences were analyzed by PROSITE (Hofmann et al., 1999), CD-search (RPSBLAST; Marchler-Bauer et al., 2003), PSSM (Kelly et al., 1999), and Swiss Model (Guex and Peitsch, 1997). Poxvirus promoters around the ORFs were also identified.
The alignments of multiple protein sequences were performed by ClustalX (Thompson et al., 1997) using the default program parameters. Concatenated sequences of several genes have been used to more accurately describe the phylogenetic relationship among proteins from Chordopoxviruses (Gubser et al., 2004). Therefore, to generate concatenated multiple alignments, we retrieved individual sequences with their respective gaps from a FASTA file of the pre-aligned proteins, then sequentially combined them to create a new FASTA file. Phylogenetic analysis of the alignment was performed by the Bootstrap neighbour joining (NJ) Tree in the ClustalX package using a random number generator seed of 111 and 1000 bootstrap trials. Drawing and manipulation of phylogenetic trees were done with TreeView 1.5 (Page, 1996) and Adobe Illustrator 8.0.1 (Adobe Systems Inc., San Jose, CA).
The GenBank accession numbers of the poxvirus DNA genome sequences used for the multi-alignments were: AmEPV ; Camelpox virus isolate M-96 (CMPV-M96), ; Cowpox virus strain Brighton Red (CPXV-BR), ; Ectromelia virus Moscow strain (ECTV-MOS), ; Fowlpox virus (FPV-FCV), ; Lumpy skin disease virus strain Meethling isolate 2490 (LSD-NEE), ; Molluscum contagiosum virus subtype 1 (MCU-SB1), ; Monkeypox virus strain Zaire-96-I-16 (MPXV-ZRE), ; MsEPV, ; Myxoma virus strain Lausanne (MYX-LAU), ; Rabbit fibroma virus (RFB-KAS), ; Sheeppox virus TU-V02127 (SPPV-TU), ; Swinepox virus isolate 17077-99 (SWPV-NEB), ; Vaccinia virus strain Ankara (VAC-ANK), ; Vaccinia virus strain Copenhagen (VAC-COP), ; Vaccinia virus strain Tian Tan (VAC-TAN), ; Variola major virus strain Bangladesh-1975 (VAR-BSH), ; Variola major virus strain India-1967 (VAR-IND), ; Variola minor virus strain Garcia-1966 (VMN-GAR), ; and Yaba-like disease virus (YAB-YLD), .
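The concatenation of pre-aligned protein sequences described in this section can be illustrated with a short Python sketch. It is not the script used in this study: the function names and the toy sequences are hypothetical, and it assumes that each per-gene FASTA alignment labels taxa identically and that a taxon missing from one gene is padded with gap characters (the text does not say how such cases were handled).

```python
from collections import OrderedDict


def parse_fasta(text):
    """Parse FASTA text into an ordered {name: sequence} mapping."""
    records = OrderedDict()
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def concatenate_alignments(alignments):
    """Join pre-aligned genes end to end, keeping their gaps.

    Assumption (not from the paper): a taxon absent from one gene
    is padded with '-' over the full width of that gene.
    """
    taxa = []
    for aln in alignments:
        for name in aln:
            if name not in taxa:
                taxa.append(name)
    combined = OrderedDict((name, "") for name in taxa)
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment must share a length")
        width = lengths.pop()
        for name in taxa:
            combined[name] += aln.get(name, "-" * width)
    return combined


def to_fasta(records):
    """Serialize {name: sequence} back to FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())


# Toy alignments (invented residues, for illustration only).
gene_a = ">DlEPV\nMK-L\n>AmEPV\nMKAL\n"
gene_b = ">DlEPV\nQ-R\n>AmEPV\nQTR\n>MsEPV\nQSR\n"
combined = concatenate_alignments([parse_fasta(gene_a), parse_fasta(gene_b)])
print(to_fasta(combined))
```

The resulting FASTA text can then be passed to a tree-building program in the same way as a single-gene alignment.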
DNA sequences
GenBank accession numbers of the DlEPV DNA sequences in this study are .
Results
Relationship of DlEPV to poxvirus putative proteins
BlastX searches identified homologues to eight DlEPV putative proteins with E-values ≤2.00E−3 (Table 1). The homologues encoding metalloprotease, a putative membrane protein, late transcription factor-3, virion surface protein, and PAP regulatory small subunit were present in all poxvirus sequences. These homologues had ORFs of similar size to those of DlEPV and E-values of 2.00E−03 to 1.00E−127 that demonstrated high homology with the DlEPV ORFs. However, homologues of putative GGT, eIF4A, and λ-Int were absent from the poxvirus protein database. The evidence that the GGT and eIF4A ORFs belong to DlEPV is supported by their presence on the same sequence contig as those ORFs that had a hit to poxvirus proteins in the Blast search (Table 1). The contig (ID#137) containing the λ-Int ORF had no poxvirus Blast hit (Table 1). However, at least three other ORFs within the same contig showed homology to Plasmodium falciparum putative proteins (data not shown). This eliminates the possibility of bacterial contamination.
Table 1
Selected complete ORFs and/or contigs from shotgun sequences of DlEPV DNA that are homologous with poxvirus proteins
Columns: Contig ID (nts) | Selected ORF size (aa) | GenBank accession number | Poxvirus transcription motif | E-value | Highest BlastP hit (Gene ID, size in aa) | Highest Blast hit by partial or complete ORF(s) present in the DlEPV sequence contig (excluding the selected ORFs)
aa=length in amino acids; E, L=ORF containing early and late transcription motifs, respectively; nts=length in nucleotides; dashed line (−)=no transcription start site found.
Either TAAATG, TAAT, or TAAAT, poxvirus late transcription motifs (Rosel et al., 1986), was present in five DlEPV ORFs (Table 1). These DlEPV putative late genes contained a region of multiple consecutive Ts separated by eight to 24 nt from the first A of the late transcription motif (data not shown), suggesting the presence of strong late promoters similar to those found in vaccinia (Davison and Moss, 1989). The early poxvirus promoter, TGAAANNNNA, that occurs upstream of several EPV early genes (Afonso et al., 1999; Bawden et al., 2000; Gruidl et al., 1992; Lytvyn et al., 1992; Mustafa and Yuen, 1991; Li et al., 1997) was also present upstream of the ORFs for the DlEPV eIF4A and PAP regulatory small subunit homologues (data not shown). No poxvirus transcription motifs were found within 100 nt upstream of the translation initiation codon of the ORF of the putative GGT.
Five DlEPV ORFs contained CDs homologous to those of EPVs and ChPVs (Table 1) (Marchler-Bauer et al., 2003). The DlEPV ORF encoding the PAP regulatory small subunit possesses the PARP regulatory domain (NCBI, CDD entry # pfam01358; 294 aa) (E-value=1e−47, 93.9% alignment to the consensus sequence of the domain, hereafter referenced as “aligned”) (Fig. 1a). Interestingly, the DlEPV putative metalloprotease possesses the peptidase M44 domain (NCBI, CDD entry # pfam03410; 590 aa) (E-value=9e−5, 28.6% aligned) (Fig. 1b), which is lacking in the AmEPV and MsEPV metalloproteases (AmEPV ORF256, NP_065038 and MsEPV ORF056, AAC97620, respectively). The ORF encoding a putative membrane protein possesses the L1L F9 C19 domain (NCBI, CDD entry # pfam02442; 224 aa) (E-value=2e−17, 69.6% aligned) (Fig. 1c). The ORF encoding the late transcription factor-3 homologue possesses the poxvirus late transcription factor domain (NCBI, CDD entry # pfam04947; 218 aa) (E-value=1e−8, 91.7% aligned) (Fig. 1d), and the DlEPV virion surface protein homologue possesses the poxvirus A28 family domain (E-value=4e−22, 97.1% aligned) (Fig. 1e). These results reveal that the five DlEPV ORFs contain the conserved motifs common in most homologous genes of poxviruses.
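The search for poxvirus transcription motifs upstream of each ORF, reported earlier in this section, can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions rather than the procedure actually used: the function name `find_motifs` and the toy sequence are hypothetical, the early motif is taken as TGAAANNNNA and the late motifs as TAAATG, TAAAT, and TAAT as given in the text, and the search window is the 100 nt upstream of the start codon plus the ATG itself, so that a TAAATG overlapping the start codon is detected.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
EARLY = re.compile(r"(?=(TGAAA[ACGT]{4}A))")
# Longest alternative first, so TAAATG is reported in preference to TAAAT.
LATE = re.compile(r"(?=(TAAATG|TAAAT|TAAT))")


def find_motifs(sequence, start_codon_index, window=100):
    """List (class, motif, position) hits upstream of a start codon.

    start_codon_index is the 0-based index of the A in the ATG.
    """
    seq = sequence.upper()
    begin = max(0, start_codon_index - window)
    region = seq[begin:start_codon_index + 3]  # include the ATG itself
    hits = []
    for label, pattern in (("early", EARLY), ("late", LATE)):
        for match in pattern.finditer(region):
            hits.append((label, match.group(1), begin + match.start()))
    return sorted(hits, key=lambda hit: hit[2])


# Toy sequence (invented): an early motif at index 2 and TAAATG at index 16.
toy = "CC" + "TGAAACGCGA" + "GGCC" + "TAAATG" + "GCTTAA"
hits = find_motifs(toy, start_codon_index=19)
print(hits)
```

Because TAAT is only four nucleotides long, it will often occur by chance in AT-rich sequence, so hits from a scan like this are candidates to be checked against position and flanking context rather than evidence of a promoter on their own.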
Fig. 1
Alignment of amino acid sequences of eight DlEPV ORFs with those of the CD or signature sequences of CDs of homologues in the NCBI CD database (http://www.ncbi.nlm.nih.gov:80/Structure/cdd/cdd.shtml). Red=identical; blue=similar; -=gap. (a) DlEPV poly(A) polymerase regulatory subunit (PAP) and the protein family (pfam), pfam01358 (Poly(A) polymerase regulatory subunit) domain. CD-Length=294 residues, 93.9% aligned, E value=1e−47. (b) DlEPV metalloprotease homologue (DlEPV metall) and pfam03410 (Peptidase M44 or Protein G1 that is a glycoprotein expressed by many Poxviridae) domain. CD-Length=590 residues, 28.6% aligned, E value=9e−05. (c) DlEPV membrane protein homologue (DlEPV memb) and pfam02442 (poxvirus unknown orf family) domain. CD-Length=224 residues, 69.6% aligned, E value=2e−17. (d) DlEPV late transcription factor homologue (DlEPV LTF) and pfam04947 (Poxvirus Late Transcription Factor VLTF3 like) domain. CD-Length=218 residues, 91.7% aligned, E value=1e−08. (e) DlEPV virion surface protein homologue (DlEPV VSP) and pfam04584 (Poxvirus A28 family) domain. CD-Length=140 residues, 97.1% aligned, E value=4e−22. (f) DlEPV gamma-glutamyltransferase homologue (DlEPV GGT) and pfam01019 (Gamma-glutamyltranspeptidase) domain. CD-Length=504 residues, 83.9% aligned, E value=7e−42. (g) Upper alignment: DlEPV GGT and a signature pattern of CD of gamma-glutamyltranspeptidase, PDOC00404. AA residues in [ ] represent possible alternative residues. Lower alignment: GGT protein sequences used for alignment are human GGT (), Drosophila melanogaster GGT (), and Vibrio parahaemolyticus GGT (). The regions used for alignment are indicated in parentheses. (h) DlEPV λ-integrase homologue (DlEPV λ) and pfam00589 (Phage integrase) domain. CD-Length=175 residues, 98.3% aligned, E value=1e−09. DlEPV λ-integrase homologue (DlEPV λ) and COG4974 (Site-specific recombinase XerD) domain. CD-Length=300 residues, 77.7% aligned, E value=5e−09. 
(i) DlEPV eIF4A homologue and cd00079 (Helicase superfamily c-terminal) domain. CD-Length=131 residues, 99.2% aligned, E value=1e−30. DlEPV eIF4A homologue and cd00268 (DEAD-box helicases) domain. CD-Length=203 residues, 100.0% aligned, E value=1e−59.
Phylogenetic relationship of five DlEPV genes to those of poxviruses
Phylogenetic trees were reconstructed from multiple alignments of five DlEPV putative proteins and 20–21 poxvirus homologues consisting of 2–3 EPVs and 18 ChPVs (Figs. 2a–c). The unrooted NJ trees of the 21 aligned poxvirus sequences for virion surface protein, metalloprotease, putative membrane protein and late transcription factor-3 were similar (data not shown). Therefore the tree for virion surface protein (Fig. 2a) is used as a representative to demonstrate this relationship. The phylogenetic trees of the virion surface protein homologue (Fig. 2a), PAP regulatory small subunit (Fig. 2b), and of the concatenated multiple alignments of all five DlEPV poxvirus-related sequences (Fig. 2c) all corroborate established reports that fowlpox virus occurs in a clade that is the most distant of the non-orthopoxviruses (Fig. 2a). With the exception of the PAP sequences (Fig. 2b), the poxvirus homologues suggest that DlEPV arose between fowlpox virus and the divergence of the group B EPVs (Figs. 2a and c). In contrast, the phylogenetic tree for the PAP regulatory small subunit showed a slight difference in the relationship of DlEPV to the group B EPVs (Fig. 2b). AmEPV possesses two PAP regulatory small subunit homologues, AmEPV ORF115 and ORF060 (NP_064897 and NP_064842, respectively). The former is closely related to MsEPV PAP regulatory small subunit (MsEPV ORF041, NP_048112) and the latter is closely related to Heliothis armigera EPV (HaEPV) PAP regulatory small subunit (Crnov and Dall, 1999). However, the low bootstrap value (49) for the MsEPV ORF041 and AmEPV ORF115 compared to the value (100) for the AmEPV ORF060 and HaEPV (Fig. 2b), along with the value (67) for DlEPV, may indicate that the DlEPV PAP regulatory small subunit is more closely related to the AmEPV ORF060/HaEPV group than to the AmEPV ORF115/MsEPV ORF041 group (Fig. 2b). In any event, even the tree derived from the concatenated alignments of the five DlEPV ORFs (including the PAP sequences) confirms the result from the four individual phylogenetic trees (represented in Fig. 2a), that DlEPV is positioned between fowlpox virus and the group B EPVs (Fig. 2c).
Fig. 2
Unrooted NJ trees of poxvirus virion surface proteins (a), PAP regulatory small subunit (b), and concatenated multiple alignments of virion surface protein, PAP regulatory small subunit, metalloprotease, membrane protein and late transcription factor (c). Bootstrap values are shown for each branch. AmEPV=Amsacta moorei entomopoxvirus; DlEPV=Diachasmimorpha longicaudata entomopoxvirus; FPV-FCV=Fowlpox virus; HaEPV=Heliothis armigera entomopoxvirus; LSD-NEE=Lumpy skin disease virus strain Meethling isolate 2490; MCU-SB1=Molluscum contagiosum virus subtype 1; MsEPV=Melanoplus sanguinipes entomopoxvirus; MYX-LAU=Myxoma virus strain Lausanne; RFB-KAS=Rabbit fibroma virus; SPPV-TU=Sheeppox virus TU-V02127; SWPV-NEB=Swinepox virus isolate 17077-99; and YMTV-YLD=Yaba-like disease virus. See Materials and Methods for GenBank accession numbers. The scale bar refers to a phylogenetic distance of 0.1 amino acid substitutions per site.
Analysis of conserved domains of DlEPV homologues of GGT, eIF4A and λ-Int
The DlEPV ORF encoding a putative GGT homologue had 83.9% (E-value=7e−42) amino acid (aa) alignment to a gamma-glutamyltranspeptidase domain (NCBI, CDD entry # pfam01019, 504 aa) (Fig. 1f). However, the DlEPV GGT possesses only five of the 15 conserved amino acids in the signature sequence of the GGT (Fig. 1g). Amino acid substitution analysis of human GGT revealed that Asp422 and Asp423 (aspartic acid at positions 422 and 423, respectively) and Ser451 or Ser452 (serine at positions 451 or 452, respectively) are crucial for enzyme activity (Ikeda et al., 1995a, Ikeda et al., 1995b). In a multiple alignment of the DlEPV GGT homologue and 12 GGT sequences from a variety of organisms such as Homo sapiens, D. melanogaster, and Vibrio parahaemolyticus, only the Ser residues were found in corresponding positions, but Asp422 and Asp423 in human GGT were instead replaced by Asn and Ser, respectively, in the DlEPV GGT homologue (Fig. 1g).
A CD analysis of the DlEPV λ-Int ORF indicated the presence of a phage integrase domain (pfam00589, 175 aa) (E-value=1e−9, 98.3% aligned) and a site-specific recombinase XerD (COG4974, 300 aa) (E-value=5e−9, 77.7% aligned) (Fig. 1h). The λ-Int is a prototype of a diverse family of tyrosine recombinases that rearrange DNA duplexes by means of conservative site-specific recombination reactions (Campbell, 1992). The λ-Int protein has four highly conserved residues, a tetrad, that includes tyrosine in an active site, which is directly involved in transesterification (Arg212, His308-XX-Arg311, Tyr343) (Argos et al., 1986; Abremski and Hoess, 1992). λ-Int possesses three distinct structural domains: an N-terminal domain (aa residues 1–64) responsible for binding to distal arm sites on the phage DNA (Moitoso de Vargas et al., 1988), a central domain (aa residues 65–169) required for high-affinity binding to core-type sites, and the C-terminal catalytic domain (aa residues 170–356) (Tirumalai et al., 1998). 
The catalytic domain is known to be competent in specifically recognizing and cleaving its cognate core DNA substrates, either as full- or half sizes (Tirumalai et al., 1997). When a 3D structure of the DlEPV λ-Int homologue [predicted by 3D-pssm (Fischer et al., 1999)] is manually aligned with the conserved tetrads in the catalytic domains of λ-Int (PDB entry # 1AE9A) and Cre (PDB entry # 1CRXA) tyrosine recombinases, the same tetrad is conserved in DlEPV (Fig. 3). The DlEPV λ-Int tetrads align closely with those of λ-Int and Cre recombinases, with the Arg (R)–Arg (R) spacing (78 aa) and Arg (R)–Tyr (Y) spacing (32 aa) of the DlEPV integrase homologue falling within the range found in other tyrosine recombinases (Esposito and Scocca, 1997) (Fig. 3). Additionally, the secondary structure of the DlEPV λ-Int homologue consists of a similar number of helices and overall interhelical spacing as do the λ-Int and Cre recombinases (Fig. 3).
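The tetrad spacings quoted above can be checked with simple arithmetic. The short Python sketch below does this for λ-Int using the residue numbers given in the text (Arg212, His308, Arg311, Tyr343) and sets them beside the spacings reported for DlEPV. It assumes that "spacing" means the difference in residue numbers between the first and second Arg, and between the second Arg and the Tyr; the function name is hypothetical, and the DlEPV residue positions themselves are not given in the text, so only the reported spacings are used.

```python
def tetrad_spacing(arg1, his, arg2, tyr):
    """Return residue-number differences between members of the R-H-R-Y tetrad."""
    return {"R-R": arg2 - arg1, "H-R": arg2 - his, "R-Y": tyr - arg2}


# Lambda integrase positions as quoted in the text.
lambda_int = tetrad_spacing(arg1=212, his=308, arg2=311, tyr=343)

# Spacings reported in the text for the DlEPV homologue.
dlepv_reported = {"R-R": 78, "R-Y": 32}

for key, value in dlepv_reported.items():
    print(f"{key}: lambda-Int {lambda_int[key]} aa, DlEPV {value} aa")
```

Under this assumption the Arg–Tyr spacing is 32 aa in both proteins, whereas the Arg–Arg spacing is 99 aa in λ-Int compared with the 78 aa reported for the DlEPV homologue.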
Fig. 3
Sequence alignment of the catalytic domain within the putative DlEPV λ-Int (302 aa long) with the bacteriophage homologue (1AE9:A, P03700, 170–357 aa) and Cre recombinase (1CRX:A, P06956, 127–339 aa). Secondary structures of the proteins (1AE9:A and 1CRX:A) were retrieved from the Protein Data Bank (PDB) (http://www.rcsb.org/pdb/). The alignment of λ-Int and Cre recombinase has been reported previously (Nunes-Düby et al., 1998). The key conserved catalytic residues are represented as bold underlined letters (Subramanian et al., 2003). Lambda=Amino acid sequence of the ORF of the bacteriophage λ-integrase. 1AE9:A=C-terminal catalytic domain of bacteriophage λ-integrase. Cre=Amino acid sequence of the ORF of the Cre-recombinase. 1CRX:A=C-terminal catalytic domain of Cre-recombinase. DlEPV=Amino acid sequence of the ORF of the DlEPV λ-integrase homologue. GOR4 and 3D-pssm=predicted C-terminal catalytic domain of DlEPV λ-integrase, generated by GOR4 (Garnier et al., 1996) and 3D-pssm (Kelley et al., 2000). Abbreviations for the predicted secondary structure of the query proteins are: b, beta bridge; c, random coil; e, extended strand; g, 3₁₀ helix; h, alpha helix; s, bend region; and t, beta turn.
Several CDs were identified among the eIF4A homologues in the Blast search, including the helicase superfamily C-terminal domain (NCBI, CDD entry # cd00079, 322 aa) (E-value=1e−30, 99.2% aligned) and the DEAD-box helicase domain (NCBI, CDD entry # cd00268, 203 aa) (E-value=1e−59, 100.0% aligned) (Fig. 1i). eIF4A, together with eIF4E and eIF4G, constitutes eukaryotic initiation factor 4F, which is involved in the recognition of the mRNA cap structure, ATP-dependent unwinding of the 5′-terminal secondary structures (helicase activity), and recruitment of the mRNA to the ribosome (Svitkin et al., 2001). The eIF4A is a prototype helicase of the DEAD box helicase family, which is characterized by eight motifs (I, Ia, Ib, II, III, IV, V, and VI) (Tanner and Linder, 2001).
The DlEPV eIF4A homologue has a single amino acid substitution in motif Ib, where the last amino acid is altered from Ile (I) to Met (M), and five amino acid substitutions in motif V (Fig. 4). The distances between the N-terminus and motif I (49 aa) of DlEPV and between motif VI and the C-terminus (39 aa), along with the overall arrangement of the motifs, are very similar to those of DEAD box helicases (Nunes-Düby et al., 1998). Interestingly, the DlEPV sequences in all motifs are almost identical (except for six residues) to those of Drosophila melanogaster (Fig. 4), a cyclorrhaphous dipteran like the A. suspensa host within which DlEPV is pathogenic (Lawrence, 2002).
Fig. 4
Alignments of the conserved motifs of DEAD helicases with the eIF4A ORFs of DlEPV (379 aa) and Drosophila melanogaster (D.m.) (403 aa). In the DEAD motif, amino acids conserved at least 80% of the time are shown as capital letters, while those conserved 50–79% of the time are in lower case (Tanner and Linder, 2001). DEAD helicases have 8 motifs (boxed) that are separated into two domains by motif III. Non-conserved amino acid residues between DlEPV and D.m. are underlined. Numbers of amino acid residues between motifs and at both termini are shown in parentheses.
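The letter-case convention used in this caption can be expressed as a simple per-column rule. The Python sketch below is illustrative only: the function name `consensus_line`, the placeholder `x` for positions conserved less than 50% of the time, and the toy aligned sequences are assumptions, not data from Tanner and Linder (2001) or from Fig. 4. Gap handling is deliberately omitted.

```python
from collections import Counter


def consensus_line(aligned, upper=0.8, lower=0.5, placeholder="x"):
    """Build a consensus string from equal-length aligned sequences.

    A residue conserved in >= `upper` of sequences is written in upper case,
    one conserved in >= `lower` (but < `upper`) in lower case, and any other
    column is written as `placeholder` (an assumed convention).
    """
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    out = []
    for column in zip(*aligned):
        residue, count = Counter(r.upper() for r in column).most_common(1)[0]
        fraction = count / len(column)
        if fraction >= upper:
            out.append(residue.upper())
        elif fraction >= lower:
            out.append(residue.lower())
        else:
            out.append(placeholder)
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    toy = ["DEADG", "DEADG", "DEADS", "DEAHT", "DQAHA"]
    print(consensus_line(toy))
```

In the toy alignment the fourth column is conserved in three of five sequences and is therefore written in lower case, while the fifth column has no majority residue and receives the placeholder.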
When the DlEPV eIF4A sequence was analyzed with 3D-pssm (Kelley et al., 2000), its structural profile was strongly similar to that of yeast eIF4A (E-value, 0.00196). Furthermore, the Swiss Model for comparative protein modeling (Guex and Peitsch, 1997) indicated that a predicted 3D model of DlEPV eIF4A consists of two globular domains connected by motif III (data not shown), which is typical of the 3D structures of DEAD box helicases.
Discussion
In this paper, we describe five DlEPV ORFs that encode putative proteins homologous to those conserved in poxviruses: two homologues of vaccinia virus (VV) structural proteins, the putative membrane protein (VV F9L homologue) and the virion surface protein (VV A28L homologue); two transcription/RNA modification proteins, late transcription factor 3 (VV A2L homologue) and the PAP regulatory small subunit (VV J3R homologue); and one viral enzyme, metalloprotease (VV G1L homologue). These sequences enrich previous DlEPV sequence data consisting of ORFs similar to DNA-dependent RNA polymerase (VV J6R homologue) (Lawrence, 2002) and DNA helicase (Mwaengo and Lawrence, 2003) and an ORF similar to rifampicin resistance protein (AF159588, VV D13L homologue).
Phylogenetic analysis of single and concatenated multiple alignments of each of the 20–21 poxvirus homologues suggested that DlEPV diverged from an ancestor before AmEPV and MsEPV (Fig. 2). This pattern was consistent in four of five phylogenetic trees of single protein alignments and in the one from concatenated alignments (Fig. 2). Three DlEPV ORFs, similar to GGT, λ-Int and eIF4A, are not found in other poxviruses, including EPVs. However, these ORFs retain poxviral transcription motifs and the CDs and/or secondary–tertiary structures specific to each of the proteins. These findings and the characteristics of the DlEPV genes presented here, along with earlier reports of the phylogenetic relationship of DlEPV with other EPVs (Lawrence, 2002; Mwaengo and Lawrence, 2003), strongly support our hypothesis that DlEPV is a member of the Entomopoxvirinae, but is distinct from the group B EPVs, AmEPV and MsEPV.
Interestingly, DlEPV showed a closer relationship to fowlpox virus than to other ChPVs in all phylogenetic trees (Fig. 2). Similar phylogenetic relationships were reported with other EPVs when homologues of NPH 1 and vaccinia virus VP8 were analyzed (Li et al., 1998; Crnov and Dall, 1999).
In other studies, phylogenetic analysis using completely sequenced poxvirus genomes revealed that AmEPV, MsEPV, and fowlpox virus share genome characteristics in that they possess large genomes (>230 kb) compared to other ChPVs (<230 kb) (Afonso et al., 2000). Additionally, co-linearity of conserved genes on the genome is significantly reduced or lost in these viruses, and it has been suggested that a number of genes were acquired (McLysaght et al., 2003; Gubser et al., 2004).
The presence of DlEPV genes that encode putative proteins new to poxviruses suggests that these proteins likely play a role in the interactions of the virus with its wasp vector, the dipteran host, or both. Since GGT, eIF4A, and λ-Int occur in bacteria, Drosophila, phage, and other organisms but not in poxviruses, it is likely that these genes were acquired by DlEPV from other organisms (probably the A. suspensa host and/or microflora) and used in some way, probably to facilitate aspects of viral pathogenicity and/or wasp survival. GGT has been found in many organisms but not in viruses (PROSITE ID, PDOC00404). It catalyzes the transfer of the gamma-glutamyl moiety of glutathione to an acceptor that may be an amino acid, a peptide, or water (forming glutamate) (Tate and Meister, 1985). GGT plays a key role in the gamma-glutamyl cycle, a pathway for the synthesis and degradation of glutathione, which is an antioxidant and detoxifying agent (Tate and Meister, 1985). Thus, GGT could be employed to maintain a physiologically stable environment and enhance development of the wasp larva. Indeed, Nappi et al. (1995) have proposed that free radicals released from Drosophila hemocytes kill the Leptopilina sp. parasitic wasps in Drosophila larvae. Also, Li et al. (2003) reported that a major component of VLP, VLP1, from the endoparasitoid Venturia canescens, which has been reported as a non-enzymatic molecule of the hydroperoxide glutathione peroxidase, could function as an antioxidant that reduces peroxide phospholipids and cholesterols within membranes (Ursini et al., 1981; Ursini et al., 1986).
The DlEPV putative λ-Int presumably, like other integrases, catalyses site-specific DNA recombination via the Holliday junction structure (Azaro and Landy, 1997). The latter is recognized by poxvirus resolvases that process concatemeric viral DNA into unit-length genomes (Azaro and Landy, 1997; Garcia and Moss, 2001).
However, there are apparently no poxvirus integrases in the database, and the closest homologue to DlEPV λ-Int was that of a bacteriophage (Fig. 3). It is unlikely that this ORF is a result of bacterial contamination of DlEPV genomic DNA because three other ORFs in the same contig are homologous to Plasmodium falciparum proteins (data not shown). This raises the possibility that DlEPV acquired its λ-Int from bacteria that occur in the host insect's hemolymph and/or gut.
We can only speculate that DlEPV λ-Int may be used for DNA recombination in portions of the wasp's genome, as reported for PDVs that insert viral segments into the genome of their wasp vector (Belle et al., 2002). The PDV genes are then expressed in the host and disrupt the host's immune response, thus protecting the developing wasp (Belle et al., 2002). However, unlike DlEPV, no PDV virion replication occurs in the host. It is not yet known whether portions of the DlEPV genome are integrated into the D. longicaudata wasp genome. However, this phenomenon has also been suggested for other wasp symbiotic viruses (see Reineke and Asgari, 2005), and could be a shared mechanism underlying what we term “accommodative symbiosis”.
If such viral integration exists for viruses with non-segmented genomes, like DlEPV, then the integration could (1) prevent the wasp from recognizing the virus as non-self (hence “accommodative symbiosis”) or (2) facilitate the expression of viral genes early in the infection of the host (prior to virus morphogenesis and replication) to protect the wasp egg from encapsulation, as suggested for PDVs (Webb, 1998).
The homologue encoding eIF4A is present in DlEPV but not in other viruses. Although the DlEPV eIF4A ORF contained DEAD motifs that were homologous with those of DEAD helicases, the highly significant E-values of the eIF4A hit in the Blast search indicated that the DlEPV ORF encodes a putative protein that is more closely related to eIF4A proteins than to other helicases. Furthermore, this close relationship is supported by the predicted secondary/tertiary structure similarities (Fig. 4). eIF4A has been identified as an essential single-copy gene in D. melanogaster (Dorn et al., 1993). An alignment of the DlEPV putative eIF4A protein sequence to DEAD helicase motifs and D. melanogaster eIF4A revealed a strong similarity between them (Fig. 4), which suggests a common ancestral origin of these proteins. Recent reports on the expression of eIF4A from D. melanogaster at different stages demonstrated the ubiquitous presence of two abundant eIF4A gene transcripts throughout development. However, the levels of the eIF4A protein decreased significantly (40%) only in third instar larvae (Dorn et al., 1993; Hernandez et al., 2004). The third instar is the last developmental stage of fruit fly larvae and is the stage normally parasitized by D. longicaudata females (Lawrence et al., 1976). This raises the question of whether DlEPV may compensate for the decrease in eIF4A to strengthen the protein synthesis machinery of the host larvae, in order to support virus replication and wasp development.
Further studies on the expression and function of these DlEPV genes are needed to clarify their role, if any, in these parasite-host interactions.
DlEPV propagates in the parasitized fruit fly host (Lawrence, 2002). This attribute clearly separates DlEPV from the extensively studied PDVs, whose virions do not normally replicate in the parasitized hosts. Therefore, immunosuppression, hemocyte blebbing, and apoptosis induced by DlEPV infection of the parasitized host are clear evidence of virus pathogenicity (Lawrence, 2005). Given the phylogenetic distinction of DlEPV from the group B EPVs (Fig. 2), the viral replication in the adult wasp without any evidence of viral-induced pathology (Lawrence and Akin, 1990), and pathogenicity in virus-infected host hemocytes (Lawrence, 2005) concomitant with the successful development of the wasp larvae, we believe that DlEPV belongs to the group C (Diptera) EPVs or to a new EPV group that reflects its unique biology as the first symbiotic EPV that also does not express the typical EPV occlusion body (Lawrence, 2002).