Brian Wey, Mary Ellen Heavner, Kameron T Wittmeyer, Thomas Briese, Keith R Hopper, Shubha Govind.
Abstract
Leptopilina heterotoma are obligate parasitoid wasps that develop in the body of their Drosophila hosts. During oviposition, female wasps introduce venom into the larval hosts' body cavity. The venom contains discrete, 300 nm-wide, mixed-strategy extracellular vesicles (MSEVs), until recently referred to as virus-like particles. While the crucial immune suppressive functions of L. heterotoma MSEVs have remained undisputed, their biotic nature and origin remain controversial. In recent proteomics analyses of L. heterotoma MSEVs, we identified 161 proteins in three classes: conserved eukaryotic proteins, infection and immunity related proteins, and proteins without clear annotation. Here we report 246 additional proteins from the L. heterotoma MSEV proteome. An enrichment analysis of the entire proteome supports the vesicular nature of these structures. Sequences for more than 90% of these proteins are present in the whole-body transcriptome. Sequencing and de novo assembly of the 460 Mb-sized L. heterotoma genome revealed that 90% of MSEV proteins have coding regions within the genomic scaffolds. Altogether, these results explain the stable association of MSEVs with their wasps and, as with other wasp structures, their vertical inheritance. While our results do not rule out a viral origin of MSEVs, they suggest that a similar strategy for co-opting cellular machinery for immune suppression may be shared by other wasps to gain advantage over their hosts. These results are relevant to our understanding of the evolution of figitid and related wasp species.
Keywords: Drosophila; Extracellular vesicle; Leptopilina heterotoma; VLP; endoparasitoid wasp; host-parasite; immune suppression; organelle; whole genome sequencing
Year: 2020 PMID: 31676506 PMCID: PMC6945029 DOI: 10.1534/g3.119.400349
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. The superset of MSEV proteins. (A) Lh 14 MSEV proteins were annotated using BLAST2GO prior to class sorting via annotation and GO terms. Wedges denoted “Common” were previously published in (Heavner ) and represent proteins found in both the Lh 14 and Lh NY MSEV proteomes. New proteins analyzed in this work are in wedges labeled “Lh 14 Only.” A majority of proteins belong to Class 1. Table S1 lists the 246 proteins added to the superset Lh 14 proteome. (B) and (C) Enrichment analysis of the MSEV superset shows high association with exosomes and mitochondria compared to other cellular organelles according to Vesiclepedia. The −log10(p-value) trend is shown in orange in both graphs. The p-values were calculated with the Bonferroni method. (B) Percentage of MSEV genes associated with specific cellular compartments found in Vesiclepedia, relative to all MSEV genes. Of the superset proteins, 41 and 49% are associated with mitochondria and exosomes, respectively (P = 3 × 10⁻⁵⁷; 1 × 10⁻⁵³). (C) Fold-enrichment of the MSEV dataset in specific cellular compartments. Although many protein classes are present in the proteome, exosomal and mitochondrial proteins show more significant enrichment.
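The quantities plotted in panels B and C (percentage of genes in a compartment, fold-enrichment, and a Bonferroni-adjusted p-value) can be illustrated with a minimal sketch. The caption does not state which statistical test produced the p-values, so the one-sided hypergeometric test used here is an assumption, as are the function names and all counts in the usage note.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from a background of N genes,
    K of which carry the compartment annotation (assumed test)."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

def enrichment(k, n, K, N, n_tests):
    """Return (percent of dataset, fold-enrichment, Bonferroni-adjusted p).

    k: dataset genes annotated to the compartment
    n: dataset size
    K: background genes annotated to the compartment
    N: background size
    n_tests: number of compartments tested (Bonferroni multiplier)
    """
    percent = 100.0 * k / n
    fold = (k / n) / (K / N)
    p_adj = min(1.0, hypergeom_sf(k, N, K, n) * n_tests)
    return percent, fold, p_adj
```

For example, with hypothetical counts `enrichment(k=2, n=4, K=5, N=20, n_tests=3)`, half of the dataset falls in the compartment against a quarter of the background, giving a two-fold enrichment. Plotting `-log10(p_adj)` per compartment would give a trend line of the kind described in the caption.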
CDD-search results for MSEV “un-annotated” proteins in the superset. MSEV ORFs that completed the BLAST2GO pipeline and did not return any results were run through the NCBI CDD-Search Version 3.16 (Accessed: Aug. 2018). Of 45 queries, only 9 returned hits with the E-value threshold set to 1 × 10⁻². The ninth result came from a search with the E-value threshold set to 1. Results listed are all unique, high-scoring hits for each ORF that returned hits from the search.
| MSEV Superset Unknowns CDD-Search Results | | | | | | |
|---|---|---|---|---|---|---|
| Query (in-house ID) | PSSM-ID | From | To | E-Value | Accession | Short name |
| GAJC01013214.1_14 | | | | | cl26939 | DEXDc superfamily |
| GAJC01012558.1_12 | | | | | cl25496 | Herpes_BLLF1 superfamily/gp350 |
| GAJC01011863.1_13 | | | | | cl07006 | RNA_polI_A34 superfamily |
| GAJC01011463.1_48 | | | | | cl13702 | CD99L2 superfamily |
| GAJC01010930.1_16 | | | | | cl21457 | ICL_KPHMT superfamily |
| GAJC01010353.1_14 | | | | | cl27055 | MutS_III superfamily |
| GAJC01009713.1_25 | | | | | cl06688 | TSGP1 superfamily |
| GAJC01009493.1_4 | | | | | cl21455 | P-loop_NTPase superfamily |
| GAJC01002124.1_43 | | | | | cl25751 | DUF4045 superfamily |
Assembly statistics: Statistics of male, female, and combined (male plus female) Lh genomes as assessed by QUAST v4.0 and BUSCO v9.0. Percent coverage was found by mapping sequencing reads back to the assembly using HISAT2. The identified BUSCOs can be found in Table S1. The QUAST program was run with parameters set for eukaryotic genomes and scaffolds. The BUSCO program was run with species set to ‘Nasonia.’ Contigs smaller than 500 bp were excluded.
| ASSEMBLY STATISTICS | | | | |
|---|---|---|---|---|
| | | Male | Female | Joint |
| Assembly | N50 (bp) | | | |
| | No. scaffolds | | | |
| | Largest scaffold (bp) | | | |
| | Total length (bp) | | | |
| | GC% | | | |
| | Coverage (%) | | | |
| BUSCOs (Insecta) | Complete | | | |
| | Single | | | |
| | Duplicated | | | |
| | Fragmented | | | |
| | Missing | | | |
| | n | | | |
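The N50 row in the assembly statistics table follows the standard definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. A minimal sketch is given below, assuming a plain list of scaffold lengths. The `min_len=500` default mirrors the 500 bp exclusion stated in the caption. The function name is hypothetical, and this is not the QUAST implementation.

```python
def n50(lengths, min_len=500):
    """Return the N50 of an assembly given its scaffold lengths (bp).

    Scaffolds shorter than min_len are dropped first. The remaining
    lengths are sorted from longest to shortest and accumulated until
    the running sum reaches half of the retained total.
    """
    kept = sorted((x for x in lengths if x >= min_len), reverse=True)
    if not kept:
        return 0
    half = sum(kept) / 2
    running = 0
    for length in kept:
        running += length
        if running >= half:
            return length
```

For instance, `n50([800, 700, 600, 500])` accumulates 800 and then 1,500 bp against a half-total of 1,300 bp, so the N50 is 700 bp.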
Figure 2. Analysis of K-mer coverage vs. GC count. (A) Analysis of genomic reads. 27-mers generated from the cleaned Illumina reads used to assemble the L. heterotoma genome, binned by their GC count vs. multiplicity (total counts among the reads). Bins are colored by the number of distinct K-mers. Different clusters are identified as shown and described in the text. (B and C) A map of 27-mer multiplicity vs. GC content of the joint assembly of the Lh 14 genome (B) compared to a map from the published L. clavipes genome (Bioproject: PRJNA84205) (C).
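The binning described for panel A can be sketched in a few lines: count every K-mer across the reads, then tally how many distinct K-mers share each (GC count, multiplicity) pair. This is a simplified illustration, not the tool used in the study. The function names are hypothetical, K-mers containing ambiguous bases are skipped, and reverse-complement (canonical) K-mers are not merged, which dedicated K-mer counters usually do.

```python
from collections import Counter

def kmer_counts(reads, k=27):
    """Count occurrences (multiplicity) of each k-mer across all reads."""
    counts = Counter()
    valid = set("ACGT")
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= valid:  # skip k-mers containing N or other symbols
                counts[kmer] += 1
    return counts

def gc_multiplicity_bins(counts):
    """Map (GC count, multiplicity) -> number of distinct k-mers in that bin."""
    bins = Counter()
    for kmer, multiplicity in counts.items():
        gc = kmer.count("G") + kmer.count("C")
        bins[(gc, multiplicity)] += 1
    return bins
```

A heat map of `gc_multiplicity_bins(kmer_counts(reads))`, with GC count on one axis and multiplicity on the other, would give a plot of the kind shown in the figure. Distinct clusters in such a plot can indicate sequence populations that differ in coverage or base composition.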
MSEV genes found in scaffolds and predictions: Genome assembly scaffolds and AUGUSTUS gene predictions were searched for MSEV genes using tBLASTn. Hits with identity > 70%, E-value < 1 × 10⁻⁵⁰, and query coverage > 70% were retained.
| MSEV GENES FOUND IN GENOME ANALYSIS | | | | |
|---|---|---|---|---|
| | MSEV BLASTn scaffold results | | AUGUSTUS prediction results | |
| | Found | Percentage | Found | Percentage |
| Female | | | | |
| Male | | | | |
| Shared in M+F | | | | |
| Joint Assembly | | | | |
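The retention thresholds stated in the caption above (identity > 70%, E-value < 1 × 10⁻⁵⁰, query coverage > 70%) and the "Found" and "Percentage" columns of the table can be illustrated with a short filter over tab-separated BLAST hits. The five-column layout assumed here (query ID, subject ID, percent identity, E-value, query coverage) is an assumption about how the hits were exported, and the function names are hypothetical.

```python
def passes(pident, evalue, qcov, min_id=70.0, max_e=1e-50, min_cov=70.0):
    """Apply the three retention thresholds from the caption."""
    return pident > min_id and evalue < max_e and qcov > min_cov

def filter_hits(lines):
    """Keep (query, subject) pairs whose hit passes all thresholds.

    Each line is assumed to hold tab-separated fields in the order:
    query ID, subject ID, percent identity, E-value, query coverage.
    """
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        qseqid, sseqid, pident, evalue, qcov = line.rstrip("\n").split("\t")[:5]
        if passes(float(pident), float(evalue), float(qcov)):
            kept.append((qseqid, sseqid))
    return kept

def percent_found(kept, n_queries):
    """Number and percentage of distinct queries with at least one retained hit."""
    found = {query for query, _ in kept}
    return len(found), 100.0 * len(found) / n_queries
```

Counting distinct queries rather than hits keeps a gene that matches several scaffolds from being counted more than once.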
Figure 3. Predicted gene structures verified by PCR amplification experiments. (A, B) Diagrams showing primer locations and predicted gene structures of SmGTPase01 (A) and p40 (B). Black arrows indicate primer locations, light gray indicates introns, UTR regions are dark gray and labeled, and exons encoding potential protein domains are labeled as shown. Cream-colored regions in panel A do not have a specified domain. Diagrams were drawn using GenomeDiagram as part of the Biopython (v. 1.6) package (Pritchard ; Cock ). Each row in the diagrams of panels A and B corresponds to approximately 1,000 bp. For primer sequences, see Methods. (C and D) The ladder is the Thermo Fisher MassRuler ladder. (C) PCR products for SmGTPase01 from male or female cDNA and gDNA. All products are 873 bp long. Male cDNA PCR was negative. (D) PCR products for p40 from male or female cDNA and gDNA. The expected band for p40 cDNA is 939 bp and for gDNA is 1,630 bp. Male cDNA PCR was negative. Sequence analysis of PCR amplification products confirmed gene prediction results.