Literature DB >> 28588571

Genome-Wide Transcriptome Analysis Reveals Extensive Alternative Splicing Events in the Protoscoleces of Echinococcus granulosus and Echinococcus multilocularis.

Shuai Liu1, Xiaosu Zhou1, Lili Hao2, Xianyu Piao1, Nan Hou1, Qijun Chen1,3.   

Abstract

Alternative splicing (AS), as one of the most important topics in the post-genomic era, has been extensively studied in numerous organisms. However, little is known about the prevalence and characteristics of AS in Echinococcus species, which can cause significant health problems to humans and domestic animals. Based on high-throughput RNA-sequencing data, we performed a genome-wide survey of AS in two major pathogens of echinococcosis-Echinococcus granulosus and Echinococcus multilocularis. Our study revealed that the prevalence and characteristics of AS in protoscoleces of the two parasites were generally consistent with each other. A total of 6,826 AS events from 3,774 E. granulosus genes and 6,644 AS events from 3,611 E. multilocularis genes were identified in protoscolex transcriptomes, indicating that 33-36% of genes were subject to AS in the two parasites. Strikingly, intron retention instead of exon skipping was the predominant type of AS in Echinococcus species. Moreover, analysis of the Kyoto Encyclopedia of Genes and Genomes pathway indicated that genes that underwent AS events were significantly enriched in multiple pathways mainly related to metabolism (e.g., purine, fatty acid, galactose, and glycerolipid metabolism), signal transduction (e.g., Jak-STAT, VEGF, Notch, and GnRH signaling pathways), and genetic information processing (e.g., RNA transport and mRNA surveillance pathways). The landscape of AS obtained in this study will not only facilitate future investigations on transcriptome complexity and AS regulation during the life cycle of Echinococcus species, but also provide an invaluable resource for future functional and evolutionary studies of AS in platyhelminth parasites.

Entities:  

Keywords:  Echinococcus granulosus; Echinococcus multilocularis; alternative splicing; nextgeneration sequencing; transcriptome

Year:  2017        PMID: 28588571      PMCID: PMC5440512          DOI: 10.3389/fmicb.2017.00929

Source DB:  PubMed          Journal:  Front Microbiol        ISSN: 1664-302X            Impact factor:   5.640


Introduction

Cystic echinococcosis and alveolar echinococcosis are potentially fatal chronic and progressive zoonoses affecting humans and domestic animals (Eckert and Deplazes, 2004; Craig and Larrieu, 2006). These diseases are caused by the metacestode larval stage of two related cestodes, Echinococcus granulosus and Echinococcus multilocularis. The life cycles of both of the parasitic cestodes require intermediate hosts (wild or domesticated ungulates, rodents, and humans are the intermediate hosts for larval parasites), and definitive hosts (domestic dogs, foxes, and other wild carnivores are the definitive hosts for adult parasites). The intermediate hosts are infected by the oral uptake of eggs, which contain oncospheres. Once the eggs are activated within the host’s digestive system, the oncospheres hatch from the eggs, penetrate the intestinal wall, and are carried by the bloodstream to the liver, lungs, or other organs, turning into hydatid cysts filled with fluid and protoscoleces. Protoscoleces, as the key infectious component of the larval stages, can interact with both definitive and intermediate hosts. They can asexually proliferate in intermediate hosts and mature into adult parasites when the hydatid cysts are swallowed by the definitive host (McManus et al., 2003). The parasites have exploited daedal strategies to adapt to the distinct environments of different hosts and to evade host immune responses, yet the concrete mechanisms of these strategies remain poorly understood. Alternative splicing, as an essential characteristic of eukaryotic gene expression, is a post-transcriptional process that enables one gene to encode two or more mature mRNAs (Nilsen and Graveley, 2010). It plays a key role in the transmission of genetic information from DNA to proteins by expanding the coding capacity of genomes (Naftelberg et al., 2015). Alternative splice products may generate proteins with different structures, sub-cellular localizations, and molecular functions, some of which are putative targets for vaccines or drugs (Keren et al., 2010; Yeoh et al., 2015). Regulation of AS is as important as regulation of transcription to determine cell- and tissue-specific features, normal cell functioning, and responses of eukaryotic cells to external cues (Kornblihtt et al., 2013; Naftelberg et al., 2015). For the parasite, AS may play important roles in the host–parasite interplay through generating alternative isoforms with different functions (Hull and Dlamini, 2014). The discovery of AS giving rise to different isoforms of antigenic proteins seems to be indicative of a further means of immune evasion by the parasite (DeMarco et al., 2010; Hull and Dlamini, 2014). As one of the next-generation sequencing approaches, RNA-Seq has been applied in numerous studies that have altered our view of the extent and complexity of eukaryotic transcriptomes (Wang et al., 2009; Marguerat and Bahler, 2010). This technique enables researchers to survey the entire transcriptome in a very high-throughput and quantitative manner, which provides inviting information about transcriptional and post-transcriptional gene regulation, and gives unique insight into the variety of transcript structures and processing on a global scale (Mortazavi et al., 2008; Sultan et al., 2008). In particular, RNA-Seq data can be directly used to identify AS events in transcriptomes by searching for sequence reads that span splice junctions, which has dramatically enriched our knowledge of AS in the eukaryotic kingdom (Martin and Wang, 2011; Kornblihtt et al., 2013; Staiger and Brown, 2013; Hull and Dlamini, 2014; Filichkin et al., 2015). Recently, the decoding and availability of the genome sequences of two Echinococcus species has proved pivotal for the systematic dissection of the biology of the parasites (Tsai et al., 2013; Zheng et al., 2013). The main obstacle lies in that a precise estimate of AS events cannot be obtained merely based on genomic sequence analysis. As far as we know, deep-sequencing research explicitly related to AS in Echinococcus species has not been conducted, even though a few studies focusing on transcriptome profile analyses of E. granulosus (Pan et al., 2014) or E. multilocularis (Huang et al., 2016) have been reported using next-generation sequencing approaches. In the present study, we have investigated AS events in protoscolex transcriptomes of both E. granulosus and E. multilocularis using a high-throughput RNA-Seq approach. We identified 1000s of AS events in the two parasites, which can be classified into seven splicing patterns. Interestingly, intron retention was found to be the predominant splicing type in protoscoleces. This provides a valuable basis for a more precise, dynamic understanding of the transcriptome in Echinococcus species.

Materials and Methods

Ethics Statement

All procedures performed on animals in this study were conducted following the animal husbandry guidelines of the Chinese Academy of Medical Sciences and with permission from the Experimental Animal Committee of the Chinese Academy of Medical Sciences with the Ethical Clearance Number IPB-2011-8.

Parasites

All parasite materials were collected from the Xinjiang Uyghur Autonomous Region, China. For E. granulosus transcriptome sequencing, a unilocular cyst was isolated from a sheep liver infected with E. granulosus. After rinsing 10 times with normal saline, protoscoleces were aspirated from the hydatid cysts, washed three times with normal saline and then soaked in RNAlater solution (Ambion, CA, United States); samples were stored at -80°C until total RNA was isolated. Hydatid cysts of E. multilocularis were freshly isolated from male gerbils at 5–6 months after infection with E. multilocularis. After rinsing with normal saline, the cysts were opened and homogenized to a smooth consistency in 200 ml normal saline. The suspension was successively passed through 40, 80, 140, and 325 mesh metal sieves. The protoscoleces on the 325 mesh metal sieve were collected into a beaker and were washed five times with ice-cold normal saline. The storage method was as above.

RNA Extraction and Deep Sequencing of mRNA

For mRNA library construction and deep sequencing, total RNA was isolated using an RNeasy Mini kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. Potential contaminating genomic DNA was removed from RNA samples using a Turbo DNA-free kit (Ambion, CA, United States). The quantity and integrity of the total RNA was assessed using an Agilent 2100 Bioanalyzer (Agilent Technologies, United States). Normalized starting quantities of total RNA were used to prepare two biological replicates of Illumina sequencing libraries for each species using a TruSeqTM RNA sample preparation kit (Illumina, San Diego, CA, United States). Following the manufacturer’s instructions, libraries were constructed and sequenced using a paired-end read protocol with 100 bp of data collected per run on an Illumina HiSeqTM 2000 instrument housed at MininGene Biotechnology Co. Ltd, Beijing, China. Data analyses and base calling were performed using the Illumina instrument software.

mRNA Transcriptome Data Processing and Bioinformatic Analyses

A Perl program was written to remove low-quality reads (more than half of the bases had a quality value less than 5). The retained high-quality reads were mapped to the published genomes of E. granulosus (Tsai et al., 2013; Zheng et al., 2013) and E. multilocularis (Tsai et al., 2013) by HISAT (Kim et al., 2015), and then assembled with StringTie (Pertea et al., 2015) to construct unique transcript sequences, using the parameters: -g -b -u -o. AS events in a gene were detected by SUPPA (Alamancos et al., 2015). Cuffquant (Trapnell et al., 2010) was used to quantify the expression of genes and transcripts, and the raw expression values were normalized by Cuffnorm. The expression values were calculated for each sample based on the number of fragments per kilobase of exon per million reads mapped. InterPro domain information (Mulder et al., 2003) was annotated by InterProScan (release 53.0) (Zdobnov and Apweiler, 2001) and functional assignments mapped with GO (Harris et al., 2004). WEGO (v3.3) (Ye et al., 2006) was used for GO classification. To obtain an overview of the gene pathway networks, KEGG analysis was performed by comparing the transcripts with the KEGG database (release 58) (Kanehisa et al., 2006) using BLASTX (Altschul et al., 1997; Ouyang et al., 2007) at E-values ≤ 1E-10. A Perl script was used to retrieve KO information from the BLAST results, and then establish pathway associations between the transcripts and database. A Perl script was written to count the number of total genes and the number of AS genes in a certain functional pathway, then a significant score P-value was calculated using the R scripts under hypergeometric distribution. A function pathway with P-value < 0.05 was deemed as enrichment in AS genes.

Verification of AS Events by RT-PCR

To validate AS events experimentally, RT-PCR was performed for eight pairs of homologous genes that underwent different AS types in the two parasites. For each sample, 1 μg total RNA was reverse transcribed into first-strand cDNA using a SuperScript III Reverse Transcriptase kit (Invitrogen, Carlsbad, CA, United States) with oligo(dT)12-18 primer. Primers were designed based on the consensus gene sequences of the two parasites, and the primer sequences are listed in Supplementary Table S1. PCR was performed in a 25 μl reaction system using High-Fidelity PCR Master Mix (NEB) and the procedure was as follows: initial denaturation at 98°C for 1 min; 98°C for 10 s, 60°C for 30 s, and 72°C for 30 s, for 35 cycles; and a final extension at 72°C for 5 min. The PCR products were visualized by 1.5% agarose gel electrophoresis analysis.

Identification and Phylogenetic Analysis of the Echinococcus Venom-Allergen-Like Protein (VAL) Gene Family

Using protein sequences of Schistosoma mansoni VALs (Chalmers et al., 2008) as query sequences, the initial protein sequences of Echinococcus VALs were retrieved from the non-redundant protein sequence database of the National Center for Biotechnology Information by BLASTP search (E-value cut-off: 10-5). CD-HIT v4.5.4 software[1] was then used to eliminate redundant protein sequences. All remaining protein sequences were further examined for the presence of SCP/TAPS-representative protein domains by searching the Conserved Domain Database (Marchler-Bauer et al., 2011) of the National Center for Biotechnology Information. A phylogenetic relationship tree was built using the full-length amino acid sequences of Echinococcus VALs in the following steps. First, sequences were aligned using ClustalX (Larkin et al., 2007), then they were refined manually, and a phylogenetic tree was finally generated using MEGA 5.0 software (Tamura et al., 2011) by the neighbor-joining method (the bootstrap test was performed with 1000 replicates).

Results

Transcriptome Sequencing and Assembly

In this study, a total of 43,509,848 and 44,379,588 high-quality reads (100 bp/read) were obtained from worms of E. granulosus and E. multilocularis, respectively. Low-quality sequences were removed by a Perl program, and the retained high-quality reads were mapped to the published genomes of E. granulosus and E. multilocularis to construct unique transcript sequences. A total of 34,717,856 reads (79.79%) mapped to the E. granulosus genome reported by Tsai et al. (2013), 31,156,736 reads (71.61%) mapped to the E. granulosus genome reported by Zheng et al. (2013), and 38,882,179 reads (87.61%) mapped to the E. multilocularis genome (Tsai et al., 2013) (Table ). Therefore, all the subsequent assemblies of reads and bioinformatics analyses, such as gene annotation and AS events, were based on the first mapping results for E. granulosus. Using the software StringTie, 24,550 (7,925 known and 16,625 novel transcripts) and 23,771 transcripts (8,432 known and 15,339 novel transcripts) were assembled for E. granulosus and E. multilocularis, respectively, and the assembly yielded 11,330 genes (6,815 known and 4,515 novel genes) for E. granulosus and 10,101 genes (7,051 known and 3,050 novel genes) for E. multilocularis (Table and Supplementary Tables S2, S3), compared with the reference genome data. The mean length of transcripts for the two parasites was 2016 and 2086 bp, respectively (Table ). All sequence data has been deposited in the GEO database[2] with an accession number of GSE59173. Summary data of the transcriptome analysis. The sequence reads could be classified into four types: exon, intron, intergenic, and spliced. The proportions of the four sequence types in the E. granulosus transcriptome were 56% (exons), 4% (introns), 15% (intergenic), and 25% (spliced) (Figure ). The percentages of exons and introns in the E. multilocularis transcriptome were 55 and 3%, which is similar to the E. granulosus transcriptome; while the proportions of intergenic and spliced transcripts in the E. multilocularis transcriptome were 24 and 18%, which is significantly different from the proportions in the E. granulosus transcriptome (Figure ). In our transcriptome database, the total exon lengths for E. granulosus and E. multilocularis were 22,249,494 bp (19.80% of the genome) and 21,805,251 bp (19.17% of the genome), which were longer than the total exon lengths in the reference genome databases for E. granulosus (15,213,358 bp, 13.54%) and E. multilocularis (16,112,293 bp, 14.16%) (Tsai et al., 2013). Proportions of the four types of sequence reads mapping to the genomes of . The percentages of exon (the highest) and intron (the lowest) sequences were similar between the two parasites, whereas the percentages of spliced and intergenic sequences were significantly different between the two parasites.

A Large Number of AS Events Were Identified in E. granulosus and E. multilocularis

A total of 6,826 AS events from 3,774 E. granulosus genes (33.31%) and 6,644 AS events in 3,611 E. multilocularis genes (35.75%) were bioinformatically predicted (Figure and Supplementary Tables S4, S5). These AS events could be classified into seven types: intron retention, exon skipping, alternative donor site, alternative acceptor site, alternative first exon, alternative last exon, and mutually exclusive exon. The results showed that the proportions of the seven AS types were similar in E. granulosus and E. multilocularis. Intron retention (39 and 42%) was the most common AS type, followed by alternative donor site (21 and 17%), alternative acceptor site (17 and 18%), exon skipping (both 16%), and the least common splicing forms were alternative first exon (both 4%), alternative last exon (both 2%), and mutually exclusive exon (both 1%) in the two parasites, E. granulosus and E. multilocularis, respectively (Figure and Supplementary Tables S4, S5). General information about AS events in Prevalence of AS in protoscolex transcriptomes of the two parasites. (B) Proportions of the seven types of AS in protoscolex transcriptomes of the two parasites. EG, E. granulosus; EM, E. multilocularis; IR, intron retention; ADS, alternative donor site; AAS, alternative acceptor site; ES, exon skipping; AFE, alternative first exon; ALE, alternative last exon; MXE, mutually exclusive exon.

A Panel of AS Events Were Validated by RT-PCR

To validate the AS events predicted by bioinformatic analysis, RT-PCR was carried out on eight pairs of homologous genes producing different types of AS events in the two parasites. Four pairs of homologous genes with intron-retention events were serine/arginine-rich splicing factor (EG.4660 and EM.3827), SNARE protein Sec22 (EG.5720 and EM.2557), hypothetical protein (EG.3709 and EM.5622), and hypothetical protein (EG.5720 and EM.2557). The other four pairs of homologous genes with exon skipping events were peroxidasin (EG.9768 and EM.6188), filamin (EG.8983 and EM.6048), adducin-related protein (EG.7075 and EM.1228), and heat shock protein 70 (EG.583 and EM.6781). Agarose gel electrophoresis showed that bands corresponding with the predicted AS events could be detected in the RT-PCR products (Figures , ). Notably, some extra bands detected in the RT-PCR products were proved to be additional AS events from the corresponding genes by Sanger sequencing method (Figures , ). The results were indicative that the bioinformatic prediction based on the primary sequencing data was reliable, and additional AS events for some genes, which could not be sequenced nor predicted in the present study, also occurred in the two parasites. RT-PCR validation of intron-retention events in . A subset of intron-retention events (A–D) in protoscoleces were confirmed by RT-PCR. Paired arrows in the gene frames indicate where primers were designed for each AS event. RT-PCR products were visualized with 1.5% agarose gels. Electrophoretic bands corresponding with predicted AS events in E. granulosus and E. multilocularis are marked by red and blue boxes, respectively. M, marker; EG, E. granulosus; EM, E. multilocularis. RT-PCR validation of exon skipping events in . A subset of four exon skipping events (A–D) in protoscoleces were confirmed by RT-PCR. Paired arrows in the gene frames indicate where primers were designed for each AS event. RT-PCR products were visualized with 1.5% agarose gels. Electrophoretic bands corresponding with predicted AS events in E. granulosus and E. multilocularis are marked by red and blue boxes, respectively. M, marker; EG, E. granulosus; EM, E. multilocularis.

GO and KEGG Pathway Analyses of AS Genes

Gene Ontology analysis was performed to summarize and explore the functional categories of the genes that underwent AS events in the two parasites. A total of 2,364 AS genes of E. granulosus and 2,376 AS genes of E. multilocularis were annotated for different GO terms (Supplementary Tables S3, S4). The results showed that the proportions of enriched GO terms for AS genes were globally similar between E. granulosus and E. multilocularis based on biological process (Figure ), molecular function (Figure ), and cellular component (Figure ). For biological processes, genes with AS events were predominantly enriched in GO terms that were relevant to cellular process and metabolic process. For the molecular function category, the majority of AS genes were annotated with the term binding, followed by the term catalytic activity. Comparative GO analysis of genes with AS events between . GO terms were summarized according to three categories: biological process (A), molecular function (B), and cellular component (C). The outer ring represents AS genes of E. multilocularis, and the inner ring represents AS genes of E. granulosus. Numbers in brackets represent the AS genes of E. granulosus and E. multilocularis falling into the corresponding GO terms. Moreover, 1,099 and 1,115 AS genes were annotated for KEGG pathways. The enrichment pathway analysis showed that these genes were significantly enriched in 26 pathways relating to cellular processes, environmental information processing, genetic information processing, metabolism, and organismal systems (P < 0.05, Table ). For environmental information processing, the AS genes were mainly enriched in the phosphatidylinositol signaling system, cellular antigens, and Jak-STAT, VEGF, and Notch signaling pathways, suggesting that AS may play important roles in the response to host environmental stress. For metabolism, the most significantly enriched pathways were involved in purine, fatty acid, galactose, glycerolipid, or N-glycan metabolism or biosynthesis. For genetic information processing, the enriched AS genes were grouped mainly in basal transcription factors, RNA transport, and the mRNA surveillance pathway. For organismal systems, the significant enrichment of AS genes included those relating to development (osteoclast differentiation), the immune system (Fc epsilon RI, T cell receptor, and Toll-like receptor signaling pathway), and the endocrine system (GnRH signaling pathway) (Table ). Further experimental verification is necessary to clarify the biological regulation mechanisms of these AS events in Echinococcus species. Pathway enrichment analysis of AS genes in E. granulosus and E. multilocularis.

Sequence Conservation Analysis of Echinococcus AS Genes

Using AS gene sequences of E. multilocularis as the query database, we performed a BLASTX search to retrieve homologous genes in the following five organisms: E. granulosus, S. japonicum, S. mansoni, Caenorhabditis elegan, and Homo sapiens. A total of 3389 of E. multilocularis AS genes were identified as having homologous genes in the five organisms. Venn diagram analysis showed that there were three major groups of E. multilocularis AS genes: group I (conserved genes, 158 genes) (Figure and Supplementary Table S6), for which homologous genes could be retrieved in all five organisms; group II (Platyhelminthes genes, 300 genes) (Figure and Supplementary Table S7), for which homologous genes could only be retrieved in E. granulosus, S. japonicum, and S. mansoni; and group III (Cestoidea genes, 2360 genes) (Figure and Supplementary Table S8), for which homologous genes could only be retrieved in E. granulosus. Moreover, the percentages of E. granulosus genes with AS events from the three groups were compared to investigate whether AS events were related to their gene sequence conservation. In group I, 63.3% of E. granulosus homologs with AS events were identified, which was lower than the percentages for group II (66.7%), and group III (67.8%) (Figure ). Meanwhile, we compared the proportions of S. japonicum genes with AS events from group I and group II using our published S. japonicum transcriptome data (Piao et al., 2014). The percentage of S. japonicum homologs with AS events in group I (67.1%) was lower than the percentage in group II (73.0%) (Figure ). The representative homologous gene pairs with AS events from the three groups are listed in Table . Sequence conservation analysis of Venn diagram of Echinococcus AS genes with homologous genes in the five organisms. (B) The percentages of E. granulosus genes with AS events from the three groups. (C) The percentages of S. japonicum genes with AS events from groups I and II. Representative homologous gene pairs with AS events from groups I, II, and III.

Phylogenetic Relationship and AS Analyses of the Echinococcus VAL Gene Family

The VAL genes are members of the sperm-coating protein/Tpx-1/Ag5/PR-1/Sc7 (SCP/TAPS) superfamily, which may be important for chronic host/parasite interactions (Chalmers et al., 2008; Farias et al., 2012). It has been found that members of the VAL family exhibit multiple expression patterns and AS in the life cycle of S. mansoni (Chalmers et al., 2008) and S. japonicum (Liu et al., 2015). Based on the genome databases of the two species, we identified 22 E. granulosus VALs and 19 E. multilocularis VALs with complete SCP/TAPS domains. Next, a phylogenetic tree was reconstructed to examine the phylogenetic relationships among VALs in the two Echinococcus species. The result showed that the Echinococcus VAL family was divided into three major groups, and the majority of genes had homologous counterparts between the two Echinococcus species (Figure ). Using these nucleotide sequences of Echinococcus VALs as query sequences, local BLASTN searches were performed to identify the corresponding mRNA sequences in our RNA-Seq data. Nine E. granulosus VALs and nine E. multilocularis VALs were proved to be transcribed in protoscoleces (Figure ), and the others may be expressed in other developmental stages or the amount of their transcription was too low to be detected by our study. Interestingly, AS transcripts of E. granulosus VALs (CDS24051.1 and CDS22593.1) and the E. multilocularis VAL (CDS35801.1) from group II were discovered in our RNA-Seq data (Figure ). Therefore, AS events may be ubiquitous in the VAL family of parasitic platyhelminths, and is well worth further study. Molecular phylogenetic relationship of the VAL gene family of . The tree was divided into three phylogenetic groups and the bootstrap values (more than 50) are shown at the nodes. Asterisks represent genes transcribed in protoscoleces and black dots represent genes with AS. Eg, E. granulosus; Em, E. multilocularis.

Discussion

As one of major fields of enquiry in the post-genomic era, AS has been investigated in an increasing number of species over the past decade (Kalsotra and Cooper, 2011; Filichkin et al., 2015; Lunghi et al., 2016). It has been shown that AS plays a crucial role in enhancing transcriptomic and proteomic diversity, and its prevalence and characteristics vary considerably at many different levels, such as in species, tissues, and developmental stage (Kornblihtt et al., 2013). In this study, to the best of our knowledge, we have presented for the first time, a genome-wide analysis of AS events in protoscolex transcriptomes of E. granulosus and E. multilocularis. We found that approximately 33–36% of genes in the two Echinococcus species underwent AS events. This result is close to the AS percentage in S. japonicum (35%) (Piao et al., 2014) and Drosophila melanogaster (31%) (Graveley et al., 2011), and higher than the percentages in Danio rerio (17%) (Lu et al., 2010) and C. elegans (25%) (Ramani et al., 2011), but far below the percentage in H. sapiens (over 90%) (Pan et al., 2008; Wang et al., 2008). It is known that AS within the same species can be affected by developmental stage, tissue type, and growth conditions. Indeed, a higher AS percentage was obtained in schistosomulum and adult stages of S. japonicum (Wang et al., 2015), compared with the AS percentage in the adult stage only (Piao et al., 2014). Our estimate of AS events in Echinococcus species could be underestimated, since the parasite experiences several distinct environments during its complex life cycle, yet we only sequenced samples for one stage. We would expect a higher AS percentage to be observed in Echinococcus species when more developmental stages (such as adult, egg, and oncosphere) are analyzed. Alternative splicing events can be commonly classified into seven types: intron retention, exon skipping, alternative donor site, alternative acceptor site, alternative first exon, alternative last exon, and mutually exclusive exon (Black, 2003). Previous studies have revealed that exon skipping, alternative donor site, alternative acceptor site, and intron retention are the four basic AS types (Nilsen and Graveley, 2010), and the overall AS patterns vary across species, tissue types, and developmental stages (Wang et al., 2008; Gibilisco et al., 2016). Exon skipping is thought to be the most prevalent AS type in animals, whereas intron retention represents the most common AS form in plants and unicellular eukaryotes (Barbazuk et al., 2008). Unexceptionally, the four major types of AS covered the vast majority of AS events in this study. Intron retention is a process by which specific introns, or fragments thereof, remain unspliced in polyadenylated transcripts. It is noteworthy that intron retention, but not exon skipping, was the predominant AS type in protoscoleces based on our high-coverage poly(A)+ RNA-Seq data. The advent of new sequencing technologies and algorithms is providing growing evidence that intron retention is a universal mechanism common to many eukaryotic organisms (Lunghi et al., 2016). By applying a new pipeline for intron-retention detection to high-coverage poly(A)+ RNA-Seq data, Braunschweig et al. (2014) found that intron retention is far more frequent in mammals than previously believed, affecting transcripts from as many as three-quarters of intron-containing genes. Meanwhile, previous studies have shown that intron retention is developmentally regulated and specific to the developmental phase or tissue (Boothby et al., 2013; Wang et al., 2016). In a study by Dvinge and Bradley (2015), it was found that an increase in retained introns is common in most cancer transcriptomes affecting genes involved in RNA processing and nuclear export. Moreover, the identification of increasing numbers of regulated intron-retention events in multi-stage unicellular parasites and plants suggests that these mechanisms can facilitate adaptation to acute host environmental changes, or duly supply key molecules to invade and exploit different ecological niches (Filichkin et al., 2015; Lunghi et al., 2016). The amazingly high frequency of intron retention in protoscoleces of Echinococcus species revealed that intron retention may play crucial roles in the development of the parasite and in host–parasite interaction. It will be fascinating to investigate the diversity of AS occurring in different developmental stages of Echinococcus species in the future. Deeper mechanistic research of intron retention will provide novel insight into its role in regulating gene expression during the life cycle of the parasites. Intron-retention mechanisms deserve to be elevated from paltry biological events to key elements that facilitate the functional tuning of cells, tissues or even developmental stages of numerous eukaryotes, especially for animals. Gene Ontology analysis was performed to summarize and explore the functional categories of AS genes identified in this study. The results indicated that the proportions of enriched GO categories between the two parasites were generally consistent with each other in this study. Meanwhile, the KEGG pathway enrichment analysis showed that these AS genes tended to be enriched in multiple pathways, such as numerous metabolic pathways (e.g., purine, fatty acid, galactose, and glycerolipid metabolism) and key signaling pathways (e.g., Jak-STAT, VEGF, Notch, and GnRH signaling pathways). Interestingly, we found that 30 genes that underwent AS events were significantly enriched in mRNA surveillance pathways (P < 0.001), which are utilized by organisms to ensure the quality and fidelity of mRNA. As one of the well-known mRNA surveillance pathways (Lykke-Andersen and Jensen, 2015), the NMD pathway not only targets mRNAs containing premature termination codons for degradation, but also plays a key role in the regulation of physiological gene expression (Chang et al., 2007; Fatscher et al., 2015; Hug et al., 2016). Studies by Filichkin et al. (2010) and Sorber et al. (2011) are suggestive that a proportion of premature termination codon-containing transcripts generated by AS events may either produce truncated protein isoforms or tune gene expression post-transcriptionally by the NMD pathway. Although this highly conserved pathway has been reported in numerous eukaryotes, the NMD pathway has yet to be shown in Echinococcus species. Using human core members (UPF1, UPF2, and UPF3) of the NMD pathway as query sequences, we identified homologs to all three in the genomes of E. granulosus (CDS17643.1, CDS16560.1, and CDS18511.1) (Tsai et al., 2013; Zheng et al., 2013) and E. multilocularis (CDS42690.1, CDS41342.1, and CUT99229.1) (Tsai et al., 2013), suggesting the NMD pathway exists in Echinococcus species.

Conclusion

We have performed what is, to the best of our knowledge, the first genome-wide survey of AS events in Echinococcus species using a next-generation sequencing technology. We found that the AS prevalence was about 33–36%, and intron retention was the predominant AS type in the protoscolex transcriptomes of E. granulosus and E. multilocularis. Genes that underwent AS events were significantly enriched in dozens of pathways mainly related to metabolism, signal transduction, and genetic information processing. These results will not only significantly improve the re-annotation of Echinococcus genomes, but also provide an invaluable resource for future functional and evolutionary studies of AS in the platyhelminth parasites.

Author Contributions

Experiments were conceived and designed by SL and QC. Experiments were performed by XZ, SL, LH, XP, and NH. Data was analyzed by SL and XZ. The paper was written by SL, XZ, and QC.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Table 1

Summary data of the transcriptome analysis.

ClassE. granulosusE. multilocularis
Number of single reads43,509,84844,379,588
Number of reads and percentages mapped to genome34,717,856 (79.79%)a38,882,179 (87.61%)
31,156,736 (71.61%)b
Assembled genes (loci)11,33010,101
Known6,8157,051
Novel4,5153,050
Assembled transcripts24,55023,771
Known7,9258,432
Novel16,62515,339
Mean length of transcripts2,0162,087
Table 2

Pathway enrichment analysis of AS genes in E. granulosus and E. multilocularis.

Pathway IDDescriptionE. granulosus
P-valueE. multilocularis
P-value
All genesAS genesAll genesAS genes
Cellular processes
ko04111Cell cycle – yeast45222.16E - 0250311.67E - 04
ko04113Meiosis – yeast40219.36E - 0346258.56E - 03
Environmental information processing
ko04070Phosphatidylinositol signaling system25161.62E - 0326224.90E - 07
ko04630Jak-STAT signaling pathway1293.82E - 031181.50E - 02
ko04370VEGF signaling pathway22131.15E - 0223154.24E - 03
ko04330Notch signaling pathway1593.06E - 0213103.26E - 03
ko04090Cellular antigens29153.08E - 0228182.22E - 03
Genetic information processing
ko03022Basal transcription factors26142.45E - 0225171.17E - 03
ko03015mRNA surveillance pathway55309.30E - 0453317.20E - 04
ko03013RNA transport92481.38E - 0487474.58E - 04
Metabolism
ko00510N-glycan biosynthesis30201.88E - 0430236.78E - 06
ko00562Inositol phosphate metabolism20133.67E - 0320188.70E - 07
ko00513Various types of N-glycan biosynthesis26181.92E - 0423172.47E - 04
ko00071Fatty acid metabolism1071.95E - 021073.09E - 02
ko00280Leucine and isoleucine degradation766.82E - 03862.96E - 02
ko00052Galactose metabolism19112.38E - 0219114.33E - 02
ko00640Propanoate metabolism1293.82E - 031391.56E - 02
ko00563GPI-anchor biosynthesis17117.92E - 0316102.85E - 02
ko00061Fatty acid biosynthesis333.69E - 02334.70E - 02
ko00561Glycerolipid metabolism18138.38E - 0420138.04E - 03
ko00230Purine metabolism84412.26E - 0386411.77E - 02
Organismal systems
ko04380Osteoclast differentiation17117.92E - 0316102.85E - 02
ko04664Fc epsilon RI signaling pathway17102.71E - 0216118.04E - 03
ko04912GnRH signaling pathway26134.78E - 0231172.52E - 02
ko04660T cell receptor signaling pathway31171.09E - 0231181.03E - 02
ko04620Toll-like receptor signaling pathway14104.00E - 0314107.72E - 03
Table 3

Representative homologous gene pairs with AS events from groups I, II, and III.

AS
GroupEGa geneEM geneAnnotationAASADSAFEALEMXEIRES
IEG.664EM.6854Serine/threonine protein kinase+/•b
EG.5998EM.9833Similar to β-tubulin++/•+/•+/•
EG.7097EM.1247Dynamin++/•+/•+/•
EG.3316EM.4881Vacuolar proton ATPases+/•+++/•
EG.1088EM.7236Pre-mRNA-splicing factor CLF1+
EG.665EM.6855Phospholipase C+++/•++/•
EG.5106EM.4247Multiple ankyrin repeats single KH domain protein+/•+/•
EG.5383EM.2230N-Acetyltransferase 10++/•
EG.3284EM.4849Developmentally regulated GTP-binding protein 2++/•
EG.9830EM.6246Eukaryotic translation initiation factor 2c++/•
IIEG.9768EM.6188Peroxidasin+
EG.8983EM.6048Filamin+/•
EG.7075EM.1228Adducin-related protein+/•
EG.4660EM.3827Serine/arginine rich splicing factor+/•
EG.5720EM.2557SNARE protein Sec22+/•
EG.2713EM.8982Voltage-dependent calcium channel+/•+/•+/•+/•
EG.215EM.6429Hypothetical protein+/•
EG.3215EM.9455Aspartyl aminopeptidase++/•+/•
EG.8105EM.327Venom allergen-like protein++/•
EG.4029EM.5528Hypothetical protein+/•+/•
IIIEG.583EM.6781Heat shock protein 70+/•
EG.10583EM.149Acyl-CoA dehydrogenase 9+/•+/•
EG.3709EM.5622Hypothetical protein+/•
EG.5622EM2474Hypothetical protein+/•+/•
EG.1536EM7652Hypothetical protein++/•
EG.8356EM581Tegument antigen+/•
EG.7014EM.1953PDZ domain-containing protein GIPC3-like++/•
EG.1931EM.8226Hypothetical protein+/•+/•++/•+/•
EG.2541EM.8817Transmembrane protein 144+/•+
EG.2553EM.8828Tetraspanin+
  60 in total

1.  The InterPro Database, 2003 brings increased coverage and new features.

Authors:  Nicola J Mulder; Rolf Apweiler; Teresa K Attwood; Amos Bairoch; Daniel Barrell; Alex Bateman; David Binns; Margaret Biswas; Paul Bradley; Peer Bork; Phillip Bucher; Richard R Copley; Emmanuel Courcelle; Ujjwal Das; Richard Durbin; Laurent Falquet; Wolfgang Fleischmann; Sam Griffiths-Jones; Daniel Haft; Nicola Harte; Nicolas Hulo; Daniel Kahn; Alexander Kanapin; Maria Krestyaninova; Rodrigo Lopez; Ivica Letunic; David Lonsdale; Ville Silventoinen; Sandra E Orchard; Marco Pagni; David Peyruc; Chris P Ponting; Jeremy D Selengut; Florence Servant; Christian J A Sigrist; Robert Vaughan; Evgueni M Zdobnov
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

2.  A serine-arginine-rich (SR) splicing factor modulates alternative splicing of over a thousand genes in Toxoplasma gondii.

Authors:  Lee M Yeoh; Christopher D Goodman; Nathan E Hall; Giel G van Dooren; Geoffrey I McFadden; Stuart A Ralph
Journal:  Nucleic Acids Res       Date:  2015-04-13       Impact factor: 16.971

3.  A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome.

Authors:  Marc Sultan; Marcel H Schulz; Hugues Richard; Alon Magen; Andreas Klingenhoff; Matthias Scherf; Martin Seifert; Tatjana Borodina; Aleksey Soldatov; Dmitri Parkhomchuk; Dominic Schmidt; Sean O'Keeffe; Stefan Haas; Martin Vingron; Hans Lehrach; Marie-Laure Yaspo
Journal:  Science       Date:  2008-07-03       Impact factor: 47.728

4.  Removal of retained introns regulates translation in the rapidly developing gametophyte of Marsilea vestita.

Authors:  Thomas C Boothby; Richard S Zipper; Corine M van der Weele; Stephen M Wolniak
Journal:  Dev Cell       Date:  2013-02-21       Impact factor: 12.270

5.  Genome-wide analysis of alternative splicing in Caenorhabditis elegans.

Authors:  Arun K Ramani; John A Calarco; Qun Pan; Sepand Mavandadi; Ying Wang; Andrew C Nelson; Leo J Lee; Quaid Morris; Benjamin J Blencowe; Mei Zhen; Andrew G Fraser
Journal:  Genome Res       Date:  2010-12-22       Impact factor: 9.043

Review 6.  Biological, epidemiological, and clinical aspects of echinococcosis, a zoonosis of increasing concern.

Authors:  Johannes Eckert; Peter Deplazes
Journal:  Clin Microbiol Rev       Date:  2004-01       Impact factor: 26.132

7.  Genome-wide mapping of alternative splicing in Arabidopsis thaliana.

Authors:  Sergei A Filichkin; Henry D Priest; Scott A Givan; Rongkun Shen; Douglas W Bryant; Samuel E Fox; Weng-Keen Wong; Todd C Mockler
Journal:  Genome Res       Date:  2009-10-26       Impact factor: 9.043

Review 8.  RNA-Seq: a revolutionary tool for transcriptomics.

Authors:  Zhong Wang; Mark Gerstein; Michael Snyder
Journal:  Nat Rev Genet       Date:  2009-01       Impact factor: 53.242

9.  The developmental transcriptome of Drosophila melanogaster.

Authors:  Brenton R Graveley; Angela N Brooks; Joseph W Carlson; Michael O Duff; Jane M Landolin; Li Yang; Carlo G Artieri; Marijke J van Baren; Nathan Boley; Benjamin W Booth; James B Brown; Lucy Cherbas; Carrie A Davis; Alex Dobin; Renhua Li; Wei Lin; John H Malone; Nicolas R Mattiuzzo; David Miller; David Sturgill; Brian B Tuch; Chris Zaleski; Dayu Zhang; Marco Blanchette; Sandrine Dudoit; Brian Eads; Richard E Green; Ann Hammonds; Lichun Jiang; Phil Kapranov; Laura Langton; Norbert Perrimon; Jeremy E Sandler; Kenneth H Wan; Aarron Willingham; Yu Zhang; Yi Zou; Justen Andrews; Peter J Bickel; Steven E Brenner; Michael R Brent; Peter Cherbas; Thomas R Gingeras; Roger A Hoskins; Thomas C Kaufman; Brian Oliver; Susan E Celniker
Journal:  Nature       Date:  2010-12-22       Impact factor: 49.962

10.  From genomics to chemical genomics: new developments in KEGG.

Authors:  Minoru Kanehisa; Susumu Goto; Masahiro Hattori; Kiyoko F Aoki-Kinoshita; Masumi Itoh; Shuichi Kawashima; Toshiaki Katayama; Michihiro Araki; Mika Hirakawa
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

View more
  10 in total

1.  Identification of Functional MKK3/6 and MEK1/2 Homologs from Echinococcus granulosus and Investigation of Protoscolecidal Activity of Mitogen-Activated Protein Kinase Signaling Pathway Inhibitors In Vitro and In Vivo.

Authors:  Chuanshan Zhang; Jing Li; Tuerganaili Aji; Liang Li; Xiaojuan Bi; Ning Yang; Zhide Li; Hui Wang; Rui Mao; Guodong Lü; Yingmei Shao; Dominique A Vuitton; Hao Wen; Renyong Lin
Journal:  Antimicrob Agents Chemother       Date:  2018-12-21       Impact factor: 5.191

2.  Transcriptomic Features of Echinococcus granulosus Protoscolex during the Encystation Process.

Authors:  Junjie Fan; Hongye Wu; Kai Li; Xunuo Liu; Qingqing Tan; Wenqiao Cao; Bo Liang; Bin Ye
Journal:  Korean J Parasitol       Date:  2020-06-26       Impact factor: 1.341

3.  Profiling Novel Alternative Splicing within Multiple Tissues Provides Useful Insights into Porcine Genome Annotation.

Authors:  Wen Feng; Pengju Zhao; Xianrui Zheng; Zhengzheng Hu; Jianfeng Liu
Journal:  Genes (Basel)       Date:  2020-11-26       Impact factor: 4.096

4.  Intestinal transcriptomes in Kazakh sheep with different haplotypes after experimental Echinococcus granulosus infection.

Authors:  Xin Li; Song Jiang; Xuhai Wang; Bin Jia
Journal:  Parasite       Date:  2021-03-05       Impact factor: 3.000

5.  iTRAQ-based comparative proteomic analysis in different developmental stages of Echinococcus granulosus.

Authors:  Xin Li; Song Jiang; Xuhai Wang; Wenqiao Hui; Bin Jia
Journal:  Parasite       Date:  2021-03-05       Impact factor: 3.000

6.  Transcriptome analysis of Echinococcus granulosus sensu stricto protoscoleces reveals differences in immune modulation gene expression between cysts found in cattle and sheep.

Authors:  Ismael Pereira; Christian Hidalgo; Caroll Stoore; María Soledad Baquedano; Carolina Cabezas; Macarena Bastías; Aníbal Riveros; Claudio Meneses; Martín Cancela; Henrique Bunselmeyer Ferreira; Leonardo Sáenz; Rodolfo Paredes
Journal:  Vet Res       Date:  2022-01-28       Impact factor: 3.683

7.  Comparative transcriptome profiles of Schistosoma japonicum larval stages: Implications for parasite biology and host invasion.

Authors:  Shaoyun Cheng; Bingkuan Zhu; Fang Luo; Xiying Lin; Chengsong Sun; Yanmin You; Cun Yi; Bin Xu; Jipeng Wang; Yan Lu; Wei Hu
Journal:  PLoS Negl Trop Dis       Date:  2022-01-13

8.  Genome-wide transcriptome analysis of the early developmental stages of Echinococcus granulosus protoscoleces reveals extensive alternative splicing events in the spliceosome pathway.

Authors:  Mohammad Ali Mohammadi; Majid Fasihi Harandi; Donald P McManus; Mehdi Mansouri
Journal:  Parasit Vectors       Date:  2021-11-12       Impact factor: 3.876

9.  Coding and Noncoding RNA Expression Profiles of Spleen CD4+ T Lymphocytes in Mice with Echinococcosis.

Authors:  Songhao Yang; Xiancai Du; Chan Wang; Tingrui Zhang; Shimei Xu; Yazhou Zhu; Yongxue Lv; Yinqi Zhao; Mingxing Zhu; Lingna Guo; Wei Zhao
Journal:  Contrast Media Mol Imaging       Date:  2022-04-08       Impact factor: 3.009

10.  Bioinformatic analysis of a novel Echinococcus granulosus nuclear receptor with two DNA binding domains.

Authors:  Gabriela Alvite; Ximena Riera; Saira Cancela; Margot Paulino; Adriana Esteves
Journal:  PLoS One       Date:  2019-11-11       Impact factor: 3.240

  10 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.