
Genome and transcriptome of the porcine whipworm Trichuris suis.

Aaron R Jex1, Peter Nejsum2, Erich M Schwarz3, Li Hu4, Neil D Young1, Ross S Hall1, Pasi K Korhonen1, Shengguang Liao4, Stig Thamsborg2, Jinquan Xia4, Pengwei Xu4, Shaowei Wang4, Jean-Pierre Y Scheerlinck1, Andreas Hofmann5, Paul W Sternberg6, Jun Wang7, Robin B Gasser1.   

Abstract

Trichuris (whipworm) infects 1 billion people worldwide and causes a disease (trichuriasis) that results in major socioeconomic losses in both humans and pigs. Trichuriasis relates to an inflammation of the large intestine manifested in bloody diarrhea, and chronic disease can cause malnourishment and stunting in children. Paradoxically, Trichuris of pigs has shown substantial promise as a treatment for human autoimmune disorders, including inflammatory bowel disease (IBD) and multiple sclerosis. Here we report whole-genome sequencing at ∼140-fold coverage of adult male and female T. suis and ∼80-Mb draft assemblies. We explore stage-, sex- and tissue-specific transcription of mRNAs and small noncoding RNAs.


Year:  2014        PMID: 24929829      PMCID: PMC4105696          DOI: 10.1038/ng.3012

Source DB:  PubMed          Journal:  Nat Genet        ISSN: 1061-4036            Impact factor:   38.330


Main

Soil-transmitted helminths, including whipworm (Trichuris), hookworms (Necator and Ancylostoma) and the large roundworm (Ascaris), are among the most prevalent and devastating parasites of humans globally and predominate in impoverished nations[1]. Trichuris infects 1 billion people, and chronic infection of high intensity can lead to typhlitis, colitis, chronic dysentery and malnutrition through malabsorption as well as reduced physical and cognitive development[2]. Consequently, trichuriasis, which disproportionately affects children, has an estimated global burden of 1 million–6.5 million disability-adjusted life years, exceeding that of schistosomiasis, trachomiasis, trypanosomiasis or leishmaniasis[1]. Despite this, Trichuris species are classified by the World Health Organization as neglected parasites in urgent need of improved control[3].

Contrasting with the substantial burden of trichuriasis and other neglected helminths is the observation that human populations in endemic countries tend to suffer from substantially fewer immunopathological diseases[4], which are common and increasingly prevalent[5] in countries in which exposure to pathogens is limited. These observations have inspired the 'hygiene hypothesis'[6], which proposes that a lack of exposure of humans to common pathogens impairs immune function and leads to increased autoimmune disease. This hypothesis is supported by clinical data, with routine deworming positively[7] and early-childhood helminth infection negatively[8] correlating with autoimmune disorders. Recent studies have shown that porcine Trichuris (T. suis) administered to humans suffering from IBD (including Crohn's disease and ulcerative colitis) can reduce clinical symptoms[9,10]. Similar observations have been made in patients with multiple sclerosis[11].

Although helminths can alter immune responses in their hosts via a variety of excretory-secretory (ES) molecules[12], the specific interactions between T. suis and its host remain unclear. By sequencing the T. suis genome and transcriptomes (mRNAs and small RNAs), we provide deep insights into the molecular biology of this parasite and its modulation of host immune responses. These data provide a solid basis for exploring human trichuriasis, developing new anti-parasitic drugs and elucidating how helminths suppress autoimmune disorders.

Results

Sequencing, assembly and synteny

We sequenced the genomes of single adult female and male T. suis at ∼140-fold coverage, producing draft assemblies of 76 and 81 Mb, respectively (Table 1, Supplementary Figs. 1 and 2 and Supplementary Tables 1 and 2). Matches to conserved eukaryotic genes indicated that each assembly is 96% complete, with minimal redundancy (Supplementary Table 3). Alignment of these assemblies showed high similarity, with 68 Mb aligning as direct one-to-one matches in blocks of a mean length of 2.5 kb. Overall, sequence identity was 99.2% (38,854 SNPs). Despite the reported XX and XY karyotypes for female and male Trichuris, respectively[13], we found no evidence for a Y chromosome among the male-specific scaffolds, suggesting that this chromosome contains largely repetitive sequences common to both sexes; this finding is consistent with observations made of the T. suis karyotype, suggesting that the sex chromosomes were the smallest chromosomal pair and were morphologically very similar in both sexes[13]. Repetitive sequences comprise ∼32% of the genome, including 8% DNA transposable elements, 2.9% long tandem repeats and 3.3% retrotransposons (Supplementary Table 4). Each genome encoded ∼1,000 transfer RNA (tRNA) genes, with copy numbers reflecting codon usage in protein-encoding regions (Supplementary Fig. 3 and Supplementary Tables 5 and 6).
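The ∼140-fold coverage figure can be checked against the read totals and k-mer genome-size estimates reported in Table 1. The sketch below assumes that coverage is simply total sequenced bases divided by estimated genome size; the function name is illustrative and is not taken from the paper.

```python
def fold_coverage(total_bases_gb: float, genome_size_mb: float) -> float:
    """Estimated fold coverage = sequenced bases / genome size.

    Assumption: read data are given in Gb and genome size in Mb,
    as in Table 1, so Gb is converted to Mb before dividing.
    """
    if genome_size_mb <= 0:
        raise ValueError("genome size must be positive")
    return (total_bases_gb * 1_000) / genome_size_mb


# Values as reported in Table 1 (male and female genomes).
male = fold_coverage(11.73, 83.6)
female = fold_coverage(12.36, 87.2)
print(f"male: {male:.1f}x, female: {female:.1f}x")
```

Rounded to whole numbers, both values agree with the 140× and 142× coverage listed in Table 1.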
Table 1

Features of the scaffolded assembly of the adult male and female T. suis genomes

Feature | Male genome | Female genome
k-mer (17 nucleotides) estimated genome size (in Mb) | 83.6 | 87.2
Total read data; estimated coverage | 11.73 Gb; 140× | 12.36 Gb; 142×
Total scaffolded assembly size (Mb); total scaffolds | 81.3; 60,856 | 76.0; 42,663
Total scaffolds of >200 bp: length (Mb); no. of scaffolds | 74.2; 4,293 | 71.0; 3,288
Largest scaffold (Mb) | 1.59 | 1.44
N50 in kb (scaffolds >200 bp) (a) | 500 | 440
N90 in kb (scaffolds >200 bp); total number for >N90 (a) | 81.0; 185 | 104; 168
% GC content (whole genome; scaffolds >200 bp) | 43.9; 43.6 | 43.6; 43.5
% repetitive sequence (scaffolds >200 bp) | 31.7 | 32.3
Proportion of the genome that is coding (%): exonic; including introns | 26.2; 66.2 | 29.2; 73.4
Number of protein-encoding genes | 14,781 | 14,470
Mean gene length (kb) | 3.7 | 3.9
Mean number of exons per gene | 5.4 | 5.7
Mean exon length (bp) | 271 | 270
Mean number of introns per gene | 4.4 | 4.7
Mean intron length (bp) | 511 | 509
Number of transfer RNAs | 991 | 1,021

(a) N50 and N90 denote that 50% and 90% of assembly, respectively, is represented by scaffolds of at least this size.
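The N50 and N90 definitions in the footnote can be made concrete with a short sketch. It follows the definition given here (the scaffold length at which scaffolds of at least that size cover the stated fraction of the assembly); the function name and the toy scaffold lengths are hypothetical and are not taken from the T. suis assemblies.

```python
def nx(lengths, fraction):
    """Return the Nx statistic for a list of scaffold lengths.

    Nx is the length L such that scaffolds of length >= L together
    cover at least `fraction` of the total assembly length
    (fraction=0.5 gives N50, fraction=0.9 gives N90).
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until the target share is reached.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    return min(lengths)  # safeguard; not reached for valid input


# Hypothetical toy assembly (lengths in kb), for illustration only.
toy = [100, 80, 60, 40, 20]
print(nx(toy, 0.5), nx(toy, 0.9))
```

In the toy assembly the two longest scaffolds already cover half of the 300 kb total, so N50 is 80 kb, whereas four scaffolds are needed to reach 90%, giving an N90 of 40 kb.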


Protein-encoding gene set

The female and male T. suis genomes encode at least 14,470 and 14,781 protein-encoding genes, respectively, representing ∼70% of each genome, including introns and exons. We identified 14,356 and 14,315 female and male genes with an ortholog or homolog in the opposite sex, with 10,403 genes being defined as unambiguous one-to-one orthologs. Evidence for sex-specific genes was limited, with just one and 41 supported as female and male specific, respectively (see Supplementary Note). Of these sex-specific genes, only three male genes have a predicted function, having homology to C. elegans frk-1 (encoding a receptor tyrosine-kinase), gpc-1 (encoding a G protein-coupled receptor (GPCR)) and his-66 (encoding a histone protein), respectively. The sex-specific genes show no clustering among scaffolds, providing little evidence for their association with the sex chromosomes. Most of the remaining differences in the genes of the two genders relate to a higher copy number of some genes in the male. From both assemblies, we defined a unified set of 14,820 genes for T. suis (Table 1 and Supplementary Table 7), with 12,910 (87.1%) supported by high-throughput RNA sequencing (RNA-seq) data. The majority (59.8%) of these genes have homologs (BLASTp cut-off: 1 × 10−5) in other nematodes, including 6,286 (42.4%), 6,340 (42.7%), 6,149 (41.5%) and 8,480 (57.2%) in Ascaris suum[14], Brugia malayi[15], Caenorhabditis elegans[16] and Trichinella spiralis[17], respectively (Fig. 1). Functions were assigned to 9,342 (63.0%) protein-encoding genes (Supplementary Tables 8, 9, 10, 11, 12, 13). Focusing on key functional or druggable proteins[14], we predicted 653 peptidases and 288 peptidase inhibitors. Peptidase classes S1 (116) and S8 (42) are expanded in T. suis compared with those represented in other nematode genomes[14,15,16,17]. The T. suis genome also encodes 269 phosphatases and 232 kinases. 
We identified a large complement of receptors and transporters[18]; these molecules include 228 GPCRs, as well as 1,962 channel, pore and transporter proteins. Among the last group are 133 peroxisomal protein importers, more than in A. suum (n = 74) (ref. 14), which suggests a greater importance of fatty-acid digestion and metabolism in T. suis.
Figure 1

Homologs shared between T. suis (class Enoplea, order Trichocephalida) and related nematode species.

We predicted 618 canonical ES proteins in adult T. suis (Supplementary Tables 14 and 15), including 165 proteases, many of which might have a role in disrupting intestinal epithelial cells in the host[19,20] and in the formation of the syncytial tunnel around the Trichuris stichosome[21]. Notable among these molecules are 33 chymotrypsin-like serine proteases, which have key roles in helminths associated with host invasion[22], immunosuppression[23] and tissue destruction[24]. In addition to proteases, non–membrane-bound transporters comprise a major component of the secretome. These transporters include 41 pore-forming toxins (porins), 25 of which have homology to the Trichuris trichiura porin TT47, which induces ion-conducting pores in planar lipid bilayers and assists in the formation of the syncytial tunnel in the intestinal epithelium[25].

Helminth-mediated immunomodulation by ES products is well documented[12]. Among the predicted T. suis ES proteins, we found a variety of immunomodulators (Supplementary Table 16). On the basis of these findings and available literature for helminths[12,26,27], we propose a Trichuris-driven immunomodulation model (Supplementary Fig. 4), in which the parasite suppresses inflammation by secreting (i) serpins to inhibit neutrophil cathepsins and elastases; (ii) apyrases to prevent conversion of regulatory T cells to pro-inflammatory T cells; (iii) cystatins to promote anti-inflammatory (producing interleukin-4 (IL-4) and IL-10) T cells by disrupting antigen presentation by dendritic and B cells; (iv) calreticulins that bind to dendritic cells and stimulate IL-4 production and limit inflammation by binding free calcium ions; and (v) molecular mimics[12] of host galectins, mammalian macrophage inhibitory factor and transforming growth factor-β that stimulate apoptosis in activated T cells, promote alternative activation of macrophages and block the stimulation of (proinflammatory) Toll-like receptor pathways. This model is consistent with the pathophysiology described for Trichuris infection[26], and probably operates in tandem with immunosuppressive processes linked to glycans[27] and lipids (for example, sphingolipids; see Supplementary Note).

Transcriptome and differential transcription

We explored stage-, sex- and tissue-specific transcription (mRNAs), with a focus on parasite-host interactions. We predicted 36,763 transcripts, with 15,174 of them being perfect matches to the intron/exon chain (excluding UTRs), as annotated in the genome, and 21,589 representing novel splice isoforms (Supplementary Fig. 5 and Supplementary Table 17). In total, 6,293 (43.7%) T. suis genes were predicted to encode at least two isoforms. The number of splice-isoforms per gene only moderately correlated with exon number (R² = 0.44; Supplementary Figs. 6 and 7). Alternative splicing correlated with gene function, reflected in an enrichment (P ≤ 0.05; Pearson's chi-squared analysis) of protein catabolism (i.e., proteases), membrane-bound ion transport and kinase activity among alternatively spliced genes and apoptosis among single splice-isoform genes. Genes with multiple and single splice isoforms also differed in their conservation and predicted essentiality. Of the 6,307 T. suis genes with C. elegans homologs, alternatively spliced genes predominated by a three-to-one margin (4,510 versus 1,875) and represented 204 (75.8%) of the 269 essential genes predicted. The latter observation needs to be considered when mining helminth genomes for novel drug targets. If a novel inhibitor targeting products of such genes interacts with one of the spliced domains, isoform switching may be sufficient to overcome its effect. Some protein domains were significantly associated (P ≤ 0.05; Pearson's chi-squared analysis) with specific, alternative splice events, with exon skipping and the use of alternative first or last exons appearing to differ in their functional implications for transcription (Fig. 2a). Most notable was an over-representation of substrate-binding motifs (for example, immunoglobulin, EGF-like or DnaJ) for genes biased toward transcripts with skipped exons. 
Remodeling of binding-motif structure through alternative splicing affects binding specificity in other organisms[28], and we propose that exon skipping is important in regulating binding specificity of proteins in T. suis. Given the varied functions associated with alternative first- or last-exon splicing events (Fig. 2a), we hypothesize that these specific modifications might play a part in regulating protein localization, another known role for alternative splicing[28].
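The enrichment tests in this section are reported as Pearson's chi-squared analyses at P ≤ 0.05. As a minimal sketch of how such a test works for a single functional category, the code below evaluates a 2×2 contingency table (alternatively spliced versus single-isoform genes, with versus without a given annotation). The counts are hypothetical, the function name is illustrative, and no continuity correction is applied; the paper does not state these implementation details.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson's chi-squared statistic and p-value (1 d.f.) for [[a, b], [c, d]].

    Rows: alternatively spliced / single-isoform genes.
    Columns: genes with / without the annotation of interest.
    No continuity correction is applied (an assumption of this sketch).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X >= chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, for illustration only.
chi2, p = chi2_2x2(60, 40, 20, 80)
print(f"chi2 = {chi2:.2f}, significant at 0.05: {p <= 0.05}")
```

A category would be called enriched among alternatively spliced genes when the p-value falls at or below 0.05 and the observed count in the first cell exceeds its expectation.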
Figure 2

Stage- and tissue-specific small-RNA and mRNA transcriptome of T. suis.

(a) Association between gene function and alternative splice variation. Charts show the inferred function of protein domains encoded by genes showing a statistically significant (P ≤ 0.05; Pearson's chi-squared analysis) positive (+) or negative (−) bias toward skipped exon (SE), alternative first (FE) or last exon (LE) splice events. Only genes encoding ten or more transcripts are included in this analysis. (b) Proportional representation of major protein classes or groups encoded by the genome (Gen), and their proportional abundance in all transcriptomic data (All) and in larval (L1/2, L3 and L4), adult male (Am), female (Af) and tissue-specific libraries, including in the male (Mp) and female posterior body (Fp) and the stichosome (St). (c) Self-organizing heatmap (transcripts per million (TPM) values normalized by gene) clustering miRNAs by their transcription abundance (represented as log2-transformed reads per kilobase per million reads (RPKM) values) in each larval, adult and tissue-specific library. (d) Self-organizing heatmap (TPM values normalized by gene) clustering 22A-RNAs by their transcription abundance (represented as log2-transformed RPKM values) in each larval, adult and tissue-specific library.
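Panels c and d refer to two abundance units, RPKM and TPM. The sketch below uses the standard definitions of both measures to show how they differ; the read counts and feature lengths are hypothetical, and the paper does not describe its exact normalization pipeline.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase per million mapped reads, one value per feature."""
    total_reads = sum(counts)
    if total_reads == 0:
        raise ValueError("no mapped reads")
    # count / (length in kb * total reads in millions)
    return [c * 1e9 / (total_reads * l) for c, l in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = [c / l for c, l in zip(counts, lengths_bp)]
    denom = sum(rates)
    if denom == 0:
        raise ValueError("no mapped reads")
    return [r * 1e6 / denom for r in rates]


# Hypothetical library with three features, for illustration only.
counts = [100, 200, 700]
lengths = [1000, 2000, 1000]
rpkm_values = rpkm(counts, lengths)
tpm_values = tpm(counts, lengths)
print(rpkm_values)
print(tpm_values)
```

TPM values always sum to one million within a library, which makes proportions comparable across libraries, whereas RPKM totals depend on the length distribution of the features that were sequenced.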

T. suis undergoes substantial developmental changes throughout its direct life cycle[29]. To understand developmental processes in this parasite, we used RNA-seq to characterize transcription in various stages, sexes or body portions (stichosomal versus all non-stichosomal tissues from male and female adult worms): first and second (L1/L2), third (L3) and fourth (L4) larval stages; adult male and female; the stichosome; and the adult male and female posterior bodies, excluding the stichosome (Supplementary Fig. 8 and Supplementary Table 18). Overall, a number of major functional classes of proteins showed higher representation in the transcriptome of T. suis than in the genome (Fig. 2b). 
Secretory proteins were notable in this regard, making up ∼4% of the T. suis gene set but representing ∼10% of the transcriptional abundance in all libraries. Peptidases, particularly secreted peptidases, were also over-represented in the transcriptome and, notably, were upregulated during larval development and in the stichosome. The stichosome is the thin, elongate anterior end of Trichuris embedded tightly within a syncytial tunnel[21] in the superficial layer of the large intestinal mucosa. Within this tunnel, the parasite secretes proteins and other molecules and absorbs nutrients from cell cytoplasm and surrounding tissue fluids, probably through thousands of bacillary cells[30].

Given its central importance in feeding and interaction with the host, we focused on transcription in the stichosome relative to the rest of the worm body (Supplementary Table 18; see Supplementary Note for detailed comparisons among other stages or tissues). Transcription was enriched for 2,210 genes (encoding 3,721 transcripts) in the stichosome relative to both the male and female posterior bodies (Supplementary Table 18). Among these genes are 160 peptidases (encoding 256 transcripts) and 41 porins (85 transcripts), supporting their role in host-tissue degradation and syncytial tunnel formation[19,25] (Supplementary Fig. 9). Also notable is the enrichment of a large number of secreted and membrane-bound transporters (222 genes encoding 371 transcripts) of various ions (for example, sodium, phosphate and calcium) and small molecules (for example, glucose and nucleosides). Sugar metabolism is enriched in the stichosome, suggesting that absorbed glucose is rapidly metabolized in the stichocytes. Also upregulated in the stichosome are transcripts associated with endocytosis and vesicle formation, lysozyme and peroxisome pathways as well as fatty acid and amino acid (cysteine and methionine; lysine) degradation.

At least one isoform of each putative immunomodulatory gene encoded by T. suis is transcribed in the stichosome, with 22 transcripts encoding galectins, serpins, venom allergen–like proteins, apyrase or calreticulin specifically enriched in the stichosome relative to both the male and female posterior bodies. Chymotrypsin-like (S1) serine proteases (n = 28 of 31 genes, and 51 of 135 transcripts) are also upregulated in the stichosome. Many are homologs of vertebrate plasmin, which is thought to regulate blood clotting in the host[31]. A poorly understood consequence of trichuriasis is bloody diarrhea[32], and some evidence suggests that Trichuris ingests blood[33]. It may be that some T. suis chymotrypsin-like serine proteases act as anticoagulants or assist in digesting blood, serum and tissue components (for example, fibrinogen). Notably, T. muris infection alters the mucus barrier in the host's gut epithelium, leading to an increased susceptibility to nematode infections[34] through the degradation of mucin 2 (Muc2) polymers[35]. Muc2 depolymerization by T. muris is blocked by chymostatin and antipain[35], suggesting a probable role for chymotrypsin-like and other serine proteases. Several of the secreted serine proteases enriched in the stichosome are homologs of Schistosoma mansoni serine protease 1 (SP1) and human kallikrein. The latter molecule regulates the degradation of kininogen to bradykinin, stimulating vasodilation, the cytosolic release of Ca2+, neutrophil recruitment and increased inflammation[36]. SP1 is a potent vasodilator in mice[37], suggesting that it has an ability to convert vertebrate kininogen to bradykinin. Given the anti-inflammatory capacity of T. suis, we propose that some of these chymotrypsin-like serine proteases might degrade host kininogen but do not enable bradykinin production, thereby preventing bradykinin receptor stimulation and, thus, inhibiting inflammation. Notably, bradykinin receptors have key roles in various autoimmune disorders, including IBD[38] and multiple sclerosis[36].

Genetic regulatory networks

The T. suis gene set has complete RNA-interference machinery, suggesting potential for functional genomic studies and indicating a role for small noncoding RNAs in gene regulation. We explored these small RNAs in T. suis (Supplementary Figs. 10 and 11 and Supplementary Tables 19, 20, 21) and produced ∼435 million sequence reads. Approximately 92% of these reads mapped to the T. suis genome, with 16%, 23% and 9% classified as microRNAs (miRNAs), small interfering RNAs (siRNAs) and tiny noncoding RNAs (tncRNAs), respectively. Approximately 4% of the small-RNA reads mapped with an antisense (>80% of reads) bias to transposable elements, consistent with Piwi-interacting RNAs (piRNAs)[39]. However, similarly to small RNAs in Ascaris suum[40], < 0.01% had characteristics consistent with 21U-RNAs, which function as piRNAs in C. elegans[41]. We identified 319 miRNAs, with 132 having close homologs in other nematodes (Supplementary Table 22). These miRNAs accounted for 16% of all small-RNA reads sequenced, with tsu-let-7 (50% of all miRNA reads), tsu-miR-1 (17%), tsu-novel-51 (8%; a homolog of tsp-novel-51 miRNA from T. spiralis) and tsu-miR-228 (4%) the most highly transcribed. Approximately two-thirds of the miRNAs were most abundant in larval stages, suggesting a central role in development, with a diminishing number of miRNAs enriched in adults (Fig. 2c). This trend was reversed in the transition from L4 to adult female. To explore the functional implications of differential transcription of these miRNAs, we predicted miRNA-binding sites linked to 3′ UTRs among 22,954 of the 23,824 mRNA isoforms (representing 7,180 genes) for which at least part of the 3′ UTR could be identified on the basis of RNA-seq data. We focused on miRNA-mRNA interactions recognized or proposed for C. elegans. Of the 785,143 predicted binding sites with homology to C. elegans (both miRNA and mRNA), 300,042 were supported by information in public databases and 3,238 by experimental findings[42]. 
Owing to differences in gene copy number, these 300,042 binding sites represented 45 and 62 miRNAs as well as 3,205 and 3,877 coding genes in C. elegans and T. suis, respectively. For T. suis, the shift from L3 to L4 coincides with a universal downregulation of 24 of these conserved miRNAs, including tsu-miR-1, tsu-miR-252 and tsu-miR-236 (the second, seventh and eighth most abundant miRNAs, respectively, in T. suis overall). We identified 69 transcripts enriched in L4 (relating to 62 T. suis genes and 61 C. elegans homologs) with binding sites for each of these miRNAs. Many of the C. elegans genes with inferred homology to these transcripts are involved in larval or embryonic development (for example, rol-3, slt-1 and sox-3), growth (for example, egl-4, unc-44 and lin-39) or early sexual determination (for example, sex-1; WormBase), suggesting similar functional roles in T. suis.

The maturation of T. suis to adulthood coincides with a variety of sex-specific changes in miRNA levels (relative to L4s). In both sexes, tsu-miR-228 (the fourth most abundantly transcribed miRNA in T. suis) and several isoforms of tsu-miR-61 were downregulated, and tsu-miR-34 and two 'minor' miRNAs (tsu-miR-256 and tsu-miR-50) were upregulated. Many of the coding genes (n = 447) upregulated in male and female adults are predicted to be co-regulated by tsu-miR-61 and tsu-miR-228. Homologs of these coding genes in C. elegans are enriched in GO terms (biological process) for embryonic and genital development, reproduction, morphogenesis and growth and metabolism, and include tbx-2, vps-16, xnp-1 and dyci-1 (WormBase). Notable among predicted tsu-miR-34–regulated genes were homologs of srp-2 (encoding serpin-2, an anti-inflammatory protein in helminths)[12], which is downregulated in the adult worm (with the exception of the stichosome) compared with larval stages.

When we compared male and female T. suis adults, we found that major differences in miRNA transcription also related to tsu-miR-61, tsu-miR-228, tsu-miR-236 and tsu-miR-252, highlighting their importance in this nematode. Enriched in males are tsu-miR-228 and four copies of tsu-miR-61, and in females, tsu-miR-236, tsu-miR-252 and one copy of tsu-miR-61 (with closest homology to cel-miR-61-5p). Considering the ambiguity associated with the enrichment of different tsu-miR-61 isoforms in both males and females, we focused on tsu-miR-228, tsu-miR-236 and tsu-miR-252. In males, the enrichment of tsu-miR-228 coincides with a downregulation of 412 transcripts (representing 320 T. suis genes) with a predicted tsu-miR-228 binding site. On the basis of their function in C. elegans homologs, we infer many of these transcripts to be linked to vulva development (for example, exc-4, mys-1, nekl-2 and sem-4), egg production (for example, cbd-1, nsy-1, ppt-1, unc-29 and unc-58) and embryogenesis and germline development (for example, bcat-1, lars-1, rnp-4, rpt-5, slt-1 and tbp-1). In females, enriched transcription of tsu-miR-236 and tsu-miR-252 coincides with a downregulation of 262 transcripts (representing 205 genes) predicted to be co-regulated by these miRNAs. Homologs of these 'female-suppressed' coding genes in C. elegans include genes involved in spermatogenesis (for example, cogc-5 and cpb-1), male mating or fertility (for example, goa-1 and odc-1), the regulation of germline specification or apoptosis (for example, glp-1, him-1, rpt-5, let-60, vps-16 and vps-41) and chemosensation (for example, crh-1, grk-2 and lys-2). Collectively, these data suggest that sexual dimorphism in T. suis might relate, at least partially, to post-transcriptional sex suppression by miRNAs rather than exclusive transcriptional promotion by mRNAs.

In addition to miRNAs, we identified 1,028,808 putative small RNAs mapping to coding regions of the genome. 
Of these RNAs, 673,355 mapped antisense (≥80% of reads at each location) to exons, suggesting a potential role as siRNAs[41]. Most abundant among the siRNAs predicted for T. suis (except those derived from males) were sequences of 24–25 nt with a 5′ guanine (i.e., 24G and 25G), compared with 22G and 26G sequences predominating among siRNAs predicted for other nematodes to date[40,41]. Putative siRNAs were predicted for 3,497 protein-encoding genes. Many siRNAs have key roles in germline tissues[40,41]. In T. suis, we identified transcripts for 508 coding genes, for which putative siRNAs were uniquely transcribed in the adult female and the female posterior body relative to the stichosome, the adult male and the male posterior body (Supplementary Tables 7 and 18; see URLs). These coding genes were enriched for transposable elements/transposases, histones or histone methyltransferases, DNA- or RNA-binding, chromatin folding and homeodomain-related proteins. Similarly, but in lower abundance, these functions were also enriched in relation to the 69 coding genes associated with siRNAs uniquely transcribed in the adult male and the male posterior body. We hypothesize that these highly transcribed siRNAs protect chromatin in the T. suis germline, and this hypothesis is supported by the observation that 162 of the female-enriched siRNAs are absent from the larval stages studied here (Supplementary Tables 7 and 18).
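The two criteria used for putative siRNAs in this paragraph (an antisense bias of at least 80% of reads at a location, and a size class such as 24G or 25G defined by length and 5′ nucleotide) can be sketched as follows. The function names and the example inputs are hypothetical; the paper does not describe its implementation.

```python
def is_antisense_biased(sense_reads, antisense_reads, threshold=0.80):
    """True if at least `threshold` of reads at a location map antisense."""
    total = sense_reads + antisense_reads
    return total > 0 and antisense_reads / total >= threshold


def size_class(seq):
    """Label a small RNA by length and 5' nucleotide, e.g. '24G' or '22A'."""
    if not seq:
        raise ValueError("empty sequence")
    return f"{len(seq)}{seq[0].upper()}"


# Hypothetical location with 1 sense and 9 antisense reads.
print(is_antisense_biased(1, 9))
# Hypothetical 24-nt sequence beginning with guanine.
print(size_class("GATTACAGATTACAGATTACAGAT"))
```

Applied per exon location, the first function would yield the antisense-mapped candidates, and tallying the second across candidates would give the kind of size-class distribution summarized in the text.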

Novel class of tncRNAs

Conspicuous among the T. suis small RNAs is an abundance of 22-nt sequences with a 5′ adenine cap. Although representing just 2.9% (n = 58,307) of consensus small RNAs, these sequences represent 9.2% of all small-RNA transcription. By location, 22-nt 5′-adenylated sequences are evenly distributed between coding and noncoding regions, and within noncoding regions, between annotated (such as transposable elements, tRNAs or other noncoding RNAs) and un-annotated spaces. However, 89% of transcription attributed to these sequences relates to un-annotated, noncoding space in the genome. On the basis of their size and abundance, these sequences are consistent with tncRNAs[43]; however, they have characteristics not previously attributed to this class. For instance, in T. suis, they have a clear strand bias, with 83% of their transcription occurring on the Watson (i.e., antisense) strand. These 'antisense'-biased tncRNAs (henceforth called 22A-RNAs) form 1,208 clusters (ranging from 22 to 11,831 nt) among 238 assembly scaffolds, with a median of three 22A-RNA sequences per cluster at a median spacing of 97 bp. Although 40% of multicopy 22A-RNA sequences are found in the same cluster, clusters comprising one repeated 22A-RNA sequence are rare, and sequences are often shared among clusters and genomic scaffolds, indicating that tandem duplication is not the only mechanism associated with cluster formation. Few transposable elements are found within 25 kb of these clusters, suggesting that their insertion or translocation within the genome is not recent.

At this stage, we can only speculate about the function(s) and mechanism(s) of action of these sequences. Their 100-nt genomic neighborhoods vary: some regions resemble (but do not overlap with) known protein-encoding sequences, others resemble known noncoding RNAs such as tRNAs, and still others resemble neither. Eleven of these neighborhoods show partial similarities to cryptic tRNAs, ∼100 nt in size, discovered in C. elegans by the modENCODE Consortium[44]. The sequences of the 22A-RNAs themselves are also heterogeneous, with only one over-represented 8-nt sequence motif (5′-A[CA]GATAT[GT]-3′) occurring in 4.5% (245 of 5,457) of 22A-RNA sequences (Supplementary Fig. 12). Given these findings, we propose that 22A-RNAs may be processed from larger noncoding RNAs of diverse types, some of which are highly conserved and familiar, others of which are both hypothetical and unfamiliar. Despite having no obvious promoter motif (such as that proposed for 21U-RNAs)[41], these sequences seem to be transcriptionally regulated, and their abundance varies substantially among stages, sexes and tissues in T. suis; including, notably, an enrichment in the adult male body and male posterior body relative to all other stages and tissues (Fig. 2d), which may suggest a role in the male germline. As a proportion of overall small RNA transcription, 22A-RNAs are most abundant in the stichosome, wherein they comprise ∼22% of all small-RNA reads determined. Indeed, the stichosome is notably restricted in its classes of small RNAs, with miRNAs (39% of all small-RNA reads from the stichosome) and 22A-RNAs dominating the small-RNA population in this organ. Whether this finding points to an involvement of these novel T. suis noncoding RNAs in host interactions deserves detailed investigation.
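The 22A-RNA definition (22 nt with a 5′ adenine) and the reported 8-nt motif 5′-A[CA]GATAT[GT]-3′ translate directly into a pattern search. The sketch below is illustrative only: the function names and example sequences are hypothetical, and treating U as T before matching is an assumption about how the sequences are written.

```python
import re

# Motif as written in the text; bracketed positions accept either base.
MOTIF = re.compile(r"A[CA]GATAT[GT]")


def normalize(seq):
    """Upper-case and write U as T (assumption: DNA alphabet, as in the motif)."""
    return seq.upper().replace("U", "T")


def is_22a(seq):
    """True for a 22-nt sequence that starts with adenine."""
    seq = normalize(seq)
    return len(seq) == 22 and seq.startswith("A")


def motif_fraction(seqs):
    """Fraction of 22A sequences that contain the motif anywhere."""
    candidates = [normalize(s) for s in seqs if is_22a(s)]
    if not candidates:
        return 0.0
    hits = sum(1 for s in candidates if MOTIF.search(s))
    return hits / len(candidates)


# Hypothetical sequences, for illustration only.
examples = [
    "AACGATATGCCGTTAGCTAGCA",  # 22 nt, 5' A, contains ACGATATG
    "A" + "C" * 21,            # 22 nt, 5' A, no motif
    "GACGATATGCCGTTAGCTAGCA",  # 5' G, so not a 22A sequence
]
print(motif_fraction(examples))
```

On the real data the same ratio corresponds to the 245 of 5,457 sequences reported, which is about 4.5%.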

Discussion

Globally, helminthiases are seriously neglected causes of morbidity and mortality. Genomic and transcriptomic explorations of T. suis should enable the design of urgently needed therapeutics against human trichuriasis, one of the world's most important and neglected helminthiases. An intriguing feature of T. suis is its possible use as a therapy for human autoimmune disorders[9,10,11]. A detailed characterization of how this parasite modulates the host immune response is thus a key priority. Secreted proteins (including cystatins and serpins, thioredoxin peroxidase and various putative mimics of host proteins) seem to have a central role in this process, primarily through inhibiting inflammation. Our findings indicate a role for parasite-derived lipids, including the inferred synthesis of β-glucosylceramide, a known anti-inflammatory and putative therapy for IBD[45], during the early developmental phase of T. suis (see Supplementary Note). It is likely that both proteins and lipids work in concert with N-linked glycans, which are known immunomodulators produced by T. suis[27], particularly in the L4 and adult stages, in which pathways associated with their synthesis are transcriptionally enriched (see Supplementary Note). The detailed characterization of these molecules in vitro and in vivo, using existing models of IBD and other autoimmune disorders, might pave the way for parasite-derived therapies[5]. Indeed, a better understanding of the T. suis–host interactions might shed new light on why helminth exposure seems crucial for the development of a healthy immune system in humans.

This is the first study to characterize the genomes of male and female individuals of a dioecious nematode. We found little evidence for sex-specific genes or assembly contigs, despite the reported XY karyotype of this species. Intriguingly, however, miRNAs seem to have a major role in regulating sexual development in this species, with tsu-miR-228 in male worms, and tsu-miR-236 and tsu-miR-252 in female worms, predicted to regulate and suppress key feminizing and masculinizing developmental genes, respectively. This is the first time that this has been observed for a metazoan.

Methods

Sample preparation and storage.

Trichuris suis were isolated from experimentally infected pigs inoculated orally with a single dose of 5,000–50,000 embryonated eggs (Animal Ethics Permission No. 2010/561-1914; University of Copenhagen). Individuals of T. suis were isolated at 10 (L1/L2 larvae), 18 (L3s), 28 (L4s) and 49 (adulthood) d after inoculation (p.i.)[29,46] and washed in physiological saline (37 °C) and RPMI 1640 (GIBCO) with antibiotic-antimycotic (GIBCO). Adult male and female T. suis were separated. Stichosomes were excised from whole adult worms (n = 10, irrespective of sex), pooled and frozen, as were the posterior portions of the worms (n = 10 of each sex). All stages or tissues were snap frozen in liquid nitrogen and stored at −80 °C.

Genomic sequencing and assembly.

Total genomic DNA was isolated from a single adult male and a single adult female T. suis[47,48]. Paired-end (insert sizes, 170 bp and 500 bp) and mate-paired (800 bp, 2 kb, 5 kb and 10 kb) libraries were constructed from total and whole-genome amplified (WGA) genomic DNA, respectively[14,49], and sequenced using a HiSeq 2000 machine (Illumina). Low-quality sequences, base-calling duplicates and adapters were removed using standard approaches. Sequence quality and heterozygosity were assessed by 17-mer frequency distribution[50], and genome sizes were estimated[51]. Corrected and filtered data were assembled into contigs using SOAPdenovo v2.0 (ref. 50) and assessed for accuracy using SOAP2aligner[52]. Assembly completeness and redundancy were assessed using CEGMA[53] and by aligning RNA-seq data using Bowtie2 (ref. 54).

RNA isolation and RNA-seq.

Total RNAs from L1/L2 (n = 50,000 from five pigs), L3 (n = 15,000 from four pigs) and L4 (n = 3,000 from two pigs), adult male (n = 10), adult female (n = 10), and stichosomal (mixed sex; n = 10) and nonstichosomal portions of adult females (n = 10) and males (n = 10) were individually purified using TriPure reagent (Roche). Polyadenylated (polyA+) RNA was purified from 10 μg of total RNA for each library using Sera-mag oligo(dT) beads and fragmented, purified and sequenced using HiSeq 2000 (refs. 14,49). Small noncoding RNAs (∼18–30 nt) were isolated from 10 μg of total RNA for each library by size fractionation on polyacrylamide gels, purified, adaptor-ligated, reverse transcribed, amplified by PCR and sequenced using HiSeq 2000. All RNA-seq data were adaptor trimmed and length and quality filtered using standard approaches.

Synteny and polymorphism analysis, and annotation of repeat content.

For comparative analysis, the assemblies for adult male and female T. suis were aligned using MUMmer3 (ref. 55). Repetitive sequences in each assembly were identified[14] using Tandem Repeats Finder (TRF)[56], RepeatMasker[57], LTR_FINDER[58], PILER[59] and RepeatScout[60], with a consensus population of predicted repetitive elements constructed in RepeatScout using fit-preferred alignment scores. Transfer RNAs were predicted using tRNA-SCAN[61]. The male assembly was explored for scaffolds likely to represent the male-specific Y chromosome[13,62]. Reads from all genomic sequence libraries for male and for female T. suis were aligned to the assembly of their own sex and to that of the opposite sex (both repeat-unmasked and hard-masked; i.e., male-to-male, male-to-female, female-to-male and female-to-female) using Bowtie2 (ref. 54). Contigs with >80% coverage in same-sex but <20% coverage in opposite-sex read alignments were deemed 'sex-specific'.
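The 'sex-specific' criterion can be sketched as a simple threshold filter. The sketch below assumes that coverage is expressed as the fraction of contig bases covered by aligned reads (the text does not define it further), and the function and contig names are hypothetical.

```python
def classify_contigs(same_sex_cov, opposite_sex_cov, high=0.80, low=0.20):
    """Return contigs covered >80% by same-sex reads and <20% by opposite-sex reads.

    Both arguments map a contig name to the fraction (0..1) of its bases
    covered by aligned reads; this definition of coverage is an assumption.
    """
    sex_specific = []
    for contig, same in same_sex_cov.items():
        # A contig with no opposite-sex alignments is treated as 0% covered.
        opposite = opposite_sex_cov.get(contig, 0.0)
        if same > high and opposite < low:
            sex_specific.append(contig)
    return sorted(sex_specific)

# Toy coverage values (hypothetical, for illustration only).
male_reads_on_male = {"contig_1": 0.95, "contig_2": 0.90, "contig_3": 0.50}
female_reads_on_male = {"contig_1": 0.05, "contig_2": 0.85}
candidates = classify_contigs(male_reads_on_male, female_reads_on_male)
print(candidates)
```

In this toy input only contig_1 meets both thresholds: contig_2 is well covered by reads from both sexes, and contig_3 is poorly covered even by same-sex reads.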

Prediction and functional annotation of the protein-encoding gene set.

The male and female protein-encoding gene sets of T. suis were inferred in MAKER2 (ref. 63). Briefly, (i) the nonredundant T. suis transcriptome was aligned to each assembly using BLAT[64] and filtered for full-length ORFs, which were used (ii) to train hidden Markov models (HMMs) for de novo gene prediction using SNAP[65] and AUGUSTUS[66], with these models supplemented using (iii) homologous genes from T. spiralis[17] and C. elegans[16]; (iv) all T. suis RNA-seq data from all libraries were used to infer each transcript using TopHat2 (ref. 67) and Cufflinks2 (refs. 68,69); (v) all HMM-predicted, homology and evidence-based information was then combined into a single consensus gene set; and (vi) genes overlapping with predicted repetitive regions of the genome and/or having significant (E < 1 × 10⁻⁵) BLASTn homology to known repetitive sequences (i.e., transposable elements) in RepBase[57] and no close homology to C. elegans or T. spiralis protein-encoding genes were removed. The male and female T. suis gene sets were unified by orthology prediction using InParanoid[70], with T. spiralis[17] as an out-group. Conserved protein domains encoded by each gene were identified using InterProScan[71], with these data used to infer Gene Ontology[72]. Using reciprocal BLASTp and OrthoMCL[73], the T. suis inferred proteome was clustered with predicted homologs or orthologs for other nematodes, including Ascaris suum[14], Brugia malayi[15], C. elegans[16] and T. spiralis[17]. Each contig was assessed for a known functional ortholog in the Kyoto Encyclopedia of Genes and Genomes (KEGG) using the KEGG Orthology-Based Annotation System (KOBAS)[74]. In addition, T. suis inferred proteins were compared by BLASTx/BLASTp with protein sequences available for A. suum, B. malayi, C. elegans and T. spiralis, and in the databases UniProt[75], SwissProt and TREMBL[76], as well as specialist databases for key protein groups represented in MEROPS[77], WormBase[78], KS-SARfari and GPCR-SARfari, and the Transporter Classification database (TCDB)[79]. ES proteins were predicted using Phobius[80] and by BLASTp comparison with the validated signal peptide database (SPD)[81] and proteomic data for the nematodes B. malayi[82,83] and Meloidogyne incognita[84] and the trematode Schistosoma mansoni[85].

Differential transcription analysis of mRNA.

Reconstruction and quantification (in fragments per kilobase per million reads (FPKMs)) of the T. suis transcriptome was conducted using TopHat2 (ref. 67) and Cufflinks2 (refs. 68,69). Predicted alternative splice events were classified[86]. Comparisons of splice events and gene function (based on encoded Pfam domains) were conducted by pairwise Pearson chi-squared analysis (P value ≤ 0.05). We also compared the relationship between gene essentiality and being a single or multi-isoform gene, with essentiality predicted[14]. Differential transcription was assessed using NOISeq[87], with 20% of the evaluated reads for each library used in five iterations to simulate technical replicates.
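Two quantities in this paragraph can be made concrete with a small sketch: the standard FPKM normalization, and the idea of drawing 20% of the reads in five iterations. NOISeq is an R package and is not called here; the subsampling function below is only a simplified stand-in for its replicate simulation, and all names and toy values are hypothetical.

```python
import random

def fpkm(fragments, length_bp, total_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_fragments)

def simulate_replicates(read_gene_ids, fraction=0.20, iterations=5, seed=0):
    """Draw `iterations` random subsets, each holding `fraction` of the reads.

    `read_gene_ids` lists the gene assigned to each read. Returns one
    gene -> count dictionary per simulated replicate.
    """
    rng = random.Random(seed)
    k = int(len(read_gene_ids) * fraction)
    replicates = []
    for _ in range(iterations):
        counts = {}
        for gene in rng.sample(read_gene_ids, k):
            counts[gene] = counts.get(gene, 0) + 1
        replicates.append(counts)
    return replicates

# Toy example: 500 fragments on a 2,000-bp transcript, 10 million fragments in total.
example_fpkm = fpkm(500, 2000, 10_000_000)
print(example_fpkm)

# Toy library of 100 reads assigned to two hypothetical genes.
toy_reads = ["gene_a"] * 70 + ["gene_b"] * 30
replicates = simulate_replicates(toy_reads)
print(len(replicates), [sum(r.values()) for r in replicates])
```

Each simulated replicate holds 20 of the 100 toy reads, so the per-gene counts vary between iterations much as they would between technical replicates.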

Annotation and differential transcription analysis of small noncoding RNAs.

Canonical miRNAs were identified and quantified in miRDeep2 (refs. 88, 89, 90), and supported using miRNAs published for A. suum[40], C. elegans[91], Haemonchus contortus[92], Brugia pahangi[92] and T. spiralis[93]. The 3′ UTR for each Cufflinks-predicted T. suis transcript was identified by comparison with the T. suis genome annotation. Each 3′ UTR was screened for miRNA binding sites using PITA[94]. These binding sites were filtered on the basis of homologous miRNA-transcript binding interactions predicted for C. elegans in curated databases (microRNA.org, RNA22 and TargetScan) or demonstrated empirically[42]. All non-miRNA reads from each small-RNA library for T. suis were then aligned to the male T. suis genome using Bowtie2 (ref. 54) and clustered using ShortStack[95], with a minimum cluster depth cutoff of 10. Small-RNA reads having perfect alignment overlap (i.e., the same start and stop position) were defined as homologous and condensed into a consensus sequence by majority rule. Each consensus small RNA was classified[41] and nucleotide diversity within homologous small-RNA reads was assessed using custom Perl scripts. Specific small-RNA sequences (for example, 21U-RNAs)[41] and their 5′ and 3′ flanking regions were explored for sequence motifs using MEME[96]. Differential transcription among stage- or tissue-specific libraries was assessed (in reads per million mapped reads, RPKM) for miRNAs, siRNAs and 22A-RNAs using NOISeq[87], and clustered by stage- or tissue-specific transcription pattern using R.
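The step in which reads with identical start and stop positions are condensed into a majority-rule consensus can be sketched as follows. The authors used custom Perl scripts that are not shown in the text, so this Python version is only an assumed reconstruction of the idea; the tuple layout and all names are hypothetical.

```python
from collections import Counter, defaultdict

def consensus_by_position(alignments):
    """Group reads sharing scaffold, start and end; return one consensus per group.

    `alignments` is an iterable of (scaffold, start, end, sequence) tuples.
    """
    groups = defaultdict(list)
    for scaffold, start, end, seq in alignments:
        groups[(scaffold, start, end)].append(seq)

    consensus = {}
    for key, seqs in groups.items():
        # Reads in a group span the same interval; truncate to the shortest
        # read as a guard in case lengths differ.
        length = min(len(s) for s in seqs)
        columns = zip(*(s[:length] for s in seqs))
        # Majority rule: keep the most frequent base in each column.
        consensus[key] = "".join(
            Counter(column).most_common(1)[0][0] for column in columns
        )
    return consensus

# Toy alignments (hypothetical, for illustration only).
toy_alignments = [
    ("scaffold1", 100, 104, "TAGCA"),
    ("scaffold1", 100, 104, "TAGCA"),
    ("scaffold1", 100, 104, "TAGGA"),
    ("scaffold2", 50, 54, "ACGTA"),
]
result = consensus_by_position(toy_alignments)
print(result)
```

How ties between equally frequent bases should be resolved is not stated in the text; this sketch keeps whichever base `Counter` reports first.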

Analysis of the genomic neighborhoods and primary sequences of 22A-RNAs.

We extracted 22A-RNAs with 100-nt flanks lacking scaffolding (N) residues, merged those with spatial overlaps along the genome, and further condensed them to 80% sequence identity with CD-HIT-EST[97]. We probed resultant nonredundant 22A-RNA regions for protein-encoding exons with BlastX[79], and known noncoding RNAs (ncRNAs) with INFERNAL 1.1/cmscan[98]. BlastX was run against predicted proteomes from all published nematode genomes in WormBase WS240 (ref. 87), as well as the T. suis male proteome from this study; cmscan was run against the ncRNA database RFAM 11.0 (ref. 99). Regions passing these filters were tested for similarity to (i) genomic DNA from other nematode species, (ii) other 22A-RNA neighborhoods and (iii) novel ncRNAs from C. elegans. The first was assayed by BlastN against published nematode genomic sequences in WormBase WS240 (ref. 87). The second was assayed by BlastN against 22A-RNA regions spatially merged for genomic overlaps but not condensed with CD-HIT-EST. The third was assayed by BlastN against a set of 8,126 C. elegans ncRNAs taken from the WS240 release of WormBase[87]. All searches used E-value thresholds of ≤10⁻³. A set of 25,259 C. elegans ncRNAs was obtained from WormBase WS240. We filtered out those named 'asRNA', 'rRNA', 'scRNA', 'snoRNA', 'snRNA' or 'tRNA', leaving 8,126 ncRNAs with no official similarity to well-known structures. Most of these ncRNAs had been discovered by modENCODE[45]; 176 others were long noncoding RNAs[100]. We then checked for previously undescribed motifs via RFAM 11.0 and cmscan. To discover whether 22A-RNA sequences contained novel motifs, we extracted 22A-RNA sequences without flanking protein-encoding or ncRNA similarities, merged them spatially and for 80% identity, and scanned with MEME[96], using a first-order Markov model from the adult male T. suis genome (via MEME's fasta-get-markov). 
We ran MEME with arguments: '-dna -revcomp -nsites 100 -bfile TS_M_200bpormore.1markov.txt -nmotifs 10 -evt 0.05 -minw 6 -maxw 22 -mod anr'. For one resulting 8-nt motif, we used FIMO[101] to determine where it occurred in original, unmerged 22A-RNA sequences, with arguments: '--bgfile TS_M_200bpormore.1markov.txt --output-pthresh 0.001'. The motif was displayed as a logarithmic WebLogo[102].

URLs.

WormBase, http://www.wormbase.org/; microRNA.org, http://www.microrna.org/microrna/home.do; RNA22, https://cm.jefferson.edu/rna22v1.0/; TargetScanWorm, http://www.targetscan.org/worm_52/; salient data files are accessible via http://gasser-research.vet.unimelb.edu.au/Trichuris_suis/ and ftp://ftp.wormbase.org/pub/wormbase/species/t_suis/; browsable male and female genomes are accessible via http://gasser-research.vet.unimelb.edu.au/jbrowse/JBrowse-1.11.2/index.html?data=TsuisMale/ and http://gasser-research.vet.unimelb.edu.au/jbrowse/JBrowse-1.11.2/index.html?data=TsuisFemale/, respectively, or through WormBase via ftp://ftp.wormbase.org/pub/wormbase/species/t_suis/.

Accession numbers.

All short-read data are available via Sequence Read Archive: SRR1041639, SRR1041640, SRR1041641, SRR1041642, SRR1041643, SRR1041644 (genomic DNA male); SRR1041645, SRR1041646, SRR1041647, SRR1041648, SRR1041649, SRR1041650 (genomic DNA female); SRR1041651, SRR1041652, SRR1041653, SRR1041654, SRR1041655, SRR1041656, SRR1041657, SRR1041658 (mRNA); SRR1041659, SRR1041660, SRR1041661, SRR1041662, SRR1041663, SRR1041664, SRR1041669, SRR1041670 (small RNA). Annotated assemblies of each genome are accessible via BioProject PRJNA208415 (male) and PRJNA208416 (female).