Orawan Phuphisut, Pravech Ajawatanawong, Yanin Limpanont, Onrapak Reamtong, Supaporn Nuamtanong, Sumate Ampawong, Salisa Chaimon, Paron Dekumyoy, Dorn Watthanakulpanich, Brett E Swierczewski, Poom Adisakwattana.
Abstract
BACKGROUND: Schistosoma mekongi is one of five major causative agents of human schistosomiasis and is endemic to communities along the Mekong River in southern Lao People's Democratic Republic (Laos) and northern Cambodia. Sporadic cases of schistosomiasis have been reported in travelers and immigrants who have visited endemic areas. The biology and molecular biology of S. mekongi are poorly understood, and few S. mekongi gene and transcript sequences are available in public databases.
Keywords: Differentially-expressed genes; Gene expression; RNA-Seq; Schistosoma mekongi; Transcriptome
Year: 2018 PMID: 30201055 PMCID: PMC6131826 DOI: 10.1186/s13071-018-3086-z
Source DB: PubMed Journal: Parasit Vectors ISSN: 1756-3305 Impact factor: 3.876
Statistical summary of de novo assembled transcriptome sequences from male and female S. mekongi adult worms
| Parameter | Male worms | Female worms |
|---|---|---|
| Summary of raw sequencing reads | | |
| Total raw reads^a | 346,331,720 | 411,906,616 |
| Total raw read nucleotides above Q30 (%) | 48,771,315,768 (93.9%) | 57,423,179,186 (92.9%) |
| Summary of trimmed sequencing reads | | |
| Total clean reads^a | 304,934,770 | 363,296,510 |
| Total clean read nucleotides above Q30 (%) | 43,463,738,987 (97.4%) | 51,257,988,579 (96.9%) |
| Summary of final transcriptome assembly contigs | | |
| Number of contigs | 119,604 | |
| Smallest contig length (nt) | 201 | |
| Largest contig length (nt) | 30,465 | |
| Number of contigs with length < 200 nt | 0 | |
| Number of contigs with length > 1k nt | 42,600 | |
| Number of contigs with length > 10k nt | 625 | |
| N50 (nt) | 2,107 | |
| GC content (%) | 34.1 | |

^a Reads of three replicates
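The N50 reported in the assembly summary is the contig length at which contigs of that length or longer account for at least half of all assembled nucleotides. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy contig lengths are hypothetical and are not taken from the S. mekongi assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in nt)."""
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to at least half of
        # the assembly size defines the N50.
        if running >= half_total:
            return length
    raise ValueError("no contig lengths supplied")


# Hypothetical toy assembly: total 3,000 nt, so half is 1,500 nt.
toy_lengths = [200, 300, 400, 500, 600, 1000]
print(n50(toy_lengths))
```

In the toy example the two longest contigs (1,000 and 600 nt) together cover 1,600 nt, which is the first point at which half of the 3,000 nt total is reached, so the N50 is 600.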
Summary of assembled transcript annotation
| Transcript annotation | No. of transcripts |
|---|---|
| Total transcripts | 119,604 |
| Total protein sequences | 20,798 |
| Number of protein sequences with BLASTP hits | 19,744 |
| Number of transcript sequences with BLASTX hits | 48,256 |
| Number of transcript sequences with GO terms (BLAST) | 21,833 |
| Number of transcript sequences with GO terms (Pfam) | 8,462 |
Fig. 1 Smear plot of all DE transcripts with cpm > 1.0 from S. mekongi male and female worms. The graph shows average log2(cpm) on the x-axis vs log2(fold change in gene expression) between male and female worms on the y-axis. DE transcripts are shown in red and non-DE transcripts are shown in black
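The two axes of the smear plot can be illustrated with a small calculation. This is a simplified sketch under stated assumptions, not the procedure used to produce the figure: the function names, the pseudo-count `prior`, the male-minus-female sign convention, and the example counts are all hypothetical, and dedicated differential expression packages apply additional normalization.

```python
import math


def cpm(count, library_size):
    """Counts per million: a read count scaled by the library size."""
    return count / library_size * 1_000_000


def smear_point(male_count, female_count, male_lib, female_lib, prior=0.5):
    """Return (average log2 cpm, log2 fold change) for one transcript.

    `prior` is an assumed pseudo-count that keeps log2 finite for zero counts.
    The fold change is computed as male relative to female (an assumption).
    """
    log_male = math.log2(cpm(male_count + prior, male_lib))
    log_female = math.log2(cpm(female_count + prior, female_lib))
    return (log_male + log_female) / 2, log_male - log_female


def passes_cpm_filter(count, library_size, threshold=1.0):
    """Keep transcripts with cpm above the threshold (1.0 in the caption)."""
    return cpm(count, library_size) > threshold


# Hypothetical counts in two libraries of one million reads each.
x, y = smear_point(400, 100, 1_000_000, 1_000_000, prior=0.0)
print(round(x, 3), round(y, 3))
```

With equal library sizes and no pseudo-count, a fourfold difference in counts gives a log2 fold change of 2 on the y-axis.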
Fig. 2 Heatmap of 50 most significantly DE transcripts from S. mekongi male vs female worms with three biological replicates. DE transcripts with higher expression levels are shown in red. DE transcripts with lower expression levels are shown in blue
Top 50 most significantly upregulated transcripts in S. mekongi male worms
| Transcript ID | Protein description | Log2(FC) | P-value | FDR | Pfam annotation |
|---|---|---|---|---|---|
| comp65_seq18 | Reverse transcriptase | 16.2 | 2.66E-92 | 1.24E-88 | – |
| comp2831_seq1 | Uncharacterized protein | 15.2 | 1.47E-70 | 2.29E-67 | – |
| comp1481_seq2 | Clone ZZD1216 mRNA sequence | 14.8 | 1.19E-70 | 2.01E-67 | – |
| comp926_seq0 | Uncharacterized protein | 14.2 | 3.84E-56 | 3.01E-53 | – |
| comp2062_seq0 | Uncharacterized protein | 13.8 | 1.53E-45 | 6.63E-43 | – |
| comp4508_seq0 | Integrin | 13.7 | 2.79E-47 | 1.30E-44 | – |
| comp4099_seq6 | Uncharacterized protein | 13.6 | 3.96E-41 | 1.34E-38 | G-patch domain, FtsJ-like methyltransferase |
| comp556_seq2 | Hypothetical protein | 13.6 | 3.35E-32 | 4.84E-30 | – |
| comp4395_seq2 | Putative poly(rC) binding protein | 13.6 | 1.45E-43 | 5.73E-41 | KH domain |
| comp298_seq2 | Uncharacterized protein | 13.5 | 2.02E-41 | 7.11E-39 | Ankyrin repeats |
| comp5387_seq1 | Serine/threonine kinase | 13.2 | 1.95E-36 | 4.39E-34 | Protein kinase domain, protein tyrosine kinase |
| comp3544_seq2 | Dynein-associated protein | 13.2 | 1.51E-34 | 2.79E-32 | – |
| comp510_seq4 | Tropomyosin-2 | 13.0 | 1.04E-22 | 6.12E-21 | Aida N-terminus, |
| comp1917_seq1 | Lysine-tRNA ligase | 12.8 | 2.43E-26 | 2.02E-24 | tRNA anti-codon, OB-fold nucleic acid binding domain |
| comp7221_seq1 | Uncharacterized protein | 12.6 | 3.04E-27 | 2.75E-25 | – |
| comp8398_seq0 | Putative homeobox protein | 12.6 | 6.23E-25 | 4.54E-23 | N-terminal of Homeobox Meis and PKNOX1 |
| comp3445_seq0 | Receptor protein-tyrosine kinase | 12.6 | 1.84E-26 | 1.55E-24 | Receptor L domain, furin-like cysteine rich region |
| comp3797_seq0 | Uncharacterized protein | 12.5 | 2.10E-26 | 1.75E-24 | BTG family |
| comp3182_seq1 | Putative multiple ankyrin repeats single KH domain protein | 12.4 | 2.95E-17 | 9.84E-16 | Ankyrin repeats |
| comp115_seq0 | Uncharacterized protein | 12.4 | 1.44E-22 | 8.37E-21 | Arrestin (or S-antigen), N-terminal domain |
| comp7837_seq2 | Uncharacterized protein | 12.2 | 8.70E-24 | 5.69E-22 | Cadherin domain |
| comp5447_seq1 | Myosin light chain kinase, smooth muscle | 12.2 | 1.17E-15 | 3.10E-14 | Immunoglobulin I-set domain |
| comp3754_seq1 | Metastasis suppressor protein 1 | 12.1 | 3.29E-19 | 1.38E-17 | IRSp53/MIM homology domain |
| comp8738_seq10 | Uncharacterized protein | 12.1 | 2.30E-21 | 1.20E-19 | – |
| comp4029_seq0 | Plexin-A1 | 12.1 | 4.14E-20 | 1.88E-18 | Plexin repeat, IPT/TIG domain |
| comp138_seq8 | SJCHGC08958 protein | 12.0 | 1.54E-13 | 3.00E-12 | Trematode eggshell synthesis protein |
| comp5826_seq0 | Uncharacterized protein | 12.0 | 2.10E-22 | 1.19E-20 | Leucine rich repeat |
| comp3044_seq3 | Uncharacterized protein | 12.0 | 1.01E-20 | 4.89E-19 | NIF3 (NGG1p interacting factor 3) |
| comp8147_seq2 | Uncharacterized protein | 11.9 | 4.28E-20 | 1.94E-18 | Laminin G domain |
| comp595_seq0 | Choline dehydrogenase, mitochondrial | 11.9 | 7.87E-15 | 1.85E-13 | GMC oxidoreductase |
| comp1402_seq1 | Uncharacterized protein | 11.9 | 1.25E-16 | 3.78E-15 | – |
| comp490_seq3 | Serine/threonine-protein kinase PAK 3 | 11.9 | 5.01E-19 | 2.07E-17 | P21-Rho-binding domain |
| comp8255_seq1 | Uncharacterized protein | 11.8 | 3.00E-20 | 1.38E-18 | – |
| comp4371_seq0 | Putative ABC transporter | 11.8 | 1.63E-19 | 7.01E-18 | ABC transporter |
| comp5811_seq8 | Rapamycin-insensitive companion of mTOR | 11.8 | 2.32E-17 | 7.85E-16 | Rapamycin-insensitive companion of mTOR, domain 5 |
| comp9143_seq1 | Regulator of G-protein signaling 3 | 11.8 | 5.29E-21 | 2.64E-19 | C2 domain, PDZ domain (DHR or GLGF) |
| comp9843_seq1 | Uncharacterized protein | 11.8 | 6.13E-20 | 2.71E-18 | Rhodopsin family |
| comp8166_seq1 | Uncharacterized protein | 11.8 | 4.38E-21 | 2.20E-19 | – |
| comp1561_seq0 | Uncharacterized protein | 11.6 | 2.16E-12 | 3.55E-11 | – |
| comp6197_seq1 | Uncharacterized protein | 11.6 | 3.52E-17 | 1.17E-15 | – |
| comp9531_seq0 | Putative ankyrin 2,3/unc44 | 11.6 | 2.07E-16 | 6.05E-15 | Ankyrin repeats |
| comp9039_seq1 | Centrosomal protein of 120 kDa | 11.6 | 2.64E-19 | 1.11E-17 | C2 domain, Cep120 protein |
| comp2906_seq3 | CD97 antigen | 11.6 | 3.96E-18 | 1.47E-16 | Secretin family |
| comp3517_seq2 | Rho guanine nucleotide exchange factor 12 | 11.6 | 1.38E-19 | 6.00E-18 | Regulator of G protein signalling-like domain |
| comp2195_seq3 | Uncharacterized protein | 11.6 | 1.91E-15 | 4.93E-14 | Galactoside-binding lectin |
| comp5487_seq0 | Uncharacterized protein C7orf63-like protein | 11.6 | 8.05E-20 | 3.52E-18 | – |
| comp4205_seq4 | Serine-rich adhesin for platelets | 11.5 | 2.14E-14 | 4.76E-13 | – |
| comp3759_seq1 | Putative Alstrom syndrome protein | 11.5 | 1.86E-16 | 5.50E-15 | – |
| comp9022_seq1 | Uncharacterized protein | 11.5 | 1.97E-18 | 7.61E-17 | MORN repeat |
| comp10754_seq0 | Uncharacterized protein | 11.4 | 8.21E-19 | 3.31E-17 | Cadherin-like |
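The FDR values in the table are multiple-testing-adjusted significance values. This excerpt does not state which adjustment was applied; the sketch below shows the Benjamini-Hochberg procedure, a widely used choice, purely as an illustration. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity so that an
    # adjusted value never exceeds the one for the next larger p-value.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four transcripts.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each adjusted value is at least as large as its raw p-value, which matches the pattern in the table, where the FDR of every transcript is larger than the value in the preceding column.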
Top 50 most significantly upregulated transcripts in S. mekongi female worms
| Transcript ID | Protein description | Log2(FC) | P-value | FDR | Pfam annotation |
|---|---|---|---|---|---|
| comp837_seq1 | SJCHGC05677 protein | 18.5 | 1.51E-88 | 4.69E-85 | – |
| comp546_seq10 | O-glycosyltransferase | 16.5 | 6.50E-54 | 4.04E-51 | Tetratricopeptide repeat |
| comp1_seq2 | Uncharacterized protein | 16.2 | 2.08E-09 | 2.06E-08 | – |
| comp1141_seq3 | Putative scythe/bat3 DNA repair | 15.9 | 7.99E-51 | 4.51E-48 | Ubiquitin family |
| comp401_seq5 | SJCHGC09037 protein | 15.8 | 7.56E-34 | 1.29E-31 | Protein similar to CwfJ C-terminus 1 |
| comp7058_seq0 | Uncharacterized protein | 15.7 | 6.12E-20 | 2.71E-18 | – |
| comp524_seq5 | Uncharacterized protein | 14.8 | 8.48E-19 | 3.41E-17 | – |
| comp5822_seq2 | DNA excision repair protein ERCC-6 DNA repair | 14.2 | 7.21E-29 | 7.68E-27 | Type III restriction enzyme, res subunit |
| comp5430_seq10 | Inositol 1,4,5-trisphosphate receptor | 14.0 | 5.20E-88 | 1.39E-84 | Inositol 1,4,5-trisphosphate/ryanodine receptor |
| comp3416_seq1 | Thioredoxin domain-containing protein 4 | 13.9 | 1.32E-48 | 7.02E-46 | Thioredoxin-like domain |
| comp3035_seq0 | Uncharacterized protein | 13.7 | 7.86E-27 | 6.85E-25 | DnaJ domain |
| comp6618_seq1 | Uncharacterized protein | 13.6 | 3.75E-67 | 4.99E-64 | – |
| comp247_seq3 | SJCHGC09430 protein | 13.5 | 2.43E-64 | 2.66E-61 | – |
| comp3544_seq3 | Mannosyltransferase | 13.5 | 4.23E-24 | 2.84E-22 | Alg9-like mannosyltransferase family |
| comp7684_seq1 | Uncharacterized protein | 13.0 | 2.85E-26 | 2.34E-24 | – |
| comp1544_seq2 | EH domain-binding protein 1 | 12.9 | 5.14E-22 | 2.83E-20 | CAMSAP CH domain |
| comp4065_seq0 | Uncharacterized protein | 12.9 | 7.78E-31 | 1.00E-28 | – |
| comp3239_seq2 | Uncharacterized protein | 12.8 | 1.94E-19 | 8.30E-18 | Phosphatidylethanolamine-binding protein |
| comp167028_seq0 | Uncharacterized protein | 12.7 | 2.33E-64 | 2.66E-61 | – |
| comp7307_seq0 | Uncharacterized protein | 12.7 | 1.97E-31 | 2.68E-29 | – |
| comp6088_seq2 | Uncharacterized protein | 12.6 | 7.12E-39 | 1.92E-36 | – |
| comp4607_seq0 | Uncharacterized protein | 12.5 | 7.98E-33 | 1.23E-30 | – |
| comp8587_seq1 | Pol polyprotein | 12.4 | 2.50E-40 | 7.78E-38 | – |
| comp5229_seq0 | Uncharacterized protein | 12.3 | 4.78E-15 | 1.16E-13 | – |
| comp8246_seq0 | Uncharacterized protein | 12.3 | 4.60E-25 | 3.41E-23 | – |
| comp2149_seq0 | Uncharacterized protein | 12.2 | 4.45E-97 | 2.77E-93 | – |
| comp5198_seq1 | Uncharacterized protein | 12.2 | 1.16E-28 | 1.20E-26 | – |
| comp475_seq2 | Ccr4-not transcription complex gene regulation | 12.2 | 1.52E-37 | 3.94E-35 | – |
| comp8567_seq0 | Uncharacterized protein | 12.0 | 1.58E-31 | 2.17E-29 | – |
| comp7704_seq0 | Uncharacterized protein | 12.0 | 7.90E-28 | 7.63E-26 | – |
| comp1502_seq0 | Uncharacterized protein | 12.0 | 3.49E-20 | 1.59E-18 | – |
| comp9595_seq0 | Uncharacterized protein | 11.9 | 4.66E-30 | 5.57E-28 | – |
| comp29_seq2 | SJCHGC05410 protein | 11.9 | 1.22E-17 | 4.25E-16 | – |
| comp5733_seq0 | Elastase 2b | 11.9 | 5.02E-60 | 4.92E-57 | Trypsin-like peptidase domain |
| comp785_seq0 | Uncharacterized protein | 11.9 | 1.50E-22 | 8.68E-21 | – |
| comp11219_seq0 | Uncharacterized protein | 11.9 | 3.03E-36 | 6.57E-34 | – |
| comp9353_seq1 | Uncharacterized protein | 11.8 | 2.73E-21 | 1.40E-19 | – |
| comp969_seq7 | Uncharacterized protein | 11.8 | 6.68E-27 | 5.93E-25 | – |
| comp5698_seq0 | Uncharacterized protein | 11.8 | 1.13E-35 | 2.34E-33 | – |
| comp10698_seq1 | Uncharacterized protein | 11.7 | 6.86E-34 | 1.18E-31 | – |
| comp7614_seq1 | Uncharacterized protein | 11.7 | 2.55E-21 | 1.32E-19 | – |
| comp4729_seq2 | Uncharacterized protein | 11.6 | 1.17E-30 | 1.45E-28 | Alpha amylase, catalytic domain |
| comp10183_seq1 | Uncharacterized protein | 11.5 | 1.90E-21 | 1.01E-19 | – |
| comp5834_seq0 | SJCHGC06880 protein | 11.5 | 2.91E-33 | 4.68E-31 | RING-variant domain |
| comp268_seq0 | Putative 26S proteasome non-ATPase regulatory subunit 11 protein misfolding repair | 11.5 | 8.14E-05 | 3.44E-04 | PCI domain |
| comp9969_seq1 | Uncharacterized protein | 11.4 | 2.39E-33 | 3.87E-31 | – |
| comp1262_seq5 | Uncharacterized protein | 11.2 | 3.05E-22 | 1.70E-20 | Dynein heavy chain |
| comp1218_seq2 | Connector enhancer of kinase suppressor of Ras2 | 11.2 | 5.59E-30 | 6.52E-28 | PDZ domain (DHR or GLGF) |
| comp5896_seq0 | Uncharacterized protein | 11.2 | 5.53E-90 | 2.06E-86 | – |
| comp5214_seq0 | Uncharacterized protein | 11.1 | 2.20E-29 | 2.42E-27 | – |
Fig. 3 GO analysis of DE transcripts in S. mekongi male and female worms. GO categories are organized according to three main ontologies: biological process (a), cellular component (b) and molecular function (c). The x-axis shows the numbers of transcripts. The y-axis shows the GO term. GO terms with higher FDR values are shown in light blue, whereas GO terms with lower FDR values are shown in dark blue
Fig. 4 Enrichment pathway analysis of DE transcripts in S. mekongi male vs female worms (Q-value ≤ 0.05). The bar graph shows the number of annotated DE transcripts (y-axis) associated with each sub-class of an enriched pathway (x-axis)
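Pathway enrichment asks whether DE transcripts fall into a pathway more often than chance would predict. The excerpt gives only the Q-value cutoff and not the test used, so the following is a sketch of one common approach, a one-sided hypergeometric test, with a hypothetical function name and toy numbers. The resulting p-values would still need multiple-testing correction before being compared with the Q-value threshold.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for a hypergeometric draw.

    N: annotated transcripts in the background (assumed definition)
    K: background transcripts that belong to the pathway
    n: DE transcripts drawn from the background
    k: DE transcripts that belong to the pathway
    """
    total = comb(N, n)
    upper = min(K, n)
    # Sum the probabilities of observing k, k+1, ..., upper pathway members.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Toy numbers: 10 background transcripts, 4 in the pathway, 3 DE, 2 of them
# in the pathway.
print(hypergeom_upper_tail(k=2, K=4, n=3, N=10))
```

For the toy numbers, 40 of the 120 possible draws contain at least two pathway members, so the probability is one third.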
Fig. 5 Top 30 most significantly enriched pathways in S. mekongi male and female worms. The bar graph shows the number of annotated DE transcripts (x-axis) with each significantly enriched pathway (y-axis). Blue and red colors are the numbers of upregulated transcripts in male and female worms, respectively
Fig. 6 Phosphatidylinositol signaling pathway. Red background indicates upregulated genes in male worms and green background indicates upregulated genes in female worms