Fang Luo1, Mingbo Yin1,2, Xiaojin Mo2, Chengsong Sun1, Qunfeng Wu1, Bingkuan Zhu1, Manyu Xiang1, Jipeng Wang1, Yi Wang1, Jian Li1, Ting Zhang2, Bin Xu2, Huajun Zheng3, Zheng Feng2, Wei Hu1,2.
Abstract
BACKGROUND: Schistosoma japonicum is a parasitic flatworm that causes human schistosomiasis, a significant cause of morbidity in China and the Philippines. A single draft genome was available for S. japonicum, yet this assembly is very fragmented and covers only 90% of the genome, which makes it difficult to use as a reference for functional genome analysis and gene discovery.
Year: 2019 PMID: 31390359 PMCID: PMC6685614 DOI: 10.1371/journal.pntd.0007612
Source DB: PubMed Journal: PLoS Negl Trop Dis ISSN: 1935-2727
Genome assembly statistics for the improved genome of S. japonicum in comparison with three published Schistosoma genomes.
V1 indicates the conventional capillary-sequenced genome and V2 indicates our improved genome.
| | S. japonicum V2 | S. japonicum V1 | S. mansoni V7 | S. haematobium |
|---|---|---|---|---|
| Genome size (bp) | 369,900,518 | 402,743,189 | 409,579,008 | 375,894,156 |
| Number of scaffolds | 1,789 | 25,048 | 320 | 29,834 |
| Number of contigs | 2,108 | 95,267 | 602 | 59,195 |
| Longest scaffold (bp) | 6,264,197 | 1,730,213 | 88,881,357 | 1,826,302 |
| Average scaffold length (bp) | 210,145 | 16,078 | 1,279,934 | 12,560 |
| Number of scaffolds >10 kb | 1,052 | 4,707 | 318 | 2,384 |
| Number of gaps | 319 | 70,219 | 282 | 29,361 |
| Scaffold N50 (bp) | 1,093,989 | 176,869 | 50,458,499 | 317,484 |
| Contig N50 (bp) | 871,911 | 6,121 | 5,339,380 | 22,446 |
| GC content (%) | 33.76 | 34.08 | 35.47 | 34.22 |
| Repeat content (%) | 46.87 | 44.56 | 49.23 | 42.83 |
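The contig, scaffold, and gap counts in the table are related: each gap inside a scaffold joins two contigs, so the number of gaps should equal the number of contigs minus the number of scaffolds. The following is a minimal sketch that checks this relationship against the four columns of the table. The variable and function names are illustrative only, and the columns are referred to by position rather than by assembly name.

```python
# Values copied from the table, in column order (left to right).
contigs = [2_108, 95_267, 602, 59_195]
scaffolds = [1_789, 25_048, 320, 29_834]
reported_gaps = [319, 70_219, 282, 29_361]


def expected_gaps(n_contigs: int, n_scaffolds: int) -> int:
    """Each gap joins two contigs within one scaffold, so gaps = contigs - scaffolds."""
    return n_contigs - n_scaffolds


computed_gaps = [expected_gaps(c, s) for c, s in zip(contigs, scaffolds)]

for column, (calc, rep) in enumerate(zip(computed_gaps, reported_gaps), start=1):
    status = "consistent" if calc == rep else "MISMATCH"
    print(f"column {column}: computed {calc:,} vs reported {rep:,} ({status})")
```

All four columns satisfy the relationship, which is a quick sanity check that the rows were transcribed in a consistent column order.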
Fig 1. Genome assembly of Schistosoma japonicum.
(A) Comparison of contiguity between the two versions of the S. japonicum genome assembly. The N(x) graph shows the contig and scaffold sizes (y-axis) for which x% of the genome assembly consists of contigs or scaffolds of at least that size. (B) Comparison between the two versions of the S. japonicum genome assembly, showing the portions of the genomes that are complete (blue), fragmented (yellow), or missing (red), as determined by benchmarking universal single-copy orthologs (BUSCO) analysis with the metazoan_odb9 database. (C) Circle plot of synteny between the second version of the S. japonicum genome and the S. mansoni genome V7, made using SyMAP. It shows a high degree of synteny, with many long S. japonicum scaffolds covering significant portions of S. mansoni chromosomes. (D) Circle plot of synteny between the first version of the S. japonicum genome and the S. mansoni genome. V1 indicates the conventional capillary-sequenced genome and V2 indicates our improved genome.
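The N(x) statistic described in panel (A) can be illustrated with a short sketch. It is not the authors' pipeline, only a common way to compute the value: sort sequence lengths from longest to shortest and return the length at which the running total first reaches x% of the assembly. The function name `n_x` and the toy lengths are assumptions made for illustration.

```python
from typing import Iterable


def n_x(lengths: Iterable[int], x: float = 50.0) -> int:
    """Return the N(x) length: the largest L such that sequences of
    length >= L together cover at least x% of the total assembly."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must not be empty")
    if not 0 < x <= 100:
        raise ValueError("x must be in the interval (0, 100]")

    threshold = sum(ordered) * x / 100
    running_total = 0
    for length in ordered:
        running_total += length
        if running_total >= threshold:
            return length
    return ordered[-1]  # only reachable through floating-point edge cases


# Toy contig lengths (hypothetical, not taken from the assemblies above).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print("N50:", n_x(toy_contigs, 50))
print("N90:", n_x(toy_contigs, 90))
```

Plotting `n_x` for x from 1 to 100 gives the kind of curve shown in panel (A); a more contiguous assembly stays higher for longer along the x-axis.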
Comparison of predicted genes between the two versions of the S. japonicum genome assembly.
V1 indicates the conventional capillary-sequenced genome and V2 indicates our improved genome.
| | V2 | V1 |
|---|---|---|
| Gene number | 10,089 | 12,738 |
| Average gene length (bp) | 18,370 | 9,960 |
| Average CDS length (bp) | 1,537 | 1,172 |
| Average exons per gene | 8.3 | 5.3 |
| Average exon length (bp) | 370 | 223 |
| Average intron length (bp) | 2,521 | 2,058 |
| Complete | 81.8% | 67.4% |
| Partial | 4.3% | 15.0% |
| Missing | 13.9% | 17.6% |
Fig 2. Comparison of the length distributions of total genes, CDS, exons, and introns of annotated S. japonicum gene models with those of closely related Trematoda species.
Length distributions of total genes (A), CDS (B), exons (C), and introns (D) were compared to those of S. mansoni, S. haematobium, C. sinensis, O. viverrini, and F. hepatica.
Annotation of protein-coding genes and noncoding RNA elements in the improved S. japonicum genome assembly.
| | Number (%) |
|---|---|
| SWISSPROT | 6,642 (65.8) |
| TrEMBL | 6,137 (60.8) |
| NCBI nr database | 9,291 (92.1) |
| KEGG database | 4,547 (45.1) |
| InterProScan | 8,689 (86.1) |
| Gene ontology annotation | 8,368 (82.9) |
| Small nuclear RNA (snRNA) | 54 |
| Transfer RNA (tRNA) | 1,263 |
| Micro RNA (miRNA) | 172 |
| Ribosomal RNA (rRNA) | 10 |
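The percentages in the annotation rows are consistent with a denominator of 10,089 predicted genes, the gene number reported for the improved assembly in the gene comparison table. The following is a minimal sketch that recomputes them; the dictionary layout and variable names are illustrative assumptions, and the counts are copied from the table.

```python
# Gene number taken from the gene comparison table (assumed denominator).
TOTAL_GENES = 10_089

# (annotated gene count, percentage reported in the table)
annotation_rows = {
    "SWISSPROT": (6_642, 65.8),
    "TrEMBL": (6_137, 60.8),
    "NCBI nr database": (9_291, 92.1),
    "KEGG database": (4_547, 45.1),
    "InterProScan": (8_689, 86.1),
    "Gene ontology annotation": (8_368, 82.9),
}

# Recompute each percentage to one decimal place.
recomputed = {
    name: round(100 * count / TOTAL_GENES, 1)
    for name, (count, _reported) in annotation_rows.items()
}

for name, (count, reported) in annotation_rows.items():
    print(f"{name}: {count:,} / {TOTAL_GENES:,} = {recomputed[name]}% (reported {reported}%)")
```

The noncoding RNA rows (snRNA, tRNA, miRNA, rRNA) are plain counts and carry no percentage, so they are left out of the check.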
Fig 3. Comparative genome analysis between S. japonicum and six other flatworms.
(A) Phylogenetic tree and expansion and contraction of gene families. The phylogenetic tree and divergence times were generated from 2,322 single-copy orthologous genes using BEAST2. The branch lengths of the phylogenetic tree are scaled to estimated divergence time. The tree topology is supported by a posterior probability of 1.0 for all nodes. The blue bars on the nodes indicate the 95% credibility intervals of the estimated posterior distributions of the divergence times. The numbers of expanded (orange) and contracted (blue) gene families are shown on each branch. Bar charts indicate the orthologous and paralogous gene families in S. japonicum and the six other flatworm species. (B) Comparison of the number of gene families in 7 Platyhelminthes species.
Fig 4. Gene Ontology enrichment analysis of significantly expanded gene families.
(A) Biological process, (B) molecular function, and (C) cellular component.